CDash:Design: Difference between revisions

From KitwarePublic
Revision as of 15:01, 4 September 2008


Time storage in DB

CDash stores all dates and times in GMT (also known as UTC). Therefore, when a build is submitted to CDash, its time is converted from the submitting site's timezone to GMT. Some timezones are not currently resolved by PHP, and some abbreviations are ambiguous: for example, the Australian eastern timezone is expressed as EST, the same abbreviation as American eastern time (EST), which creates confusion.

As of CTest 2.6, the Unix timestamp is submitted as part of the XML submission. If it is present, CDash uses this timestamp instead of the date/timezone format. This should solve the problem stated above.
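The preference for the Unix timestamp can be illustrated with a minimal sketch. It is not CDash's actual PHP code: the function name `submission_time_utc` and its parameters are hypothetical, and the fallback assumes the caller already knows the numeric UTC offset of the submitting site.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def submission_time_utc(unix_timestamp: Optional[int],
                        local_time: Optional[datetime] = None,
                        utc_offset_hours: float = 0.0) -> datetime:
    """Return the build time in UTC, preferring the Unix timestamp if present."""
    if unix_timestamp is not None:
        # A Unix timestamp is timezone-independent, so no abbreviation is needed.
        return datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)
    if local_time is None:
        raise ValueError("need either a Unix timestamp or a local time")
    # Fallback (assumption): the offset is supplied as a number of hours,
    # because an abbreviation such as "EST" does not identify a unique offset.
    site_zone = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=site_zone).astimezone(timezone.utc)
```

With the fallback, the same local time read as American EST (UTC-5) or as Australian EST (UTC+10) yields two UTC values 15 hours apart, which is the ambiguity the timestamp avoids.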

Test Timing

Added in CDash 1.0. CDash supports timing defects for tests. For each test, CDash keeps in the database a running weighted average of the mean and standard deviation of the test time, updated with each build. To keep the computation as light as possible, the following formulas are used, involving only the values from the previous build.

 newMean = (1-alpha)*oldMean + alpha*currentTime
 newSD = sqrt((1-alpha)*oldSD*oldSD + alpha*(currentTime-newMean)*(currentTime-newMean))

A test is defined as failing on timing if it satisfies the following check:

 if previousSD < thresholdSD then previousSD = thresholdSD
 if currentTime > previousMean + multiplier*previousSD then the test fails

Note that alpha defines the current “window” for the computation: the larger alpha is, the more weight the most recent build receives. By default alpha is set to 0.3.
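The update and the failure check above can be written out as a short sketch. The function names are hypothetical, and `threshold_sd` and `multiplier` are left as required arguments because their default values are not stated here.

```python
import math


def update_timing(old_mean: float, old_sd: float, current_time: float,
                  alpha: float = 0.3) -> tuple:
    """Return (new_mean, new_sd) using only the previous build's values."""
    new_mean = (1 - alpha) * old_mean + alpha * current_time
    deviation = current_time - new_mean
    new_sd = math.sqrt((1 - alpha) * old_sd * old_sd + alpha * deviation * deviation)
    return new_mean, new_sd


def is_timing_failure(current_time: float, previous_mean: float,
                      previous_sd: float, threshold_sd: float,
                      multiplier: float) -> bool:
    """Apply the check above: clamp the SD to the threshold, then compare."""
    effective_sd = max(previous_sd, threshold_sd)
    return current_time > previous_mean + multiplier * effective_sd
```

For instance, with a previous mean of 10, a previous standard deviation of 1, a threshold of 2 and a multiplier of 3, a run of 15 does not fail (the limit is 10 + 3*2 = 16), whereas it would exceed the limit of 13 that applies without the threshold clamp.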

Daily Updates

Added in CDash 1.0, daily updates from the repository are stored in the database (instead of polling the CVS/SVN server every time the page is accessed).

The first build of the day triggers the daily updates from the SVN/CVS repository. If the webserver supports PHP_CURL then the request is done asynchronously using curl (calling the dailyupdatescurl.php page), otherwise the request is done synchronously and requires the client to wait until the request is done.

The SQL tables dailyupdate and dailyupdatefile are used to store the daily updates.

Map and Geolocation

CDash displays the current geolocation of a given site.

CDash uses the free service from hostip.info to get the geolocation from the IP address of the submitted builds. As a free service, it is not perfect and some IP addresses cannot be resolved. This is also true for internal IP addresses. cURL support for PHP should be enabled on the webserver.

Every time a build is submitted, CDash performs a cURL query to hostip.info and stores the resulting geolocation in the database.

Graphs

CDash uses flot as its graphing library. Flot is a JavaScript graph library released under the MIT License.

Flot currently doesn't support mouse-over events, but this has been requested as a feature of the flot library. As soon as the feature is added to flot, CDash will make use of it.

The graphs are generated from an Ajax request to the database. The fetching time is proportional to the number of entries in the database.

Users Statistics

Starting in CDash 1.2, user statistics are collected for each checkin as long as the user is registered in CDash. There are some limitations to the system since a build can reflect multiple checkins by different users.

  • If only one author is responsible for the build, then all the statistics go to this author.
  • If several authors have checked in modifications but the modifications have fixed some warnings, errors or test failures, then each author gets the credit.
  • If several authors have checked in modifications and the modifications have introduced some warnings, errors or test failures, CDash looks at the modified files, tries to determine which ones are causing the problems, and updates the statistics accordingly.

Global Project Revisions

Modern version control tools (svn, git, hg, bzr, ...) track commits to an entire project instead of on each file. Each commit may change many files and directories, and the changes come atomically in one piece. The current CDash change report page is file-based, which works well for cvs (and tolerably for svn) but not for other tools. In order to support more version control tools, we need a new way to store and present the versions of code being tested.

This will involve changes to both CDash and CTest. Knowledge specific to each VCS tool will have to be coded somewhere. Ideally we should come up with a core that does not depend on the VCS tool in use. Then we can have plugins or a programmable configuration mechanism to support arbitrary tools.

Revision Specifiers

One important part of this is to have a way to identify specifically a global revision of the project. The following table shows how to do this for some common VCS tools.

 Tool  Specifier            Example
 cvs   <branch>, <date>     ITK-3-4, "2008-06-07 16:34:56 -0400"
 svn   <dir>, <rev>         trunk/Foo, r1234
 git   <sha1>               184e154429933effddb6bce0a8ee5a6b99fc450c
 hg    <rev>:<sha1>         6907:6dcbe191a9b5
 bzr   ?? <upstream>:<rev>  some.upstream.net/path/to/repo, r1234

Hopefully we will be able to make these strings opaque from CDash's point of view, except perhaps for integration with online repository browser tools.

Nightly Revisions

Distributed version control tools, like git, separate the notion of committing changes from that of publishing them. The date/time associated with a commit is the date/time at which the user committed the change to his/her local repository. This time may be before the "nightly start time" for a project even if the change was not published until after that time. Furthermore, the topological order of commits in the history may not match the chronological order (a commit may have a time that is older than the time of its parent commit).

For example, say I have implemented a new feature and committed it to my local repository at 5pm. It's too late in the day to get feedback from the continuous builds, so I don't want to publish the changes yet. The nightly start time is 8pm for my project, so I go home from work, wait until 8:30pm, and then publish my commit in the repository from which dashboard machines pull changes. Since the change was not published by 8pm, the nightly dashboard should not include it. Now say that a busy dashboard testing machine waits until 9pm to pull changes and test the project. If it pulls the latest published changes and then looks for commits before 8pm, it will see the commit I published at 8:30pm because it was created at 5pm. It will incorrectly test my new feature.

The solution to this problem is to have testing clients explicitly request from a server what revision should be tested (when operating in Nightly mode). Somehow CDash should be able to take a given date/time, convert it to the most recent nightly start time that comes before it, and report what revision of the project should be tested. This revision should be saved permanently and associated with that day's dashboard (for each branch of the project). Each submission to the dashboard should include a specifier indicating the revision tested.
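The conversion from an arbitrary date/time to the most recent nightly start time could look like the following sketch. The function name is hypothetical, and it assumes both values are naive datetimes expressed in the same timezone (for example GMT).

```python
from datetime import datetime, time, timedelta


def most_recent_nightly_start(moment: datetime, nightly_start: time) -> datetime:
    """Return the latest nightly start time that is not after `moment`."""
    candidate = datetime.combine(moment.date(), nightly_start)
    if candidate > moment:
        # Today's nightly start has not happened yet; use yesterday's.
        candidate -= timedelta(days=1)
    return candidate
```

With an 8pm nightly start, a client asking at 9pm is mapped to 8pm of the same day, while a client asking at 7pm is mapped to 8pm of the previous day. The result would then serve as the key under which the revision for that day's dashboard is stored.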

Determining the revision specifier for a nightly start time on a given day will be vcs-tool-specific. In the case of CVS, just the date/time is needed. In the case of svn, the repository server can convert a date/time to a global revision number. In the case of git, each repository has a "reflog" from which the revision published as of a given date/time can be computed (we'll need to come up with a reliable way for this information to be obtained by the CDash server). Once the revision specifier is determined, it should be saved by CDash for use in answering queries from testing machines.
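For the git case, the lookup amounts to finding the last revision that had been published by the nightly start time, regardless of when its commits were created. The sketch below uses a hypothetical `publish_log`, a list of (publish time, revision) pairs sorted by publish time, as a stand-in for whatever the reflog provides; the revision strings are made up.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def revision_as_of(publish_log: List[Tuple[datetime, str]],
                   cutoff: datetime) -> Optional[str]:
    """Return the revision published at or before `cutoff`, or None."""
    publish_times = [published for published, _ in publish_log]
    # Index of the first entry published strictly after the cutoff.
    index = bisect_right(publish_times, cutoff)
    return publish_log[index - 1][1] if index > 0 else None


# Example data (made up): the second revision contains a commit created
# at 5pm but published at 8:30pm, after the 8pm nightly start.
example_log = [
    (datetime(2008, 9, 4, 14, 0), "aaa111"),
    (datetime(2008, 9, 4, 20, 30), "bbb222"),
]
nightly_revision = revision_as_of(example_log, datetime(2008, 9, 4, 20, 0))
```

Because only publish times are consulted, `nightly_revision` is "aaa111" here, so the commit created at 5pm but published late is left out of the nightly build.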

Computing Revisions in GIT

Since git supports "dumb" transport protocols (rsync, http, etc.) there is no way through the git interface to access the reflog of a remote server (it could perhaps be added to git's native "smart" protocol, but we cannot require or depend on that). However, it is possible to write a simple CGI script to get access. In order to make it as easy as possible to set up testing with a git-based project, we should avoid requiring a custom CGI script, or even configuration of a web server. Instead we can create a post-receive (or other) hook script to store the mapping from date to nightly revision in a git object on an otherwise empty branch. Then clients can fetch and inspect that branch (without ever checking it out) to get the revision to be tested.

Reporting Changes

Currently CDash reports changes for all files separately. If one commit modifies multiple files, each file is reported separately with the same commit message. Instead we should report each commit that has been published between two nightly start times. The commit comes with author, date, and a list of files that have been added, removed, or modified. The changes page should show each commit by itself, with one occurrence of the author, time, and log message followed by the files involved. They should be in topological order (this part is probably a CTest-only thing).
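The move from a file-based report to a commit-based one can be sketched as a grouping step. The record layout (revision, author, date, message, path) and the function name are assumptions made for illustration, not the actual CDash schema, and the input is assumed to arrive already in the desired (topological) order.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical per-file record: (revision, author, date, message, path)
FileChange = Tuple[str, str, str, str, str]


def group_by_commit(changes: Iterable[FileChange]) -> Dict[str, dict]:
    """Collapse per-file records into one entry per commit, keeping input order."""
    commits: Dict[str, dict] = {}
    for revision, author, date, message, path in changes:
        entry = commits.setdefault(
            revision,
            {"author": author, "date": date, "message": message, "files": []},
        )
        entry["files"].append(path)
    return commits
```

Each resulting entry carries one occurrence of the author, date, and log message followed by the list of files involved, which is the shape the changes page would display.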