CDash:Design

Time storage in DB

CDash stores all dates and times in GMT (also known as UTC). Therefore, when a build is submitted to CDash, its time is converted from the submitting site's timezone to GMT. Some timezones are not currently resolved by PHP, and timezone abbreviations can be ambiguous: for example, Australian Eastern time is expressed as EST, which is the same abbreviation as American Eastern time (EST).

As of CTest 2.6, the Unix timestamp is submitted as part of the XML submission. If it is present, CDash uses this time instead of the date/timezone string. This should solve the problem stated above.
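The ambiguity can be illustrated with a minimal Python sketch. The two fixed offsets are assumptions chosen for illustration (American Eastern standard time as UTC-5, Australian Eastern standard time as UTC+10), and `to_utc_string` is a hypothetical helper, not CDash code.

```python
from datetime import datetime, timedelta, timezone

# Two zones that can both be labelled "EST" (offsets assumed for illustration).
US_EST = timezone(timedelta(hours=-5))
AU_EST = timezone(timedelta(hours=10))


def to_utc_string(unix_timestamp: int) -> str:
    """Format a Unix timestamp as a UTC date/time string (hypothetical helper)."""
    moment = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# The same wall-clock reading, interpreted in each "EST" zone.
wall_clock = datetime(2008, 6, 7, 16, 34, 56)
us_timestamp = int(wall_clock.replace(tzinfo=US_EST).timestamp())
au_timestamp = int(wall_clock.replace(tzinfo=AU_EST).timestamp())

# The two interpretations are 15 hours apart, but each timestamp maps to
# exactly one UTC time, so no timezone lookup is needed on the server.
print(to_utc_string(us_timestamp))
print(to_utc_string(au_timestamp))
```

A submission that carries the timestamp itself never has to name a timezone, which is why it sidesteps the abbreviation problem.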

Test Timing

Added in CDash 1.0. CDash supports timing defects for tests. For each test, CDash keeps in the database a running weighted average of the mean and standard deviation of the test time, updated with each build. To keep the computation as light as possible, the following formulas are used, which involve only the values from the previous build.

 newMean = (1-alpha)*oldMean + alpha*currentTime
 newSD = sqrt((1-alpha)*oldSD*oldSD + alpha*(currentTime-newMean)*(currentTime-newMean))

A test is flagged as a timing failure if it satisfies the following condition, where the standard deviation is first raised to a minimum threshold:

 if previousSD < thresholdSD then previousSD = thresholdSD
 fail if currentTime > previousMean + multiplier*previousSD

Note that alpha defines the effective “window” of the computation: a larger alpha gives more weight to the most recent build. By default alpha is set to 0.3.
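The update and the failure check above can be sketched in Python as follows. This is an illustration of the formulas as written, not the CDash implementation; the function names are hypothetical, and the `multiplier` and `threshold_sd` values used in the example are arbitrary because the text does not state their defaults.

```python
import math


def update_timing(old_mean, old_sd, current_time, alpha=0.3):
    """Return (new_mean, new_sd) using only the previous build's values."""
    new_mean = (1 - alpha) * old_mean + alpha * current_time
    new_sd = math.sqrt(
        (1 - alpha) * old_sd * old_sd
        + alpha * (current_time - new_mean) * (current_time - new_mean)
    )
    return new_mean, new_sd


def is_timing_failure(previous_mean, previous_sd, current_time,
                      multiplier, threshold_sd):
    """Apply the threshold to the SD, then compare against the limit."""
    effective_sd = max(previous_sd, threshold_sd)
    return current_time > previous_mean + multiplier * effective_sd


# Example with arbitrary (assumed) settings: multiplier=3, threshold_sd=2.
mean, sd = update_timing(old_mean=10.0, old_sd=1.0, current_time=20.0)
slow = is_timing_failure(10.0, 0.1, 20.0, multiplier=3.0, threshold_sd=2.0)
```

The threshold keeps a test with a very stable history (an SD close to zero) from failing on tiny fluctuations.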

Daily Updates

Added in CDash 1.0. Daily updates from the repository are stored in the database, instead of polling the CVS/SVN server every time the page is accessed.

The first build of the day triggers the daily updates from the SVN/CVS repository. If the web server supports PHP_CURL, the request is made asynchronously using curl (by calling the dailyupdatescurl.php page); otherwise the request is made synchronously and the client must wait until it completes.

The SQL tables dailyupdate and dailyupdatefile are used to store the nightly updates.

Map and Geolocation

CDash displays the current geolocation of a given site.

CDash uses the free service from hostip.info to get the geolocation from the IP address of the submitted builds. As a free service, it is not perfect and some IP addresses cannot be resolved. This is also true for internal IP addresses. Curl support for PHP should be enabled on the web server.

Every time a build is submitted, CDash performs a cURL query to hostip.info and stores the resulting geolocation in the database.

Graphs

CDash uses Flot as its graphing library. Flot is a JavaScript graphing library released under the MIT License.

Flot does not currently support mouse-over events, but this has been requested as a feature of the Flot library. As soon as the feature is added to Flot, CDash will make use of it.

The graphs are generated from an Ajax request to the database. The fetching time is proportional to the number of entries in the database.

User Statistics

Starting in CDash 1.2, user statistics are collected for each checkin as long as the user is registered in CDash. There are some limitations to the system since a build can reflect multiple checkins by different users.

  • If only one author is responsible for the build, then all the statistics go to this author.
  • If several authors have checked in modifications and the modifications have fixed some warnings, errors or test failures, then each author gets the credit.
  • If several authors have checked in modifications and the modifications have introduced some warnings, errors or test failures, CDash looks at the modified files, tries to determine which ones are causing the problems, and updates the statistics accordingly.
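The three rules can be sketched in Python as below. This is only an illustration under stated assumptions: the text does not say how CDash decides which modified files cause a problem, so the sketch assumes each new problem names a file path and blames the authors who modified exactly that path. The function name and data shapes are hypothetical.

```python
from collections import defaultdict


def attribute_statistics(files_by_author, fixed, problem_files):
    """Distribute build statistics among authors.

    files_by_author: dict mapping author -> set of file paths they modified
    fixed: number of warnings, errors or test failures fixed by the build
    problem_files: file paths in which new problems were reported (assumed)
    """
    stats = defaultdict(lambda: {"fixed": 0, "introduced": 0})
    authors = list(files_by_author)

    # Rule 1: a single author receives all the statistics.
    if len(authors) == 1:
        stats[authors[0]]["fixed"] += fixed
        stats[authors[0]]["introduced"] += len(problem_files)
        return dict(stats)

    # Rule 2: with several authors, each one gets the credit for fixes.
    if fixed:
        for author in authors:
            stats[author]["fixed"] += fixed

    # Rule 3: new problems go to whoever modified the offending file
    # (exact path match is an assumption of this sketch).
    for path in problem_files:
        for author, files in files_by_author.items():
            if path in files:
                stats[author]["introduced"] += 1
    return dict(stats)
```

Problems in files that no listed author modified stay unattributed in this sketch, which mirrors the limitation noted above.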

Global Project Revisions

Modern version control tools (svn, git, hg, bzr, ...) track commits to an entire project instead of on each file. Each commit may change many files and directories, and the changes come atomically in one piece. The current CDash change report page is file-based, which works well for cvs (and tolerably for svn) but not for other tools. In order to support more version control tools, we need a new way to store and present the versions of code being tested.

This will involve changes to both CDash and CTest. Knowledge specific to each VCS tool will have to be coded somewhere. Ideally we should come up with a core that does not depend on the VCS tool in use. Then we can have plugins or a programmable configuration mechanism to support arbitrary tools.

Revision Specifiers

One important part of this is to have a way to specifically identify a global revision of the project. The following table shows how to do this for some common VCS tools.

Tool   Specifier             Example
cvs    <branch>, <date>      ITK-3-4, "2008-06-07 16:34:56 -0400"
svn    <dir>, <rev>          trunk/Foo, r1234
git    <sha1>                184e154429933effddb6bce0a8ee5a6b99fc450c
hg     <rev>:<sha1>          6907:6dcbe191a9b5
bzr    ?? <upstream>:<rev>   some.upstream.net/path/to/repo, r1234

Hopefully we will be able to make these strings opaque from CDash's point of view, except perhaps for integration with online repository browser tools.

Nightly Revisions

Distributed version control tools, like git, separate the notions of committing changes and publishing them. The date/time associated with a commit is the date/time at which the user committed the change to their local repository. This time may be before the "nightly start time" for a project even if the change was not published until after that time. Furthermore, the topological order of commits in the history may not match the chronological order (a commit may have a time that is older than the time of its parent commit).

For example, say I've implemented a new feature and commit it to my local repository at 5pm. It's too late in the day to get feedback from the continuous builds, so I don't want to publish the changes yet. The nightly start time is 8pm for my project, so I go home from work, wait until 8:30pm, and then publish my commit in the repository from which dashboard machines pull changes. Since the change was not published by 8pm, the nightly dashboard should not include it. Now say that a busy dashboard testing machine waits until 9pm to pull changes and test the project. If it pulls the latest published changes and then looks for commits before 8pm, it will see the commit I published at 8:30pm because it was created at 5pm. It will incorrectly test my new feature.

The solution to this problem is to have testing clients explicitly request from a server what revision should be tested (when operating in Nightly mode). Somehow CDash should be able to take a given date/time, convert it to the most recent nightly start time that comes before it, and report what revision of the project should be tested. This revision should be saved permanently and associated with that day's dashboard (for each branch of the project). Each submission to the dashboard should include a specifier indicating the revision tested.
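The conversion from an arbitrary date/time to the most recent nightly start time can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes all values are naive date/times expressed in the project's own timezone. The second half replays the 5pm/8:30pm example to show why the publish time, not the commit time, has to be compared against the cutoff.

```python
from datetime import datetime, time, timedelta


def most_recent_nightly_start(now: datetime, nightly_start: time) -> datetime:
    """Return the latest nightly start time that is not after `now`."""
    candidate = datetime.combine(now.date(), nightly_start)
    if candidate > now:
        # Today's start time has not been reached yet: use yesterday's.
        candidate -= timedelta(days=1)
    return candidate


# A client that starts at 9pm with a nightly start time of 8pm.
cutoff = most_recent_nightly_start(datetime(2008, 10, 3, 21, 0), time(20, 0))

# The commit from the example: created at 5pm, published at 8:30pm.
committed_at = datetime(2008, 10, 3, 17, 0)
published_at = datetime(2008, 10, 3, 20, 30)

selected_by_commit_time = committed_at <= cutoff   # wrongly includes it
selected_by_publish_time = published_at <= cutoff  # correctly excludes it
```

Only the server that receives the publication knows `published_at`, which is why the client has to ask it for the revision rather than filter by commit date.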

Determining the revision specifier for a nightly start time on a given day will be specific to the VCS tool. In the case of CVS, just the date/time is needed. In the case of svn, the repository server can convert a date/time to a global revision number. In the case of git, each repository has a "reflog" from which the revision published as of a given date/time can be computed (we'll need to come up with a reliable way for the CDash server to obtain this information). Once the revision specifier is determined, it should be saved by CDash for use in answering queries from testing machines.

Computing Revisions in Git

Since git supports "dumb" transport protocols (rsync, http, etc.) there is no way through the git interface to access the reflog of a remote server (it could perhaps be added to git's native "smart" protocol, but we cannot require or depend on that). However, it is possible to write a simple CGI script to get access. In order to make it as easy as possible to set up testing with a git-based project, we should avoid requiring a custom CGI script, or even configuration of a web server. Instead we can create a post-receive (or other) hook script to store the mapping from date to nightly revision in a git object on an otherwise empty branch. Then clients can fetch and inspect that branch (without ever checking it out) to get the revision to be tested.

Alternatively, instead of storing the mapping in a branch, the hook can store it in the refs directory so that standard refspecs can reference it:

#!/bin/bash
# hooks/post-receive
# Record the commit published on each branch under refs/nightly/<branch>/<date>.
while read -r oldrev newrev refname; do
  case "$refname" in
    refs/heads/*)
      branch="${refname#refs/heads/}"
      date="$(date +%Y-%m-%d)" # fix to account for nightly start time
      git update-ref -m "New commit published on this date" \
        "refs/nightly/$branch/$date" "$newrev"
      ;;
  esac
done
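On the client side, the refs written by this hook could be consumed roughly as in the Python sketch below. The helper names are hypothetical. The sketch assumes the `refs/nightly/<branch>/<date>` layout used by the hook, and it assumes that a day with no push has no ref, so the client falls back to the most recent earlier date. Like the hook, it uses calendar dates and does not yet account for the nightly start time.

```python
from datetime import date
from typing import Iterable, List, Optional


def nightly_ref(branch: str, day: date) -> str:
    """Name of the ref the hook writes for `branch` on `day`."""
    return f"refs/nightly/{branch}/{day.isoformat()}"


def fetch_command(remote: str) -> List[str]:
    """Arguments for fetching all nightly refs without checking them out."""
    return ["git", "fetch", remote, "+refs/nightly/*:refs/nightly/*"]


def pick_nightly_ref(available: Iterable[str], branch: str,
                     day: date) -> Optional[str]:
    """Pick the newest ref for `branch` dated on or before `day`."""
    prefix = f"refs/nightly/{branch}/"
    candidates = []
    for ref in available:
        if not ref.startswith(prefix):
            continue
        stamp = ref[len(prefix):]
        if "/" in stamp:
            continue  # belongs to a different branch such as "<branch>/x"
        # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
        if stamp <= day.isoformat():
            candidates.append(ref)
    return max(candidates, default=None)
```

After running the fetch, the list of available refs could be obtained with `git for-each-ref refs/nightly/`, and the chosen ref resolved to a commit with `git rev-parse`.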

This idea is associated with the workflow of testing changes published on a central repo. What about other workflows, such as one that tests published versions with no notion of a nightly start time? In such a workflow the CDash server might be queried by the client to ask what revision to test, or the entire testing setup might be push-based ("I want 'this' version to be tested everywhere").

Reporting Changes

Currently CDash reports changes for all files separately. If one commit modifies multiple files, each file is reported separately with the same commit message. Instead we should report each commit that has been published between two nightly start times. The commit comes with author, date, and a list of files that have been added, removed, or modified. The changes page should show each commit by itself, with one occurrence of the author, time, and log message followed by the files involved. They should be in topological order (this part is probably a CTest-only thing).
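The regrouping described here can be sketched in Python as follows. The record keys (`revision`, `author`, `date`, `message`, `status`, `path`) and the `Commit` class are hypothetical names chosen for illustration, not the CDash or CTest data model. The sketch keeps commits in the order in which they first appear in the input and does not attempt a topological sort.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Commit:
    revision: str
    author: str
    date: str
    message: str
    files: List[Tuple[str, str]] = field(default_factory=list)


def group_by_commit(file_changes: List[dict]) -> List[Commit]:
    """Collapse per-file change records into one entry per commit."""
    commits: Dict[str, Commit] = {}
    for change in file_changes:
        commit = commits.get(change["revision"])
        if commit is None:
            commit = Commit(change["revision"], change["author"],
                            change["date"], change["message"])
            commits[change["revision"]] = commit
        # Each file is listed once under its commit with its status.
        commit.files.append((change["status"], change["path"]))
    # Dictionaries preserve insertion order, so input order is kept.
    return list(commits.values())
```

With this shape, the changes page would print the author, time, and log message once per commit, followed by that commit's file list.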

Example Reports

A modern VCS tool typically provides a way to summarize the changes made by each commit. Here are some examples:

$ svn diff -c $rev --summarize
M      boost/test/framework.hpp
M      boost/test/test_case_template.hpp
M      boost/test/impl/framework.ipp
M      boost/test/impl/unit_test_main.ipp
M      boost/test/impl/xml_log_formatter.ipp
M      boost/test/impl/unit_test_suite.ipp
M      boost/test/unit_test_suite_impl.hpp

$ git show $sha1 --stat
commit e772a5d1e9f25b04f0827cb037b69e3e465bb990
Author: Brad King <brad.king@kitware.com>
Date:   Fri Oct 3 14:41:15 2008 +0000

    ENH: Add UNSUITABLE result to package version test
    
    Package version test files may now declare that they are unsuitable for
    use with the project testing them.  This is important when the version
    being tested does not provide a compatible ABI with the project target
    environment.

 Source/cmFindPackageCommand.cxx                    |   21 +++++++++++++------
 .../lib/zot/zot-config-version.cmake               |   10 +++++++++
 Tests/FindPackageTest/lib/zot/zot-config.cmake     |    2 +
 3 files changed, 26 insertions(+), 7 deletions(-)

E-Mail Notifications

In order to send email to the authors of broken changes, the update information needs to contain author information for all commits that occurred since the last time the same client submitted. If CDash is VCS-aware for the project and the client only tests published versions, the server can compute this information.

On the CTest side, a nightly build can easily identify the last and current versions using the tags from the server. A continuous build must track somewhere in the build tree which versions were tested. An experimental build may have commits that are local to the author's repository (or even working tree changes), so determining what has changed may be tricky. There will have to be a way to compare the tested version against a reference such as the last published version.

Simpler Alternative

CDash does not really need to report global commits. It is easier and better to teach it how to refer to web-based viewers for more VCS tools. The updates page reported for each dashboard submission can stay file-based since it is just for quick reference and also contains work-tree modification and conflict information. The only extension for modern VCS tools can be to teach CTest to report the work-tree global revision before and after updating and teach CDash to turn this information into a link on the updates page to point at the web-based viewer.
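Turning the before and after revisions into a link could look like the Python sketch below. The URL template and the function name are purely hypothetical: the actual URL scheme depends on the web-based viewer a project uses and would have to be configured per project. The revision strings are treated as opaque and only percent-encoded.

```python
from urllib.parse import quote


def revision_link(template: str, old_rev: str, new_rev: str) -> str:
    """Fill a per-project URL template with two opaque revision strings."""
    return template.format(
        old=quote(old_rev, safe=""),
        new=quote(new_rev, safe=""),
    )


# Hypothetical template; a real viewer defines its own URL layout.
TEMPLATE = "https://viewer.example.org/compare/{old}..{new}"

link = revision_link(TEMPLATE, "r1233", "r1234")
```

Because the revisions are never parsed, the same function works for any of the specifier styles listed earlier, as long as the viewer accepts them.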