ITK Release 4/Performance Experiments/Reducing CTest Output
Testing is critical to high-quality software, and ITK developers are expected to produce unit tests for each class. However, some tests produce large amounts of output that may be useful to the developer but place a burden on the CDash database. Furthermore, some hypothesize that test output size may affect the performance of CDash.
This experiment looks at the size of output produced by ITKv4 and looks for ways to reduce a test's output.
Approach
This experiment uses the DMAIC methodology of the Six Sigma management process to "Define", "Measure", "Analyze", "Improve" and "Control" test output in ITKv4. The basic methodology (from Wikipedia) consists of the following five steps:
- Define process goals that are consistent with customer demands and ITKv4's strategy.
- Measure key aspects of the current process and collect relevant data.
- Analyze the data to verify cause-and-effect relationships. Determine what the relationships are, and attempt to ensure that all factors have been considered.
- Improve or optimize the process.
- Control to ensure that any deviations from target are corrected before they result in defects. Set up pilot runs to establish software quality, move on to production, set up control mechanisms and continuously monitor the process.
Define
Reduce the total test output of ITKv4 without affecting code coverage or value of the tests.
Measure
As of October 1, 2011, there were 2202 ITKv4 tests producing 10.6 MB of test output for a single platform. Two tests produced 18% of the output, and 65 of the 2202 tests produced 60% of the output. This data was gathered from a CDash file provided by Dave Cole of Kitware.
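The concentration figures above (a few tests producing a large share of the output) can be computed with a short script. The following is a minimal sketch, assuming the per-test output sizes have already been extracted into a dictionary; the test names and sizes are hypothetical placeholders, not the measured ITKv4 data, and `top_share` is an illustrative helper rather than part of CDash or CTest.

```python
def top_share(sizes, n):
    """Return the fraction of total output produced by the n largest tests."""
    total = sum(sizes.values())
    if total == 0:
        return 0.0
    largest = sorted(sizes.values(), reverse=True)[:n]
    return sum(largest) / total


# Hypothetical per-test output sizes in characters (not the ITKv4 measurements).
sizes = {
    "testA": 9000,
    "testB": 6000,
    "testC": 3000,
    "testD": 1500,
    "testE": 500,
}

print(f"top 2 of {len(sizes)} tests: {top_share(sizes, 2):.0%} of output")
```

Applied to the real per-test sizes, the same function would reproduce statements such as "65 of the 2202 tests produced 60% of the output".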
Analyze
The top ten test output producers are:
- itkSampleToHistogramFilterTest4 reports failures regarding expected frequencies. Even though these checks fail, the test does not indicate failure. It turns out that the test itself is flawed.
- itkSystemInformationTest echoes the output of several CMake files produced during the build process, e.g. CMakeCache.txt
- vnl_test_alignment is a third-party test producing over 400,000 characters of output.
- itkNumericTraitsTest provides information about numeric limits and capabilities for a given platform.
- itkImageRegionExclusionIteratorWithIndexTest provides useful information to the developer, but not necessarily for the test.
- itkSampleToHistogramFilterTest5 provides useful information to the developer, but not necessarily for the test.
- itkImageRegistrationMethodTest_13 produces intermediate results that are useless to the test.
- itkSliceIteratorTest provides useful information to the developer, but not necessarily for the test.
- itkCheckerBoardImageFilterTest produces useless output.
- itkTriangleMeshToBinaryImageFilterTest2 produces too much output.
Analysis of these ten tests reveals the following categories of tests:
- Tests that produce valuable output, even if they pass.
  - itkSystemInformationTest
  - itkNumericTraitsTest
- Tests that produce reasonable output < 1k characters
  - Over 1100 of the 2202 tests produce < 1k characters
- Tests that produce reasonable, but not necessarily valuable output > 1k and < 5k characters
  - There are 690 tests that produce between 1k and 5k characters
- Tests that produce reasonable, but not necessarily valuable output > 5k characters
  - itkImageRegionExclusionIteratorWithIndexTest
  - itkImageRegistrationMethodTest_13
  - itkSliceIteratorTest
  - itkSampleToHistogramFilterTest5
- Tests that produce erroneous output
  - itkSampleToHistogramFilterTest4
- Tests that produce useless output
  - itkCheckerBoardImageFilterTest
  - itkTriangleMeshToBinaryImageFilterTest2
- Tests provided by Third Party software producing unreasonable output
  - vnl_test_alignment
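The size-based part of this classification can be automated. The sketch below assumes that "1k" and "5k" mean 1000 and 5000 characters and that sizes falling exactly on a boundary belong to the middle bucket; both are assumptions, and the test names and sizes are hypothetical. Whether output is valuable, erroneous, or useless still requires manual review.

```python
from collections import Counter


def size_bucket(num_chars):
    """Map an output size in characters to one of three size buckets."""
    if num_chars < 1000:       # assumed meaning of "< 1k"
        return "< 1k"
    if num_chars <= 5000:      # assumed meaning of "between 1k and 5k"
        return "1k to 5k"
    return "> 5k"


def bucket_counts(sizes):
    """Count how many tests fall into each size bucket."""
    return Counter(size_bucket(n) for n in sizes.values())


# Hypothetical per-test output sizes in characters.
example_sizes = {"testA": 120, "testB": 980, "testC": 2400, "testD": 400000}

for bucket, count in sorted(bucket_counts(example_sizes).items()):
    print(f"{bucket}: {count} test(s)")
```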
Improve
A manual analysis of the top test output offenders resulted in a number of Gerrit patches, each of which required manual editing of the tests. The discussion on the mailing lists motivated some developers to review their own tests, and further patches were submitted to correct test errors. New options were added to the ITK test driver to permit full output and redirected output.
CMake and CTest provide capabilities that facilitate improvement:
- Limit the size of test output reported to CDash
  - The variable CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE limits the size of a passing test's output to the given value. The default is 1000 characters. This variable, if present, is set in the CMake/CTestCustom.cmake.in file.
- Override the test output size limit
  - If a test outputs the string CTEST_FULL_OUTPUT, ctest will override the limit.
  - The ITKv4 test driver flag --full-output permits a test to override the CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE limit. This is a convenient way to override the output limits without changing the test.
- Redirect the output of a test to a file
  - The ITKv4 test driver flag --redirect-output FILENAME redirects a test's output to a file, usually in ${ITK_BINARY_DIR}/Testing/Temporary.
Control
The only automated mechanism to control test output is CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE (default 1000), specified in CMake/CTestCustom.cmake.in.
Improved documentation for adding a test reminds developers to keep their test output to a minimum.
Gerrit reviewers are encouraged to look at the output of tests and to suggest that the submitter use an appropriate mechanism to limit the test output.