VTK/Threaded Image Algorithms

From KitwarePublic
Revision as of 05:15, 17 March 2015 by Dgobbi (talk | contribs)

Review of Streaming Pipeline

The job of a vtkAlgorithm is to perform some operation upon some data (specifically, upon some data encapsulated in a vtkDataObject). One important aspect of VTK's streaming pipeline is that the vtkDataObject that the algorithm operates on might be one part of a larger data set. For example, a vtkImageData object might only contain a few slices of a large stack of image slices.

The portion of a data set that is contained within one vtkImageData object is described by the Extent array, which provides a (first, last) index for each of the three dimensions:

Extent = { first_idx_x, last_idx_x, first_idx_y, last_idx_y, first_idx_z, last_idx_z }
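Because both indices in each (first, last) pair are inclusive, the number of samples along an axis is last - first + 1. The following is a small VTK-free sketch of that arithmetic; ExtentDimension and ExtentVoxelCount are hypothetical helper names used only for illustration and are not part of VTK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a VTK function): number of samples along one
// axis (0 = X, 1 = Y, 2 = Z) of an extent whose indices are inclusive.
inline int ExtentDimension(const int extent[6], int axis)
{
  assert(axis >= 0 && axis < 3);
  int n = extent[2 * axis + 1] - extent[2 * axis] + 1;
  return n > 0 ? n : 0; // treat last < first as an empty extent
}

// Hypothetical helper: total number of voxels covered by the extent.
inline std::int64_t ExtentVoxelCount(const int extent[6])
{
  std::int64_t count = 1;
  for (int axis = 0; axis < 3; ++axis)
  {
    count *= ExtentDimension(extent, axis);
  }
  return count;
}
```

For instance, the extent { 128, 255, 0, 255, 1, 100 } covers 128 x 256 x 100 voxels.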

The pipeline tells the algorithm two important pieces of information:

  1. How large the entire data set is (the WHOLE_EXTENT).
  2. What part of the data set the output vtkImageData object should contain after the algorithm runs (the UPDATE_EXTENT).

Splitting the UPDATE_EXTENT among threads

The algorithm is responsible for producing a vtkImageData object whose size and position within the entire data set are described by the UPDATE_EXTENT. But what if we want to divide the work among several threads? Then the UPDATE_EXTENT must be broken into several chunks, each of which is also an extent. For example, let's say that the update extent is { 128, 255, 0, 255, 1, 100 } and that four threads are used:

 UPDATE_EXTENT:     { 128, 255, 0, 255, 1, 100 }
 Thread 0 extent:   { 128, 255, 0, 255, 1, 25 }
 Thread 1 extent:   { 128, 255, 0, 255, 26, 50 }
 Thread 2 extent:   { 128, 255, 0, 255, 51, 75 }
 Thread 3 extent:   { 128, 255, 0, 255, 76, 100 }

In this example, the data has been divided along the Z direction. There is, in fact, a specific method in vtkThreadedImageAlgorithm whose only responsibility is to split the data into pieces:

int SplitExtent (int pieceExtent[6], int updateExtent[6], int piece, int total)

The total is the total number of pieces to divide the updateExtent into (this is always the number of threads to be used). Each thread calls this SplitExtent method with piece set to a different number (in the above example, with numbers 0 through 3). The SplitExtent method writes the extent of data for that thread to operate on into the pieceExtent array.

The updateExtent is usually split up exactly as shown above, with the split occurring along the Z direction. This is a very poor choice if there are only two slices in the stack, but eight CPU cores available to work on the data! Only two of the cores get work to do, and the other six cores get nothing! To partly avoid the absolute worst case, if there is only one slice in the stack, then SplitExtent will do the division along the Y direction instead of the Z direction.
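The splitting rule described above can be sketched without VTK as follows. This is only an illustration, not VTK's actual implementation: the name SplitExtentSketch, the ceiling-division rounding, and the choice to return the number of pieces that actually receive work are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for SplitExtent (not VTK's code).
// Splits along Z, or along Y when there is at most one slice in Z.
// Returns the number of pieces that actually receive work, which can be
// smaller than 'total'. pieceExtent is only meaningful when
// piece < (return value).
int SplitExtentSketch(int pieceExtent[6], const int updateExtent[6],
                      int piece, int total)
{
  assert(total > 0 && piece >= 0);

  // Start from the full update extent; only one axis will be narrowed.
  for (int i = 0; i < 6; ++i)
  {
    pieceExtent[i] = updateExtent[i];
  }

  // Prefer Z; fall back to Y if Z has at most one slice.
  int axis = 2;
  if (updateExtent[5] - updateExtent[4] + 1 <= 1)
  {
    axis = 1;
  }

  const int first = updateExtent[2 * axis];
  const int last = updateExtent[2 * axis + 1];
  const int count = last - first + 1;
  if (count <= 0)
  {
    return 0; // empty extent, nothing to split
  }

  // Ceiling division, so every piece except possibly the last is full.
  const int perPiece = (count + total - 1) / total;
  const int pieces = (count + perPiece - 1) / perPiece;
  if (piece >= pieces)
  {
    return pieces; // this piece gets no work
  }

  pieceExtent[2 * axis] = first + piece * perPiece;
  int pieceLast = pieceExtent[2 * axis] + perPiece - 1;
  if (pieceLast > last)
  {
    pieceLast = last;
  }
  pieceExtent[2 * axis + 1] = pieceLast;
  return pieces;
}
```

With the update extent { 128, 255, 0, 255, 1, 100 } and total set to 4, this sketch yields the Z ranges 1 to 25, 26 to 50, 51 to 75, and 76 to 100. With only two slices and total set to 8, it reports that just two pieces receive work, which is the poor case described above.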

Multithreading

Now that we have seen how the work is divided into pieces, the next stop on our tour of the vtkThreadedImageAlgorithm is the main execution method, which is a virtual method that you would define in your own subclasses:

// Declared virtual in vtkThreadedImageAlgorithm; override it in your subclass.
void vtkThreadedImageAlgorithm::ThreadedRequestData(
  vtkInformation *request,
  vtkInformationVector **inputVector,
  vtkInformationVector *outputVector,
  vtkImageData ***inputData,
  vtkImageData **outputData,
  int pieceExtent[6],
  int piece)
{
  // code goes here
}

This method is called from each thread that is created to operate on the data. Let's ignore request, which is merely used for pipeline bookkeeping, and the inputVector and outputVector, which are very important but beyond the current topic of discussion. What we have left are the inputData, the outputData, and the pieceExtent and piece that we discussed in the previous section. In the VTK codebase, you'll usually see threadId instead of piece, because each thread gets one piece of the data. The job of the code that replaces the "code goes here" comment is to fill in the output voxel values that correspond to the pieceExtent. Sounds simple, right?
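To make "fill in the voxel values that correspond to the pieceExtent" concrete, here is a VTK-free sketch of the kind of loop such a method body contains. It assumes, purely for illustration, that the input and output are flat float buffers covering the same bufferExtent with X varying fastest; VoxelIndex and ScalePiece are hypothetical names, and a real filter would obtain its pointers and increments from the vtkImageData objects instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: linear index of voxel (x, y, z) in a buffer that
// covers bufferExtent, with X varying fastest, then Y, then Z.
inline std::size_t VoxelIndex(const int bufferExtent[6], int x, int y, int z)
{
  const std::size_t nx =
    static_cast<std::size_t>(bufferExtent[1] - bufferExtent[0] + 1);
  const std::size_t ny =
    static_cast<std::size_t>(bufferExtent[3] - bufferExtent[2] + 1);
  const std::size_t ix = static_cast<std::size_t>(x - bufferExtent[0]);
  const std::size_t iy = static_cast<std::size_t>(y - bufferExtent[2]);
  const std::size_t iz = static_cast<std::size_t>(z - bufferExtent[4]);
  return (iz * ny + iy) * nx + ix;
}

// Hypothetical per-thread work: out = in * scale, restricted to the voxels
// inside pieceExtent. Voxels outside pieceExtent are left untouched, so
// threads working on disjoint pieces never write to the same voxel.
void ScalePiece(const std::vector<float>& in, std::vector<float>& out,
                const int bufferExtent[6], const int pieceExtent[6],
                float scale)
{
  assert(in.size() == out.size());
  for (int z = pieceExtent[4]; z <= pieceExtent[5]; ++z)
  {
    for (int y = pieceExtent[2]; y <= pieceExtent[3]; ++y)
    {
      for (int x = pieceExtent[0]; x <= pieceExtent[1]; ++x)
      {
        const std::size_t i = VoxelIndex(bufferExtent, x, y, z);
        out[i] = in[i] * scale;
      }
    }
  }
}
```

The key design point is that each thread writes only inside its own pieceExtent, which is what allows the pieces to be processed concurrently without locking.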

Progress Reporting

The algorithm can communicate with the rest of the application while it is executing. In order to allow non-thread-safe modes of communication to be used, only one of the algorithm's threads is allowed to talk to the application. Specifically, only the main thread (the one that gets piece number 0) can do this. This thread is special, because it is the thread from which the other threads were split.

You will often see code like the following inside the VTK algorithms, where the algorithm tells the application what fraction of the work has been done. Of course, this is only an estimate of the progress, because this thread only knows how much of its own work has been done. It doesn't know what the other threads are doing!

 if (threadId == 0)
   {
   // Provide a fraction between 0.0 and 1.0
   this->UpdateProgress(progressFraction);
   }
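As a rough sketch of where the progressFraction might come from, the loop below processes a number of rows and lets only thread 0 report, once every reportEvery rows. Everything here is hypothetical: ProcessRowsWithProgress is not a VTK function, and the updateProgress callback merely stands in for a call such as this->UpdateProgress(...).

```cpp
#include <cassert>
#include <functional>

// Hypothetical loop skeleton (not VTK code). Only threadId 0 reports
// progress, and the fraction reflects that thread's own rows only.
// Returns the number of progress reports that were issued.
int ProcessRowsWithProgress(int threadId, int rowCount, int reportEvery,
                            const std::function<void(double)>& updateProgress)
{
  assert(reportEvery > 0);
  int reports = 0;
  for (int row = 0; row < rowCount; ++row)
  {
    // ... the per-row voxel work would go here ...

    if (threadId == 0 && (row + 1) % reportEvery == 0)
    {
      // Fraction between 0.0 and 1.0 of this thread's rows completed.
      updateProgress(static_cast<double>(row + 1) / rowCount);
      ++reports;
    }
  }
  return reports;
}
```

Reporting only every few rows keeps the overhead of the callback small compared with the work done per row.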

Some Tricky Problems

Future