Revision as of 13:31, 6 June 2011
Writing ParaView Readers
If the format of your data files is not one supported by default in ParaView, you must either convert your files to a format ParaView can read or write your own data file reader for ParaView. The reader must operate within a standard VTK pipeline. In this chapter, we will discuss integrating the new reader class into VTK, including outlining which C++ methods should be implemented for the reader to work properly. The necessary user interface and server manager XML will be described. Creating parallel readers and readers that output multiple parts will also be covered. For VTK information beyond the scope of the chapter, see The VTK User's Guide by Kitware, Inc.
Integrating with VTK
VTK is written in C++, and new readers should also be written in this language. A reader plays the role of a source in a VTK pipeline, and must be implemented as a class deriving from vtkAlgorithm or one of its subclasses. The best choice for the immediate superclass of a new reader depends on the reader's output type. For example, a reader producing an instance of vtkPolyData may derive from vtkPolyDataAlgorithm to simplify its implementation. In order for a reader to function properly within VTK's pipeline mechanism, it must be able to respond to standard requests. This is implemented by overriding one or more of the following methods from the chosen superclass.
The interface to these methods utilizes vtkInformation objects, which are heterogeneous maps storing key/value pairs. Many of these methods have the same three arguments. The first argument is of type vtkInformation* and contains at least one key specifying the request itself. The second argument is of type vtkInformationVector** and stores information about the input connections to the algorithm. This can be ignored by readers because they have no input connections. The third argument is of type vtkInformationVector* and contains one vtkInformation object for each output port of the algorithm. Most readers will have only one output port, but some may have multiple output ports (see the next section). All output information and data from the reader will be stored in one of these information objects.
ProcessRequest: This method is the entry point into a vtkAlgorithm through which the pipeline makes requests. A reader may override this method and implement responses to all requests. The method should be placed in the public section of the reader class. It should return 1 for success and 0 for failure. Full documentation of this method is beyond the scope of this chapter. Most readers should derive from one of the output-type-specific classes and implement the request-specific methods described below.
RequestInformation: This method is invoked by the superclass's ProcessRequest implementation when it receives a REQUEST_INFORMATION request. In the output port, it should store information about the data in the input file. For example, if the reader produces structured data, then the whole extent should be set here (shown below).
<source lang="cpp">
int vtkExampleReader::RequestInformation(
  vtkInformation*, vtkInformationVector**, vtkInformationVector* outVec)
{
  vtkInformation* outInfo = outVec->GetInformationObject(0);
  int extent[6];

  // ... read file to find available extent ...

  // store that in the pipeline
  outInfo->Set(vtkStreamingDemandDrivenPipeline::WHOLE_EXTENT(), extent, 6);

  // ... store other information ...

  return 1;
}
</source>
This method is necessary when configuring a reader to operate in parallel. (This will be further discussed later in this chapter.) It should be placed in the protected section of the reader class. The method should return 1 for success and 0 for failure.
RequestData: This method is invoked by the superclass's ProcessRequest implementation when it receives a REQUEST_DATA request. It should read data from the file and store it in the corresponding data object in the output port. The output data object will have already been created by the pipeline before this request is made. The amount of data to read may be specified by keys in the output port information. For example, if the reader produces vtkImageData, this method might look like this.
<source lang="cpp">
int vtkExampleReader::RequestData(
  vtkInformation*, vtkInformationVector**, vtkInformationVector* outVec)
{
  vtkInformation* outInfo = outVec->GetInformationObject(0);
  vtkImageData* outData = vtkImageData::SafeDownCast(
    outInfo->Get(vtkDataObject::DATA_OBJECT()));

  int extent[6] = {0, -1, 0, -1, 0, -1};
  outInfo->Get(vtkStreamingDemandDrivenPipeline::UPDATE_EXTENT(), extent);

  outData->SetExtent(extent);

  // ... read data for this extent from the file ...

  return 1;
}
</source>
The method should be placed in the protected section of the reader class. It should return 1 for success and 0 for failure.
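The relationship between ProcessRequest and the request-specific methods can be sketched in plain C++. This is only a stand-alone model of the dispatch idea, not VTK code: the names SketchAlgorithm, SketchReader, and the Request enum are hypothetical, and the three information arguments of the real methods are omitted for brevity.

```cpp
#include <string>

// Stand-in for the request types named above (not VTK's key objects).
enum class Request { Information, Data, Other };

// Hypothetical superclass: its ProcessRequest routes each request to a
// request-specific virtual method, returning 1 for success, 0 for failure.
class SketchAlgorithm
{
public:
  virtual ~SketchAlgorithm() = default;

  int ProcessRequest(Request request)
  {
    switch (request)
    {
      case Request::Information:
        return this->RequestInformation();
      case Request::Data:
        return this->RequestData();
      default:
        return 1; // requests this sketch does not model
    }
  }

protected:
  virtual int RequestInformation() { return 1; }
  virtual int RequestData() { return 1; }
};

// Hypothetical reader: it overrides only the request-specific methods,
// placed in the protected section as recommended above.
class SketchReader : public SketchAlgorithm
{
public:
  std::string Log; // records which methods ran, for illustration only

protected:
  int RequestInformation() override
  {
    this->Log += "info;";
    return 1;
  }
  int RequestData() override
  {
    this->Log += "data;";
    return 1;
  }
};
```

The reader never overrides the entry point itself; it relies on the superclass to call the right method for each request.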
CanReadFile: The purpose of this method is to determine whether this reader can read a specified data file. Its input parameter is a const char* specifying the name of a data file. In this method, you should not actually read the data but determine whether it is the correct format to be read by this reader. This method should return an integer value: 1 indicates that the specified file is of the correct type; 0 indicates it is not. It is not absolutely required that this method be implemented, but ParaView will make use of it if it exists.
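A minimal sketch of such a check follows, written as a free function using only the C++ standard library. The five-byte signature "MYFMT" and the name CanReadExampleFile are made up for illustration; a real reader would test whatever header or magic number its own format defines.

```cpp
#include <cstring>
#include <fstream>

// Hypothetical format check: returns 1 if the file begins with the
// made-up 5-byte signature "MYFMT", and 0 otherwise. Only the header
// is examined; none of the actual data is read.
int CanReadExampleFile(const char* fname)
{
  if (!fname)
  {
    return 0;
  }
  std::ifstream in(fname, std::ios::binary);
  if (!in)
  {
    return 0; // file is missing or unreadable
  }
  char magic[5];
  in.read(magic, sizeof(magic));
  if (in.gcount() != static_cast<std::streamsize>(sizeof(magic)))
  {
    return 0; // file is too short to hold the signature
  }
  return std::memcmp(magic, "MYFMT", sizeof(magic)) == 0 ? 1 : 0;
}
```

Keeping the check this cheap matters because it may be called on several candidate readers before one of them is chosen.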
SetFileName: This method allows you to specify the name of the data file to be loaded by your reader. The method is not required to have this exact name, but a method with this functionality must be implemented. The easiest way to implement SetFileName is with a vtkSetStringMacro in the header file for this class. (There is also an associated vtkGetStringMacro for implementing GetFileName.) This method handles allocating an array to contain the file name and lets the reader know that the pipeline should be updated when the name is changed.
<source lang="cpp"> vtkSetStringMacro(FileName); </source>
When using this macro, you must also add a FileName instance variable of type char* in the protected section of this class. In the constructor for your reader, assign FileName the value NULL before you use SetFileName for the first time. In the destructor for your reader, call SetFileName(0) to free the file name storage.
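The bookkeeping described above can be modeled in plain C++ as follows. This is a sketch of the behavior, not the macro's actual expansion: the class name ExampleFileNameHolder is hypothetical, and the MTime counter stands in for the mechanism that tells the pipeline the reader has been modified.

```cpp
#include <cstring>

// Stand-alone model of a reader's file name handling (not VTK code).
class ExampleFileNameHolder
{
public:
  // FileName starts as a null pointer, as recommended above.
  ExampleFileNameHolder() : FileName(nullptr), MTime(0) {}

  // Passing a null pointer releases the storage.
  ~ExampleFileNameHolder() { this->SetFileName(nullptr); }

  ExampleFileNameHolder(const ExampleFileNameHolder&) = delete;
  ExampleFileNameHolder& operator=(const ExampleFileNameHolder&) = delete;

  void SetFileName(const char* name)
  {
    // Nothing to do if the name is unchanged.
    if (this->FileName == nullptr && name == nullptr)
    {
      return;
    }
    if (this->FileName && name && std::strcmp(this->FileName, name) == 0)
    {
      return;
    }
    delete[] this->FileName;
    this->FileName = nullptr;
    if (name)
    {
      this->FileName = new char[std::strlen(name) + 1];
      std::strcpy(this->FileName, name);
    }
    ++this->MTime; // mark the object as modified
  }

  const char* GetFileName() const { return this->FileName; }
  unsigned long GetMTime() const { return this->MTime; }

protected:
  char* FileName;
  unsigned long MTime;
};
```

Setting the same name twice leaves the modification counter untouched, so a pipeline watching it would not re-execute needlessly.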
Multi-Group (Multi-Block and AMR) Readers
As of VTK 5.0 and ParaView 2.4, multi-block and AMR datasets are supported. Multi-group readers follow the same guidelines as described in the previous sections. For convenience, you can subclass multi-block readers from vtkMultiBlockDataSetAlgorithm and AMR readers from vtkHierarchicalDataSetAlgorithm. If you do not subclass from one of these classes, make sure to implement the CreateDefaultExecutive(), FillOutputPortInformation(), and FillInputPortInformation() methods appropriately. (You can use vtkMultiBlockDataSetAlgorithm as a starting point.) Two good examples of multi-group dataset readers are vtkMultiBlockPLOT3DReader and vtkXMLHierarchicalDataReader.
Parallel Readers
Unless otherwise specified, a VTK reader used in ParaView will cause the entire data set to be read on the first process. ParaView will then redistribute the data to the other processes. It is more desirable to have ParaView do the reading in parallel as well, so that the data set is already appropriately divided across the processors. Changes should be made in two methods for ParaView readers to operate in parallel: RequestInformation and RequestData. Exactly which changes should be made depends on whether structured or unstructured data set types are to be read.
Structured
In RequestInformation for readers of structured data (vtkStructuredGrid, vtkRectilinearGrid, or vtkImageData), the WHOLE_EXTENT key should be set on the output information object, so that downstream filters can take into account the overall dimensions of the data. The WHOLE_EXTENT is specified using six parameters: the minimum and maximum index along each of the three coordinate axes (imin, imax, jmin, jmax, kmin, kmax or a single int array of length 6) for the entire data set. Example C++ code demonstrating this is shown below.
<source lang="cpp">
int vtkDEMReader::RequestInformation(
  vtkInformation* vtkNotUsed(request),
  vtkInformationVector** vtkNotUsed(inputVector),
  vtkInformationVector* outputVector)
{
  vtkInformation* outInfo = outputVector->GetInformationObject(0);
  int extent[6];

  // Read entire extent from file.
  ...

  outInfo->Set(vtkStreamingDemandDrivenPipeline::WHOLE_EXTENT(), extent, 6);

  return 1;
}
</source>
Before doing any processing in the RequestData method, first get the UPDATE_EXTENT from the output information object. This key contains the sub-extent of the WHOLE_EXTENT for which the current process is responsible. The current process is responsible for filling in the data values for the update extent that is returned. An example of doing this is shown below.
<source lang="cpp">
int vtkDEMReader::RequestData(
  vtkInformation* vtkNotUsed(request),
  vtkInformationVector** vtkNotUsed(inputVector),
  vtkInformationVector* outputVector)
{
  // get the data object and find out what part we need to
  // read now
  vtkInformation* outInfo = outputVector->GetInformationObject(0);
  int subext[6];
  outInfo->Get(vtkStreamingDemandDrivenPipeline::UPDATE_EXTENT(), subext);

  // read that part of the data in from the file
  // and put it in the output data
  ...

  return 1;
}
</source>
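Once the update extent is known, the reader has to work out how much data it covers and where that data sits in the file. The helpers below sketch that arithmetic in plain C++. They are hypothetical, not part of VTK, and they assume the file stores one value per point for the whole extent with the i index varying fastest, then j, then k.

```cpp
#include <cstddef>

// Number of points covered by an extent {imin, imax, jmin, jmax, kmin, kmax}.
// An extent whose maximum is below its minimum on any axis is empty.
std::size_t NumberOfPoints(const int ext[6])
{
  std::size_t n = 1;
  for (int axis = 0; axis < 3; ++axis)
  {
    const int count = ext[2 * axis + 1] - ext[2 * axis] + 1;
    if (count <= 0)
    {
      return 0;
    }
    n *= static_cast<std::size_t>(count);
  }
  return n;
}

// Position of point (i, j, k) within the file, counted in points from the
// start of the whole extent. Assumes (i, j, k) lies inside the whole extent
// and that i varies fastest in the file (an assumption of this sketch).
std::size_t PointIndexInFile(const int whole[6], int i, int j, int k)
{
  const std::size_t ni = static_cast<std::size_t>(whole[1] - whole[0] + 1);
  const std::size_t nj = static_cast<std::size_t>(whole[3] - whole[2] + 1);
  const std::size_t di = static_cast<std::size_t>(i - whole[0]);
  const std::size_t dj = static_cast<std::size_t>(j - whole[2]);
  const std::size_t dk = static_cast<std::size_t>(k - whole[4]);
  return (dk * nj + dj) * ni + di;
}
```

A reader could loop over k, j, and i within the update extent, use the index to seek to each run of values in the file, and copy them into the output.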
Unstructured
In the unstructured case (vtkPolyData or vtkUnstructuredGrid), the MAXIMUM_NUMBER_OF_PIECES key should be set on the output information object in RequestInformation. This specifies the maximum number of pieces that can be generated from the file. If this reader can only read the entire data set, then the maximum number of pieces should be set to 1. If the input file can be read into as many pieces as needed (i.e., one per processor), the maximum number of pieces should be set to -1, as shown below.
<source lang="cpp"> outInfo->Set(vtkStreamingDemandDrivenPipeline::MAXIMUM_NUMBER_OF_PIECES(), -1); </source>
In the RequestData method, the reader should first get the UPDATE_NUMBER_OF_PIECES and UPDATE_PIECE_NUMBER from the output information object. The value returned from getting the UPDATE_NUMBER_OF_PIECES specifies the number of pieces into which the output data set will be broken. Getting UPDATE_PIECE_NUMBER returns the piece number (0 to UPDATE_NUMBER_OF_PIECES-1) for which the current process is responsible. The reader should use this information to determine which part of the dataset the current process should read. Example code that demonstrates this is shown below.
<source lang="cpp">
int vtkUnstructuredGridReader::RequestData(
  vtkInformation*, vtkInformationVector**, vtkInformationVector* outputVector)
{
  vtkInformation* outInfo = outputVector->GetInformationObject(0);
  int piece, numPieces;

  piece = outInfo->Get(
    vtkStreamingDemandDrivenPipeline::UPDATE_PIECE_NUMBER());
  numPieces = outInfo->Get(
    vtkStreamingDemandDrivenPipeline::UPDATE_NUMBER_OF_PIECES());

  // skip to proper offset in the file and read piece
  ...

  return 1;
}
</source>
It is possible that your data file can only be broken into a specified number of pieces, and that this number is different from the number of processors being used (i.e., the result of getting UPDATE_NUMBER_OF_PIECES). If the number of processors is larger than the possible number of pieces, then each processor beyond the number of available pieces should produce an empty output by calling Initialize() on the output. If the number of processors is smaller than the number of pieces, you should internally distribute the extra pieces across the processors. For example, if your dataset can produce ten pieces and you are using five processors for reading, then process 0 could read pieces 0 and 5, process 1 could read pieces 1 and 6, and so on.
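The round-robin assignment in that example can be written as a small helper. This is a sketch in plain C++; the function name FilePiecesForProcess and its parameter names are hypothetical. Here piece and numPieces correspond to the values obtained from UPDATE_PIECE_NUMBER and UPDATE_NUMBER_OF_PIECES, and filePieces is the number of pieces the file can actually be broken into.

```cpp
#include <vector>

// Returns the file pieces that the process handling `piece` (out of
// `numPieces`) should read when the file holds `filePieces` pieces.
// Pieces are dealt out round-robin: piece, piece + numPieces, and so on.
// An empty result means this process should produce an empty output.
std::vector<int> FilePiecesForProcess(int piece, int numPieces, int filePieces)
{
  std::vector<int> result;
  if (numPieces <= 0 || piece < 0 || piece >= numPieces)
  {
    return result; // invalid request: nothing to read
  }
  for (int p = piece; p < filePieces; p += numPieces)
  {
    result.push_back(p);
  }
  return result;
}
```

With ten file pieces and five processes, process 0 receives pieces 0 and 5. With three file pieces and eight processes, processes 3 through 7 receive nothing and should call Initialize() on their output.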
Required XML
To use your new reader within ParaView, you must write XML code for the server manager and for the client. The server-side XML for ParaView's readers is located in the file Servers/ServerManager/Resources/readers.xml. When ParaView is compiled, the server manager XML code for a reader is used to create a proxy object for it. The ParaView client accesses the VTK class for the reader on the server through the proxy. Below is an excerpt from the server manager XML code for vtkXMLPolyDataReader. A few parts of this XML code segment are particularly worth noting; more extensive information about server manager XML code can be found in the section devoted to it.
First, notice the StringVectorProperty element named "FileName". It has a command attribute called "SetFileName". ParaView uses this property to tell the reader what file it should examine. This is done very early in the lifetime of a reader. Typically, ParaView code will give the reader a filename and then call the reader's CanReadFile method. If CanReadFile succeeds, ParaView will first call RequestInformation to get general information about the data within the file, and then call RequestData to read the actual data and produce a vtkDataObject.
The second notable portion of the XML code below is two properties that let the user choose particular arrays to load from the data file. When ParaView sees them, it creates a section on the Properties tab for the reader that lets the user select from the available cell-centered data arrays. In the following discussion, one can simply replace 'cell' with 'point' in order to let the user choose from the available, point-centered arrays.
The StringVectorProperty named "CellArrayStatus" lets the ParaView client call the SetCellArrayStatus method on the server. The SetCellArrayStatus method is how the reader is told what it should do with each of the arrays the data file contains. It takes two parameters; the first is the name of the array, and the second is an integer indicating whether to read the array (1) or not (0).
The StringVectorProperty named "CellArrayInfo" is an information property; that is, one through which the ParaView client gathers information from the server. The ParaView client uses it to gather the names of the cell-centered arrays from the reader. (See the nested Property sub-element in CellArrayStatus's ArraySelectionDomain element.) In order for this collection to work, two methods must be implemented in the reader. GetNumberOfCellArrays takes no parameters and returns the number (an int) of cell-centered arrays in the data file. GetCellArrayName takes a single parameter, the array's index (starting from 0), and returns the name of the array with this index (a const char*), or NULL if the array has no name or the index is out of range.
<source lang="xml">
<SourceProxy name="XMLPolyDataReader"
             class="vtkXMLPolyDataReader"
             label="XML Polydata reader">
  <StringVectorProperty name="FileName"
                        command="SetFileName"
                        animateable="0"
                        number_of_elements="1">
    <FileListDomain name="files"/>
  </StringVectorProperty>

  <StringVectorProperty name="CellArrayInfo"
                        information_only="1">
    <ArraySelectionInformationHelper attribute_name="Cell"/>
  </StringVectorProperty>

  <StringVectorProperty name="CellArrayStatus"
                        command="SetCellArrayStatus"
                        number_of_elements="0"
                        repeat_command="1"
                        number_of_elements_per_command="2"
                        element_types="2 0"
                        information_property="CellArrayInfo"
                        label="Cell Arrays">
    <ArraySelectionDomain name="array_list">
      <RequiredProperties>
        <Property name="CellArrayInfo" function="ArrayList"/>
      </RequiredProperties>
    </ArraySelectionDomain>
  </StringVectorProperty>
</SourceProxy>
</source>
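On the C++ side, the three methods that these properties rely on could be backed by a simple list of names and flags. The sketch below is plain C++ rather than VTK code. The class name ExampleArraySelection and the helpers AddArray and GetCellArrayStatus are hypothetical; only GetNumberOfCellArrays, GetCellArrayName, and SetCellArrayStatus are named in the text above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-alone model of a reader's cell array selection (not VTK code).
class ExampleArraySelection
{
public:
  // Hypothetical helper: register an array found in the file.
  // Arrays are enabled by default in this sketch.
  void AddArray(const std::string& name)
  {
    this->Arrays.push_back(std::make_pair(name, 1));
  }

  int GetNumberOfCellArrays() const
  {
    return static_cast<int>(this->Arrays.size());
  }

  // Returns a null pointer if the index is out of range or the array
  // has no name.
  const char* GetCellArrayName(int index) const
  {
    if (index < 0 || index >= this->GetNumberOfCellArrays())
    {
      return nullptr;
    }
    const std::string& name = this->Arrays[static_cast<std::size_t>(index)].first;
    return name.empty() ? nullptr : name.c_str();
  }

  // status: 1 to read the array, 0 to skip it.
  void SetCellArrayStatus(const char* name, int status)
  {
    if (!name)
    {
      return;
    }
    for (std::pair<std::string, int>& entry : this->Arrays)
    {
      if (entry.first == name)
      {
        entry.second = status ? 1 : 0;
      }
    }
  }

  // Hypothetical helper: returns 0 for unknown arrays.
  int GetCellArrayStatus(const char* name) const
  {
    if (!name)
    {
      return 0;
    }
    for (const std::pair<std::string, int>& entry : this->Arrays)
    {
      if (entry.first == name)
      {
        return entry.second;
      }
    }
    return 0;
  }

private:
  std::vector<std::pair<std::string, int> > Arrays;
};
```

A reader would typically fill the list while scanning the file in RequestInformation and consult the flags in RequestData to decide which arrays to load.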
The client-side XML is extremely simple. The purpose of the client-side XML is to enable readers from the server-side XML and to associate file extensions with them. In the case where multiple readers can read files that have the same file extensions, the CanReadFile method is called on each one in order to choose the correct one for each data file. The client-side XML file for readers is located in Qt/Components/Resources/XML/ParaViewReaders.xml. The portion of that file related to XMLPolyDataReader follows.
<source lang="xml">
<Reader name="XMLPolyDataReader"
extensions="vtp" file_description="VTK PolyData Files">
</Reader>
</source>
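The selection rule described above (match on file extension, then ask each candidate's CanReadFile) can be illustrated with a small stand-alone sketch. Nothing here is ParaView's actual implementation: ReaderEntry, ExtensionOf, and ChooseReader are hypothetical names, and the readers are represented only by a name, an extension, and a callable.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical registry entry for one reader.
struct ReaderEntry
{
  std::string Name;
  std::string Extension;                       // without the leading dot
  std::function<int(const char*)> CanReadFile; // returns 1 if readable
};

// Text after the last '.' in the file name, or empty if there is none.
std::string ExtensionOf(const std::string& file)
{
  const std::string::size_type dot = file.rfind('.');
  return dot == std::string::npos ? std::string() : file.substr(dot + 1);
}

// Returns the name of the first registered reader whose extension matches
// and whose CanReadFile accepts the file, or an empty string if none does.
std::string ChooseReader(const std::vector<ReaderEntry>& readers,
                         const std::string& file)
{
  const std::string ext = ExtensionOf(file);
  for (const ReaderEntry& reader : readers)
  {
    if (reader.Extension == ext && reader.CanReadFile &&
        reader.CanReadFile(file.c_str()) == 1)
    {
      return reader.Name;
    }
  }
  return std::string();
}
```

When two readers share an extension, the one whose CanReadFile returns 1 wins, which is why that method should inspect the file contents rather than just the name.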
An alternative to modifying the ParaView source code directly is to use ParaView's plugin architecture. With a plugin, the same C++ source code and XML content must be written, but it is kept outside of the ParaView source code proper and is compiled separately from ParaView. Plugins are discussed in a separate chapter.