VTK/Supporting Arrays With Arbitrary Memory Layouts


Background

Not too long ago, the notion of Mapped Arrays was introduced in VTK. The main objective was to let VTK handle data arrays with arbitrary memory layouts. The motivation was to support in-situ data processing use-cases without having to deep-copy arrays from simulation data structures into VTK's memory layout when the simulation used a different memory layout than VTK's. While mapped arrays enabled supporting arbitrary memory layouts in VTK for the first time, a few drawbacks came to light as they started to be used by a wider community:

  1. Adding a new array layout required developers to implement a large number of pure-virtual methods defined in its superclass hierarchy. The list is discouragingly long, especially for simulation developers who are not too familiar with VTK. Even for veteran VTK developers, this was tedious at the least.
  2. The virtual API for accessing values in a generic way meant that access was invariably slow, since virtual calls kept compilers from inlining and performing further optimizations.

This new design addresses both these issues.

Caveats

This design targets vtkDataArray and its subclasses alone. Hence, this discussion does not apply to vtkAbstractArray subclasses that are not also vtkDataArray subclasses, e.g. vtkStringArray and vtkVariantArray. Since such arrays are rarely, if ever, encountered in in-situ use-cases, this should not be an issue.

Classes

[Figure: GenericArraysUML.png, class diagram of the arrays described below]

vtkGenericDataArray is a new subclass of vtkDataArray that does all the heavy lifting to implement the pure-virtual API expected by vtkAbstractArray and vtkDataArray, and defines a new concept that its subclasses should implement. All methods that subclasses are expected to implement are non-virtual and hence allow for fast access and compile-time optimizations.
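One common C++ technique for letting a base class call subclass methods without virtual dispatch is the curiously recurring template pattern (CRTP). The sketch below assumes that technique purely for illustration. The names GenericArraySketch, RampArraySketch and SumComponent are hypothetical and are not VTK API; only the method name GetComponentFast is borrowed from this page.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a generic array base class. It is templated on
// the derived class, so calls into the derived class resolve at compile time.
template <class DerivedT, class ValueT>
class GenericArraySketch
{
public:
  // A layout-agnostic algorithm written once in the base class.
  ValueT SumComponent(std::size_t numTuples, int comp) const
  {
    const DerivedT* self = static_cast<const DerivedT*>(this);
    ValueT sum = ValueT();
    for (std::size_t t = 0; t < numTuples; ++t)
    {
      // Non-virtual call into the subclass; the compiler is free to inline it.
      sum += self->GetComponentFast(t, comp);
    }
    return sum;
  }
};

// Hypothetical "layout" that stores nothing: the value of component `comp`
// in tuple `tuple` is computed as tuple * 10 + comp.
class RampArraySketch : public GenericArraySketch<RampArraySketch, int>
{
public:
  int GetComponentFast(std::size_t tuple, int comp) const
  {
    assert(comp >= 0);
    return static_cast<int>(tuple) * 10 + comp;
  }
};
```

With this arrangement, `RampArraySketch().SumComponent(3, 1)` adds the values 1, 11 and 21 without a single virtual call.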

vtkSoADataArrayTemplate and vtkAoSDataArrayTemplate are concrete subclasses templated over ScalarTypeT. SoA stands for Structure of Arrays, while AoS stands for Array of Structures (the traditional VTK array memory layout).
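To make the two layouts concrete, here is a minimal sketch of how a (tuple, component) pair maps to storage in each case. It uses plain std::vector rather than VTK classes, and the helper names GetAoS and GetSoA are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AoS: all components of a tuple are adjacent in one buffer:
//   x0 y0 z0 x1 y1 z1 ...
inline float GetAoS(const std::vector<float>& data, int numComps,
                    std::size_t tuple, int comp)
{
  assert(comp >= 0 && comp < numComps);
  return data[tuple * static_cast<std::size_t>(numComps) +
              static_cast<std::size_t>(comp)];
}

// SoA: each component lives in its own buffer:
//   x0 x1 x2 ... | y0 y1 y2 ... | z0 z1 z2 ...
inline float GetSoA(const std::vector<std::vector<float> >& comps,
                    std::size_t tuple, int comp)
{
  assert(comp >= 0 && static_cast<std::size_t>(comp) < comps.size());
  return comps[static_cast<std::size_t>(comp)][tuple];
}
```

Both helpers return the same logical value for the same (tuple, component) pair; only the address computation differs.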

Let's take a closer look at the hierarchy.

  1. vtkAbstractArray is the abstract base class for all arrays in VTK (numeric or otherwise). It provides API that does not assume any type for the scalar or tuple values, e.g. Get|SetNumberOfComponents, Get|SetNumberOfTuples, etc. Even the methods provided by this class that operate on tuples, such as SetTuple and InsertTuple, take another vtkAbstractArray as the argument from which to get the tuple value to set or insert, and are thus agnostic of the actual data type of the tuples.
  2. vtkDataArray qualifies vtkAbstractArray for numeric arrays. It can therefore expose API that relies on this assumption, e.g. GetTuple, SetTuple, InsertTuple etc. with double or float types, since all numeric types can be converted to double or float (with known precision errors, of course). When using this API, the user is aware of two things: i) there may be type conversions and hence loss in precision, ii) this may be slow since there may be type conversions.
  3. vtkGenericDataArray extends vtkDataArray to add awareness of the scalar type. Since this is a templated class, it can provide type-aware API to Get/Set/Insert values in tuples. The nice thing here is that the implementations of all these Get/Set/Insert methods are totally agnostic of the memory layout and yet are reasonably fast (except for some overhead converting tuple index to value index and vice-versa). This is possible because of the introduction of the concept of TupleIterator.
  4. vtkSoADataArrayTemplate and vtkAoSDataArrayTemplate are subclasses that support the two most commonly encountered memory layouts in the scientific community, namely structure of arrays and array of structures respectively. Both are classes templated over ScalarType. Besides API specific to these subclasses, like SetArray, these classes provide 4 methods. The TupleIterator they use (vtkGenericDataArrayTupleIterator) expects GetComponentFast and GetTupleFast, while vtkGenericDataArray requires AllocateTuples and ReallocateTuples. In most cases, these four are all the methods one typically has to implement to support a new memory layout.
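As a rough illustration of how small that surface is, the sketch below implements the four method names listed above for a simple SoA-style layout. It is a standalone toy, not a VTK subclass: the class name SoALayoutSketch and all signatures are assumptions made for this example and may differ from the actual VTK declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical SoA-style layout exposing the four methods named above.
// Signatures are illustrative guesses, not the VTK declarations.
template <class ScalarT>
class SoALayoutSketch
{
public:
  explicit SoALayoutSketch(int numComps)
    : Columns(static_cast<std::size_t>(numComps))
  {
  }

  // Storage hook: discard old contents and size every column to numTuples.
  bool AllocateTuples(std::size_t numTuples)
  {
    for (std::size_t c = 0; c < this->Columns.size(); ++c)
    {
      this->Columns[c].assign(numTuples, ScalarT());
    }
    return true;
  }

  // Storage hook: resize every column, preserving existing values.
  bool ReallocateTuples(std::size_t numTuples)
  {
    for (std::size_t c = 0; c < this->Columns.size(); ++c)
    {
      this->Columns[c].resize(numTuples, ScalarT());
    }
    return true;
  }

  // Access hook: reference to a single component of a tuple.
  ScalarT& GetComponentFast(std::size_t tuple, int comp)
  {
    assert(comp >= 0 && static_cast<std::size_t>(comp) < this->Columns.size());
    return this->Columns[static_cast<std::size_t>(comp)][tuple];
  }

  // Access hook: gather all components of a tuple into a caller buffer.
  void GetTupleFast(std::size_t tuple, ScalarT* out) const
  {
    for (std::size_t c = 0; c < this->Columns.size(); ++c)
    {
      out[c] = this->Columns[c][tuple];
    }
  }

private:
  std::vector<std::vector<ScalarT> > Columns; // one buffer per component
};
```

Everything else (type conversion, insertion, iteration) would be layered on top of these hooks by the generic base class rather than rewritten per layout.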

Template Macros

A common pattern in VTK is to use vtkTemplateMacro to dispatch to the appropriate data array type. To support accessing vtkGenericDataArray types, we have several macros. Instead of explaining what these are, let's just look at a few examples. The current implementation allows ScalarType to be const. It may be worthwhile to see if that's really useful in VTK; if not, we can opt for just 2 versions of the macro: vtkGenericDataArrayMacro and vtkGenericDataArrayMacro2.
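For readers unfamiliar with macro-based dispatch, the following standalone sketch shows one way a macro of this kind could be put together: it switches on a runtime type tag, downcasts to the concrete type, and exposes the names ARRAY_TYPE and ARRAY to the statement passed in. This is not the VTK implementation. ArrayBase, TypedArray, TypeTag, SKETCH_DISPATCH, FillWithIndex and DemoDispatch are all hypothetical names, and only two scalar types are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type-erased base with a runtime type tag.
struct ArrayBase
{
  virtual ~ArrayBase() {}
  virtual int GetTypeTag() const = 0;
};

template <class T> struct TypeTag;
template <> struct TypeTag<float> { static constexpr int value = 0; };
template <> struct TypeTag<int> { static constexpr int value = 1; };

// Hypothetical concrete array type.
template <class T>
struct TypedArray : ArrayBase
{
  typedef T ValueType;
  std::vector<T> Values;
  int GetTypeTag() const override { return TypeTag<T>::value; }
};

// One switch case: downcast, then run `call` with ARRAY_TYPE/ARRAY in scope.
#define SKETCH_DISPATCH_CASE(T, array, call)                                  \
  case TypeTag<T>::value:                                                     \
  {                                                                           \
    typedef TypedArray<T> ARRAY_TYPE;                                         \
    ARRAY_TYPE* ARRAY = static_cast<ARRAY_TYPE*>(array);                      \
    call;                                                                     \
  }                                                                           \
  break

// Dispatch over every supported scalar type (two in this sketch).
#define SKETCH_DISPATCH(array, call)                                          \
  switch ((array)->GetTypeTag())                                              \
  {                                                                           \
    SKETCH_DISPATCH_CASE(float, array, call);                                 \
    SKETCH_DISPATCH_CASE(int, array, call);                                   \
  }

// Worker template, instantiated once per concrete array type.
template <class ArrayType>
void FillWithIndex(ArrayType* array)
{
  for (std::size_t i = 0; i < array->Values.size(); ++i)
  {
    array->Values[i] = static_cast<typename ArrayType::ValueType>(i);
  }
}

// Usage: the caller only holds a base pointer.
inline void DemoDispatch(ArrayBase* array)
{
  SKETCH_DISPATCH(array, FillWithIndex<ARRAY_TYPE>(ARRAY));
}
```

The worker template is compiled separately for each concrete type, so the inner loop contains no virtual calls even though the caller starts from a base-class pointer.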

vtkWriteableGenericDataArrayMacro

<source lang="cpp">

  vtkDataArray* writeableArray = vtkAoSDataArrayTemplate<float>::New();
  template <class ArrayType>
  void fillValues(ArrayType* array)
  {
    int numcomps = array->GetNumberOfComponents();
    typename ArrayType::TupleIterator iter;
    for (iter = array->Begin(); iter != array->End(); ++iter)
      {
      for (int cc=0; cc < numcomps; ++cc)
        {
        iter[cc] = cc;
        }
      }
  }
  
  vtkWriteableGenericDataArrayMacro(writeableArray,
     fillValues<ARRAY_TYPE>(ARRAY);
  );

</source>

vtkConstGenericDataArrayMacro

<source lang="cpp">

 vtkDataArray* array = vtkSoADataArrayTemplate<const float>::New();
 ...  
 template <class ArrayType>
 void printValues(ArrayType* array)
 {
    int numcomps = array->GetNumberOfComponents();
    typename ArrayType::TupleIterator iter;
    for (iter = array->Begin(); iter != array->End(); ++iter)
      {
      for (int cc=0; cc < numcomps; ++cc)
        {
        cout << iter[cc] << endl;
        }
      }
 }
 vtkConstGenericDataArrayMacro(array,
   printValues<ARRAY_TYPE>(ARRAY);
 );

</source>

vtkGenericDataArrayMacro2

<source lang="cpp">

 vtkDataArray* inarray = vtkSoADataArrayTemplate<const float>::New();
 vtkDataArray* outarray = vtkAoSDataArrayTemplate<float>::New();
 
 template <class InArrayType, class OutArrayType>
 void Copy(InArrayType* inArray, OutArrayType* outArray)
 {
    ...
 }
 vtkGenericDataArrayMacro2(inarray, outarray,
   Copy<IN_ARRAY_TYPE, OUT_ARRAY_TYPE>(IN_ARRAY, OUT_ARRAY);
 );

</source>

Things to note

  1. vtkGenericDataArray does exactly what vtkMappedArray did to support GetVoidPointer: it creates a deep copy into a vtkAoSDataArrayTemplate and uses that. Code using GetVoidPointer should be updated to not use that API, period.
  2. The TupleIterator concept, as the name suggests, iterates over tuples, not values (or scalars in tuples). This has significant performance benefits since, in most cases, algorithms do iterate over tuples, which avoids a division and modulus operation for tuple-based memory layouts (e.g. SoA). For AoS layouts, we do incur the cost of a multiplication, but that's usually cheaper than division and modulus.
  3. vtkGenericDataArray still supports random access, since vtkGenericDataArray::Begin(vtkIdType pos) takes an optional argument: the tuple offset to seek to.
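The last two points can be illustrated with a small standalone tuple iterator over an AoS buffer. This is a sketch under stated assumptions, not the vtkGenericDataArrayTupleIterator implementation: the class AoSArraySketch and its members are hypothetical, and std::size_t stands in for vtkIdType.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical AoS array with a tuple iterator and a seekable Begin().
class AoSArraySketch
{
public:
  class TupleIterator
  {
  public:
    TupleIterator(AoSArraySketch* owner, std::size_t tuple)
      : Owner(owner), Tuple(tuple)
    {
    }

    // Component access: one multiplication, no division or modulus.
    float& operator[](int comp) const
    {
      return this->Owner->Data[this->Tuple * this->Owner->NumComps +
                               static_cast<std::size_t>(comp)];
    }

    // Advance to the next tuple.
    TupleIterator& operator++()
    {
      ++this->Tuple;
      return *this;
    }

    bool operator!=(const TupleIterator& other) const
    {
      return this->Tuple != other.Tuple;
    }

  private:
    AoSArraySketch* Owner;
    std::size_t Tuple;
  };

  AoSArraySketch(int numComps, std::size_t numTuples)
    : NumComps(static_cast<std::size_t>(numComps))
    , NumTuples(numTuples)
    , Data(static_cast<std::size_t>(numComps) * numTuples, 0.0f)
  {
  }

  // The optional tuple offset gives random access: Begin(pos) seeks to pos.
  TupleIterator Begin(std::size_t pos = 0)
  {
    assert(pos <= this->NumTuples);
    return TupleIterator(this, pos);
  }

  TupleIterator End() { return TupleIterator(this, this->NumTuples); }

private:
  std::size_t NumComps;
  std::size_t NumTuples;
  std::vector<float> Data; // x0 y0 z0 x1 y1 z1 ...
};
```

Because the iterator tracks a tuple index rather than a flat value index, `array.Begin(2)[1]` reaches component 1 of tuple 2 directly, and a plain `Begin()` to `End()` loop visits each tuple exactly once.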