vtkPCAStatistics Class Reference

#include <vtkPCAStatistics.h>



Detailed Description

A class for principal component analysis.

This class derives from the multi-correlative statistics algorithm and uses the covariance matrix and Cholesky decomposition computed by it. However, when it finalizes the statistics in Learn mode, the PCA class computes the SVD of the covariance matrix in order to obtain its eigenvectors.
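Because a covariance matrix is symmetric and positive semidefinite, its singular values coincide with its eigenvalues, which is why an SVD can be used to obtain the eigenvectors. The following is a minimal, VTK-independent sketch of that decomposition for the 2x2 case only; the names Eigen2 and SymmetricEigen2 are hypothetical and are not part of this class.

```cpp
#include <array>
#include <cmath>

// Hypothetical result type for this sketch; not part of the VTK API.
struct Eigen2
{
  std::array<double, 2> values;                 // eigenvalues, largest first
  std::array<std::array<double, 2>, 2> vectors; // vectors[k] is the unit eigenvector for values[k]
};

// Closed-form eigen-decomposition of the symmetric matrix [[a, b], [b, c]].
Eigen2 SymmetricEigen2(double a, double b, double c)
{
  const double mean = 0.5 * (a + c);
  const double radius = std::hypot(0.5 * (a - c), b);
  Eigen2 result;
  result.values = {mean + radius, mean - radius};
  for (int k = 0; k < 2; ++k)
  {
    // When b != 0, (b, lambda - a) solves (A - lambda I) v = 0.
    double x = b;
    double y = result.values[k] - a;
    if (b == 0.0)
    {
      // Diagonal matrix: the eigenvectors are the coordinate axes.
      const bool firstAxis = (k == 0) == (a >= c);
      x = firstAxis ? 1.0 : 0.0;
      y = firstAxis ? 0.0 : 1.0;
    }
    const double norm = std::hypot(x, y);
    result.vectors[k] = {x / norm, y / norm};
  }
  return result;
}
```

For the matrix [[2, 1], [1, 2]] this yields the eigenvalues 3 and 1, with eigenvectors along (1, 1) and (1, -1) respectively.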

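In the assess mode, the input data are projected onto the basis defined by the eigenvectors; SetBasisScheme() controls how many basis vectors are used.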

Thanks:
Thanks to David Thompson, Philippe Pebay and Jackson Mayo from Sandia National Laboratories for implementing this class.
Examples:
vtkPCAStatistics (Examples)
Tests:
vtkPCAStatistics (Tests)

Definition at line 57 of file vtkPCAStatistics.h.


Public Types

typedef vtkMultiCorrelativeStatistics Superclass
enum  NormalizationType {
  NONE, TRIANGLE_SPECIFIED, DIAGONAL_SPECIFIED, DIAGONAL_VARIANCE,
  NUM_NORMALIZATION_SCHEMES
}
enum  ProjectionType { FULL_BASIS, FIXED_BASIS_SIZE, FIXED_BASIS_ENERGY, NUM_BASIS_SCHEMES }

Public Member Functions

virtual const char * GetClassName ()
virtual int IsA (const char *type)
virtual void PrintSelf (ostream &os, vtkIndent indent)
virtual void SetNormalizationScheme (int)
virtual int GetNormalizationScheme ()
virtual void SetNormalizationSchemeByName (const char *sname)
virtual const char * GetNormalizationSchemeName (int scheme)
virtual vtkTable * GetSpecifiedNormalization ()
virtual void SetSpecifiedNormalization (vtkTable *)
void GetEigenvalues (int request, vtkDoubleArray *)
void GetEigenvalues (vtkDoubleArray *)
double GetEigenvalue (int request, int i)
double GetEigenvalue (int i)
void GetEigenvectors (int request, vtkDoubleArray *eigenvectors)
void GetEigenvectors (vtkDoubleArray *eigenvectors)
void GetEigenvector (int i, vtkDoubleArray *eigenvector)
void GetEigenvector (int request, int i, vtkDoubleArray *eigenvector)
virtual void SetBasisScheme (int)
virtual int GetBasisScheme ()
virtual const char * GetBasisSchemeName (int schemeIndex)
virtual void SetBasisSchemeByName (const char *schemeName)
virtual void SetFixedBasisSize (int)
virtual int GetFixedBasisSize ()
virtual void SetFixedBasisEnergy (double)
virtual double GetFixedBasisEnergy ()
virtual bool SetParameter (const char *parameter, int index, vtkVariant value)

Static Public Member Functions

static int IsTypeOf (const char *type)
static vtkPCAStatistics * SafeDownCast (vtkObject *o)
static vtkPCAStatistics * New ()

Protected Member Functions

 vtkPCAStatistics ()
 ~vtkPCAStatistics ()
virtual int FillInputPortInformation (int port, vtkInformation *info)
virtual void Derive (vtkMultiBlockDataSet *inMeta)
virtual void Test (vtkTable *, vtkMultiBlockDataSet *, vtkTable *)
virtual void Assess (vtkTable *, vtkMultiBlockDataSet *, vtkTable *)
virtual void SelectAssessFunctor (vtkTable *inData, vtkDataObject *inMeta, vtkStringArray *rowNames, AssessFunctor *&dfunc)

Protected Attributes

int NormalizationScheme
int BasisScheme
int FixedBasisSize
double FixedBasisEnergy

Static Protected Attributes

static const char * BasisSchemeEnumNames [NUM_BASIS_SCHEMES+1]
static const char * NormalizationSchemeEnumNames [NUM_NORMALIZATION_SCHEMES+1]

Member Typedef Documentation

typedef vtkMultiCorrelativeStatistics vtkPCAStatistics::Superclass

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

Definition at line 60 of file vtkPCAStatistics.h.


Member Enumeration Documentation

enum vtkPCAStatistics::NormalizationType

Methods by which the covariance matrix may be normalized.

Enumerator:
NONE  The covariance matrix should be used as computed.
TRIANGLE_SPECIFIED  Normalize cov(i,j) by V(i,j) where V is supplied by the user.
DIAGONAL_SPECIFIED  Normalize cov(i,j) by sqrt(V(i)*V(j)) where V is supplied by the user.
DIAGONAL_VARIANCE  Normalize cov(i,j) by sqrt(cov(i,i)*cov(j,j)).
NUM_NORMALIZATION_SCHEMES  The number of normalization schemes.

Definition at line 67 of file vtkPCAStatistics.h.

enum vtkPCAStatistics::ProjectionType

These are the enumeration values that SetBasisScheme() accepts and GetBasisScheme() returns.

Enumerator:
FULL_BASIS  Use all entries in the basis matrix.
FIXED_BASIS_SIZE  Use the first N entries in the basis matrix.
FIXED_BASIS_ENERGY  Use consecutive basis matrix entries whose energies sum to at least T.
NUM_BASIS_SCHEMES  The number of schemes (not a valid scheme).

Definition at line 80 of file vtkPCAStatistics.h.


Constructor & Destructor Documentation

vtkPCAStatistics::vtkPCAStatistics (  )  [protected]

vtkPCAStatistics::~vtkPCAStatistics (  )  [protected]


Member Function Documentation

virtual const char* vtkPCAStatistics::GetClassName (  )  [virtual]

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

static int vtkPCAStatistics::IsTypeOf ( const char *  name  )  [static]

Return 1 if this class type is the same type of (or a subclass of) the named class. Returns 0 otherwise. This method works in combination with vtkTypeMacro found in vtkSetGet.h.

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

virtual int vtkPCAStatistics::IsA ( const char *  name  )  [virtual]

Return 1 if this class is the same type of (or a subclass of) the named class. Returns 0 otherwise. This method works in combination with vtkTypeMacro found in vtkSetGet.h.

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

static vtkPCAStatistics* vtkPCAStatistics::SafeDownCast ( vtkObject *  o  )  [static]

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

virtual void vtkPCAStatistics::PrintSelf ( ostream &  os,
vtkIndent  indent 
) [virtual]

Methods invoked by print to print information about the object including superclasses. Typically not called by the user (use Print() instead) but used in the hierarchical print process to combine the output of several classes.

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

static vtkPCAStatistics* vtkPCAStatistics::New (  )  [static]

Create an object with Debug turned off, modified time initialized to zero, and reference counting on.

Reimplemented from vtkMultiCorrelativeStatistics.

Reimplemented in vtkPPCAStatistics.

virtual void vtkPCAStatistics::SetNormalizationScheme ( int   )  [virtual]

This determines how (or if) the covariance matrix cov is normalized before PCA.

When set to NONE, no normalization is performed. This is the default.

When set to TRIANGLE_SPECIFIED, each entry cov(i,j) is divided by V(i,j). The list V of normalization factors must be set using the SetSpecifiedNormalization() method before the filter is executed.

When set to DIAGONAL_SPECIFIED, each entry cov(i,j) is divided by sqrt(V(i)*V(j)). The list V of normalization factors must be set using the SetSpecifiedNormalization() method before the filter is executed.

When set to DIAGONAL_VARIANCE, each entry cov(i,j) is divided by sqrt(cov(i,i)*cov(j,j)).

Warning: Although DIAGONAL_VARIANCE is accepted practice in some fields, some people think you should not turn this option on unless there is a good physically-based reason for doing so. It is much better to determine how component magnitudes should be compared using physical reasoning and use DIAGONAL_SPECIFIED or TRIANGLE_SPECIFIED, or to perform some pre-processing to shift and scale input data columns appropriately, than to expect magical results from a shady normalization hack.
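The four normalization schemes can be illustrated with a small, VTK-independent sketch. The names Matrix, Scheme and NormalizeCovariance are hypothetical stand-ins, and storing the diagonal factors V(i) as v[i][i] is an assumption made here for brevity; this is not the filter's actual implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for this sketch; not part of the VTK API.
using Matrix = std::vector<std::vector<double>>;

enum class Scheme { None, TriangleSpecified, DiagonalSpecified, DiagonalVariance };

// Return a normalized copy of the symmetric covariance matrix `cov`.
// `v` holds the user-supplied factors: all entries are read for
// TriangleSpecified, only the diagonal entries v[i][i] for
// DiagonalSpecified. The other two schemes ignore it.
Matrix NormalizeCovariance(const Matrix& cov, Scheme scheme, const Matrix& v = {})
{
  Matrix out = cov;
  const std::size_t d = cov.size();
  for (std::size_t i = 0; i < d; ++i)
  {
    for (std::size_t j = 0; j < d; ++j)
    {
      switch (scheme)
      {
        case Scheme::None:
          break; // use the covariance matrix as computed
        case Scheme::TriangleSpecified:
          out[i][j] = cov[i][j] / v[i][j];
          break;
        case Scheme::DiagonalSpecified:
          out[i][j] = cov[i][j] / std::sqrt(v[i][i] * v[j][j]);
          break;
        case Scheme::DiagonalVariance:
          out[i][j] = cov[i][j] / std::sqrt(cov[i][i] * cov[j][j]);
          break;
      }
    }
  }
  return out;
}
```

Under the DiagonalVariance branch every diagonal entry becomes 1, so the result is the correlation matrix of the input columns.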

virtual int vtkPCAStatistics::GetNormalizationScheme (  )  [virtual]

Return the current normalization scheme. See SetNormalizationScheme() for the available schemes and their effect on the covariance matrix.

virtual void vtkPCAStatistics::SetNormalizationSchemeByName ( const char *  sname  )  [virtual]

Set the normalization scheme by name rather than by enumeration value. See SetNormalizationScheme() for the available schemes and their effect on the covariance matrix.

virtual const char* vtkPCAStatistics::GetNormalizationSchemeName ( int  scheme  )  [virtual]

Return the name of the normalization scheme with the given index. See SetNormalizationScheme() for the available schemes and their effect on the covariance matrix.

virtual vtkTable* vtkPCAStatistics::GetSpecifiedNormalization (  )  [virtual]

These methods allow you to set/get values used to normalize the covariance matrix before PCA. The normalization values apply to all requests, so you do not specify a single vector but a 3-column table.

The first two columns contain the names of columns from input 0 and the third column contains the value to normalize the corresponding entry in the covariance matrix. The table must always have 3 columns even when the NormalizationScheme is DIAGONAL_SPECIFIED. When only diagonal entries are to be used, only table rows where the first two columns are identical to one another will be employed. If there are multiple rows specifying different values for the same pair of columns, the entry nearest the bottom of the table takes precedence.

These functions are actually convenience methods that set/get the fourth input (port 3) of the filter. Because the table is an input of the filter, you may use other filters to produce a table of normalizations and have the pipeline take care of updates.

Any missing entries will be set to 1.0 and a warning issued. An error will occur if this input is not set and the NormalizationScheme is DIAGONAL_SPECIFIED or TRIANGLE_SPECIFIED.
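The table rules above (bottom-most row wins, missing entries default to 1.0) can be sketched without VTK as follows. NormalizationRow, ResolveFactors and LookupFactor are hypothetical names, and treating the pairs (A, B) and (B, A) as distinct keys is an assumption of this sketch rather than documented behavior.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type standing in for one row of the 3-column table.
struct NormalizationRow
{
  std::string columnA;
  std::string columnB;
  double factor;
};

using FactorMap = std::map<std::pair<std::string, std::string>, double>;

// Later rows overwrite earlier ones, so the row nearest the bottom wins.
FactorMap ResolveFactors(const std::vector<NormalizationRow>& table)
{
  FactorMap factors;
  for (const NormalizationRow& row : table)
  {
    factors[std::make_pair(row.columnA, row.columnB)] = row.factor;
  }
  return factors;
}

// Missing entries fall back to 1.0 (the filter additionally issues a warning).
double LookupFactor(const FactorMap& factors, const std::string& a, const std::string& b)
{
  const auto it = factors.find(std::make_pair(a, b));
  return it == factors.end() ? 1.0 : it->second;
}
```

With rows ("x", "y", 2.0), ("x", "x", 4.0), ("x", "y", 5.0), looking up ("x", "y") gives 5.0 because the last matching row takes precedence, and looking up the absent pair ("y", "y") gives 1.0.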

virtual void vtkPCAStatistics::SetSpecifiedNormalization ( vtkTable *    )  [virtual]

Set the table of values used to normalize the covariance matrix before PCA. See GetSpecifiedNormalization() for the table layout and the rules that apply to it.

void vtkPCAStatistics::GetEigenvalues ( int  request,
vtkDoubleArray *  
)

Get the eigenvalues. This overload, void GetEigenvalues(int request, vtkDoubleArray*), does all of the work; the other eigenvalue functions simply call it with the appropriate parameters. These functions are not valid unless Update() has been called and the Derive option is turned on.

void vtkPCAStatistics::GetEigenvalues ( vtkDoubleArray *    ) 

Convenience overload of GetEigenvalues(int request, vtkDoubleArray*). See that function for details and for the conditions under which the result is valid.

double vtkPCAStatistics::GetEigenvalue ( int  request,
int  i 
)

Get a single eigenvalue. Convenience wrapper around GetEigenvalues(int request, vtkDoubleArray*); see that function for the conditions under which the result is valid.

double vtkPCAStatistics::GetEigenvalue ( int  i  ) 

Get a single eigenvalue. Convenience wrapper around GetEigenvalues(int request, vtkDoubleArray*); see that function for the conditions under which the result is valid.

void vtkPCAStatistics::GetEigenvectors ( int  request,
vtkDoubleArray *  eigenvectors 
)

Get the eigenvectors. This function: void GetEigenvectors(int request, vtkDoubleArray* eigenvectors) does all of the work. The other functions are convenience functions that call this function with default arguments. These functions are not valid unless Update() has been called and the Derive option is turned on.

void vtkPCAStatistics::GetEigenvectors ( vtkDoubleArray *  eigenvectors  ) 

Convenience overload of GetEigenvectors(int request, vtkDoubleArray* eigenvectors). See that function for details and for the conditions under which the result is valid.

void vtkPCAStatistics::GetEigenvector ( int  i,
vtkDoubleArray *  eigenvector 
)

Get a single eigenvector. Convenience wrapper around GetEigenvectors(int request, vtkDoubleArray* eigenvectors); see that function for the conditions under which the result is valid.

void vtkPCAStatistics::GetEigenvector ( int  request,
int  i,
vtkDoubleArray *  eigenvector 
)

Get a single eigenvector. Convenience wrapper around GetEigenvectors(int request, vtkDoubleArray* eigenvectors); see that function for the conditions under which the result is valid.
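Once the eigenvectors are available, new coordinates for a tuple are obtained by taking its dot product with each retained basis vector. The sketch below is VTK-independent; ProjectTuple is a hypothetical helper, and subtracting the column means before projecting is an assumption of this sketch, not something this page states.

```cpp
#include <cstddef>
#include <vector>

// Project `tuple` onto the first `count` rows of `basis`
// (one eigenvector per row). Hypothetical helper; centering the
// tuple on `mean` is an assumption of this sketch.
std::vector<double> ProjectTuple(const std::vector<std::vector<double>>& basis,
                                 const std::vector<double>& mean,
                                 const std::vector<double>& tuple,
                                 std::size_t count)
{
  std::vector<double> coords(count, 0.0);
  for (std::size_t k = 0; k < count; ++k)
  {
    for (std::size_t j = 0; j < tuple.size(); ++j)
    {
      coords[k] += basis[k][j] * (tuple[j] - mean[j]);
    }
  }
  return coords;
}
```

Passing a count smaller than the number of input columns yields a lower-dimensional projection; passing the full count is a change of basis.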

virtual void vtkPCAStatistics::SetBasisScheme ( int   )  [virtual]

This variable controls the dimensionality of output tuples in Assess mode. Consider the case where you have requested a PCA on D columns.

When set to vtkPCAStatistics::FULL_BASIS, the entire set of basis vectors is used to derive new coordinates for each tuple being assessed. In this mode, you are guaranteed to have output tuples of the same dimension as the input tuples. (That dimension is D, so there will be D additional columns added to the table for the request.)

When set to vtkPCAStatistics::FIXED_BASIS_SIZE, only the first N basis vectors are used to derive new coordinates for each tuple being assessed. In this mode, you are guaranteed to have output tuples of dimension min(N,D). You must set N prior to assessing data using the SetFixedBasisSize() method. When N < D, this turns the PCA into a projection (instead of change of basis).

When set to vtkPCAStatistics::FIXED_BASIS_ENERGY, the number of basis vectors used to derive new coordinates for each tuple will be the minimum number of columns N that satisfy

\[ \frac{\sum_{i=1}^{N} \lambda_i}{\sum_{i=1}^{D} \lambda_i} \geq T \]

You must set T prior to assessing data using the SetFixedBasisEnergy() method. When T < 1, this turns the PCA into a projection (instead of change of basis).

By default BasisScheme is set to vtkPCAStatistics::FULL_BASIS.
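How the three basis schemes determine the number of retained basis vectors can be sketched as follows. This is a VTK-independent illustration: Basis and BasisVectorCount are hypothetical names, the eigenvalues are assumed to be sorted in decreasing order, and the energy criterion is read as "cumulative energy of at least T", following the FIXED_BASIS_ENERGY enumerator description.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical mirror of the ProjectionType enum for this sketch.
enum class Basis { Full, FixedSize, FixedEnergy };

// Return how many leading basis vectors to keep.
// `eigenvalues` is assumed sorted in decreasing order; `n` plays the
// role of FixedBasisSize and `t` the role of FixedBasisEnergy.
std::size_t BasisVectorCount(const std::vector<double>& eigenvalues,
                             Basis scheme, int n, double t)
{
  const std::size_t d = eigenvalues.size();
  if (scheme == Basis::FixedSize)
  {
    // n <= 0 falls back to the full basis, as documented for FixedBasisSize.
    return n <= 0 ? d : std::min(static_cast<std::size_t>(n), d);
  }
  if (scheme == Basis::FixedEnergy)
  {
    double total = 0.0;
    for (double ev : eigenvalues)
    {
      total += ev;
    }
    double partial = 0.0;
    for (std::size_t k = 0; k < d; ++k)
    {
      partial += eigenvalues[k];
      if (partial >= t * total)
      {
        return k + 1; // smallest N whose energy fraction reaches t
      }
    }
  }
  return d; // Full basis, or the threshold was never reached
}
```

With eigenvalues 6, 3 and 1, a threshold of 0.8 keeps two vectors, since the first alone carries 60% of the energy and the first two carry 90%.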

virtual int vtkPCAStatistics::GetBasisScheme (  )  [virtual]

Return the current basis scheme. See SetBasisScheme() for the available schemes and how each one determines the dimensionality of output tuples in Assess mode.

virtual const char* vtkPCAStatistics::GetBasisSchemeName ( int  schemeIndex  )  [virtual]

Return the name of the basis scheme with the given index. See SetBasisScheme() for the available schemes and how each one determines the dimensionality of output tuples in Assess mode.

virtual void vtkPCAStatistics::SetBasisSchemeByName ( const char *  schemeName  )  [virtual]

Set the basis scheme by name rather than by enumeration value. See SetBasisScheme() for the available schemes and how each one determines the dimensionality of output tuples in Assess mode.

virtual void vtkPCAStatistics::SetFixedBasisSize ( int   )  [virtual]

The number of basis vectors to use. See SetBasisScheme() for more information. When FixedBasisSize <= 0 (the default), the fixed basis size scheme is equivalent to the full basis scheme.

virtual int vtkPCAStatistics::GetFixedBasisSize (  )  [virtual]

The number of basis vectors to use. See SetBasisScheme() for more information. When FixedBasisSize <= 0 (the default), the fixed basis size scheme is equivalent to the full basis scheme.

virtual void vtkPCAStatistics::SetFixedBasisEnergy ( double   )  [virtual]

The minimum energy the new basis should use, as a fraction. See SetBasisScheme() for more information. When FixedBasisEnergy >= 1 (the default), the fixed basis energy scheme is equivalent to the full basis scheme.

virtual double vtkPCAStatistics::GetFixedBasisEnergy (  )  [virtual]

The minimum energy the new basis should use, as a fraction. See SetBasisScheme() for more information. When FixedBasisEnergy >= 1 (the default), the fixed basis energy scheme is equivalent to the full basis scheme.

virtual bool vtkPCAStatistics::SetParameter ( const char *  parameter,
int  index,
vtkVariant  value 
) [virtual]

A convenience method (in particular for access from other applications) to set parameter values. Returns true if the requested parameter was set, false otherwise.

Reimplemented from vtkStatisticsAlgorithm.

virtual int vtkPCAStatistics::FillInputPortInformation ( int  port,
vtkInformation *  info 
) [protected, virtual]

This algorithm accepts a vtkTable containing normalization values for its fourth input (port 3). We override FillInputPortInformation to indicate this.

Reimplemented from vtkStatisticsAlgorithm.

virtual void vtkPCAStatistics::Derive ( vtkMultiBlockDataSet *  inMeta  )  [protected, virtual]

Execute the calculations required by the Derive option.

Reimplemented from vtkMultiCorrelativeStatistics.

virtual void vtkPCAStatistics::Test ( vtkTable *  ,
vtkMultiBlockDataSet *  ,
vtkTable *  
) [protected, virtual]

Execute the calculations required by the Test option.

Reimplemented from vtkMultiCorrelativeStatistics.

virtual void vtkPCAStatistics::Assess ( vtkTable *  ,
vtkMultiBlockDataSet *  ,
vtkTable *  
) [protected, virtual]

Execute the calculations required by the Assess option.

Reimplemented from vtkMultiCorrelativeStatistics.

virtual void vtkPCAStatistics::SelectAssessFunctor ( vtkTable *  inData,
vtkDataObject *  inMeta,
vtkStringArray *  rowNames,
AssessFunctor *&  dfunc 
) [protected, virtual]

Provide the appropriate assessment functor.

Reimplemented from vtkMultiCorrelativeStatistics.


Member Data Documentation

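int vtkPCAStatistics::NormalizationScheme [protected]

Definition at line 252 of file vtkPCAStatistics.h.

int vtkPCAStatistics::BasisScheme [protected]

Definition at line 253 of file vtkPCAStatistics.h.

int vtkPCAStatistics::FixedBasisSize [protected]

Definition at line 254 of file vtkPCAStatistics.h.

double vtkPCAStatistics::FixedBasisEnergy [protected]

Definition at line 255 of file vtkPCAStatistics.h.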

const char* vtkPCAStatistics::BasisSchemeEnumNames[NUM_BASIS_SCHEMES+1] [static, protected]

Definition at line 258 of file vtkPCAStatistics.h.

const char* vtkPCAStatistics::NormalizationSchemeEnumNames[NUM_NORMALIZATION_SCHEMES+1] [static, protected]

Definition at line 259 of file vtkPCAStatistics.h.


The documentation for this class was generated from the following file:

vtkPCAStatistics.h

Generated on Wed Aug 24 11:54:12 2011 for VTK by  doxygen 1.5.6