Projection pursuit methods for data analysis
Multivariate analysis can be thought of as a methodology for the detection, description, and validation of structure in p-dimensional (p > 1) point clouds. Classical multivariate analysis relies on the assumption that the observations forming the point cloud(s) have a Gaussian distribution. All information about structure is then contained in the means and covariance matrices, and the well-known apparatus for estimation and inference in parametric families can be brought to bear. The uncomfortable ingredient in this approach is the Gaussianity assumption. The data may be Gaussian with occasional outliers, or the bulk of the data might simply not conform to a Gaussian distribution. Methods are discussed that do not involve any distributional assumptions. In this case, structure cannot be perceived by looking at a set of estimated parameters. An obvious remedy is to look at the data themselves, at the p-dimensional point cloud(s), and to base the description of structure on those views. Because perception in more than three dimensions is difficult, the dimensionality of the data first has to be reduced, most simply by projection. Projection of the data generally implies loss of information. As a consequence, multivariate structure does not usually show up in all projections, and no single projection might contain all the information. It is therefore important to choose judiciously the set of projections on which the model of the structure is to be based. This is the goal of projection pursuit procedures.
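The idea that structure shows up in some projections but not in others can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the report: the synthetic two-cluster data, the choice of absolute excess kurtosis as the projection index, the crude random search over directions, and the function names (`project`, `index_abs_excess_kurtosis`, `random_unit_vector`, `pursue`).

```python
import math
import random


def project(points, direction):
    """Project each p-dimensional point onto a unit direction vector."""
    return [sum(x * d for x, d in zip(pt, direction)) for pt in points]


def index_abs_excess_kurtosis(values):
    """Illustrative projection index (an assumption, not the report's):
    |excess kurtosis|, which is close to 0 for Gaussian-looking projections."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return abs(m4 / (m2 * m2) - 3.0)


def random_unit_vector(p, rng):
    """Draw a direction uniformly on the unit sphere in p dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def pursue(points, n_candidates, rng):
    """Crude search: score random directions and keep the best-scoring one."""
    p = len(points[0])
    best_dir, best_score = None, -1.0
    for _ in range(n_candidates):
        d = random_unit_vector(p, rng)
        score = index_abs_excess_kurtosis(project(points, d))
        if score > best_score:
            best_dir, best_score = d, score
    return best_dir, best_score


# Synthetic 4-dimensional cloud: two clusters separated along coordinate 0,
# pure Gaussian noise in the remaining coordinates.
rng = random.Random(0)
p = 4
points = []
for i in range(600):
    center = 3.0 if i % 2 == 0 else -3.0
    noise = [rng.gauss(0.0, 1.0) for _ in range(p - 1)]
    points.append([rng.gauss(center, 1.0)] + noise)

best_dir, best_score = pursue(points, 2000, rng)

# For comparison: a projection onto a pure-noise coordinate hides the clusters.
noise_score = index_abs_excess_kurtosis(project(points, [0.0, 1.0, 0.0, 0.0]))

print("best direction:", [round(c, 2) for c in best_dir])
print("index along best direction:", round(best_score, 3))
print("index along a noise coordinate:", round(noise_score, 3))
```

In this toy setting the direction with the highest index should lie mostly along the coordinate that separates the clusters, while the projection onto a noise coordinate scores near zero. A random search is used only to keep the sketch short; it is not meant to represent the search strategy or index of any particular projection pursuit procedure.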
- Research Organization: Stanford Linear Accelerator Center, CA (USA)
- DOE Contract Number: AC03-76SF00515
- OSTI ID: 6288977
- Report Number(s): SLAC-PUB-2768; ON: DE81027493
- Country of Publication: United States
- Language: English
Similar Records
Projection Pursuit Indices Based on the Empirical Distribution Function
Interpretable projection pursuit