U.S. Department of Energy
Office of Scientific and Technical Information

Self-aggregation in scaled principal component space

Technical Report · DOI: https://doi.org/10.2172/820779 · OSTI ID: 820779
Automatically grouping large volumes of data into meaningful structures is a challenging task encountered across broad areas of science, engineering, and information processing. Such clustering is often performed in Euclidean space or in a subspace chosen by principal component analysis (PCA). Here we describe a space, obtained by a nonlinear scaling of PCA, in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure, and Internet newsgroups are analyzed to illustrate interesting properties of the space.
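The abstract does not spell out the scaling, so the following is only an illustrative sketch under stated assumptions, not the report's implementation. It assumes the data are summarized by a symmetric, nonnegative pairwise similarity matrix W, that the scaling divides W symmetrically by the square roots of its row sums, and that the embedding rescales the leading eigenvectors by the same factor. The function name scaled_pca_embedding and the toy matrix are hypothetical.

```python
import numpy as np

def scaled_pca_embedding(W, k):
    """Embed objects with the top-k eigenvectors of D^{-1/2} W D^{-1/2}.

    Assumption (not from the abstract): W is a symmetric, nonnegative
    similarity matrix whose row sums are all positive.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                          # row sums of the similarity matrix
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric scaling: S[i, j] = W[i, j] / sqrt(d[i] * d[j])
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    top = eigvecs[:, -k:][:, ::-1]             # k leading eigenvectors
    # Rescale each row by d^{-1/2} to obtain the embedding coordinates
    return d_inv_sqrt[:, None] * top

# Toy similarity matrix: two groups with no similarity between them
W = np.zeros((5, 5))
W[:3, :3] = 1.0   # objects 0-2 form one group
W[3:, 3:] = 1.0   # objects 3-4 form another
Q = scaled_pca_embedding(W, k=2)
print(np.round(Q, 3))
```

In this idealized case with no similarity between the groups, every object in a group lands on the same point of the embedding, and the two groups land on different points. That is one way to picture objects "self-aggregating" into clusters. With real data, where between-group similarities are small but nonzero, the points would presumably be near each other rather than exactly coincident.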
Research Organization:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Organization:
USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division (US)
DOE Contract Number:
AC03-76SF00098
OSTI ID:
820779
Report Number(s):
LBNL--49048
Country of Publication:
United States
Language:
English

Similar Records

Adaptive dimension reduction for clustering high dimensional data
Technical Report · October 1, 2002 · OSTI ID: 807420

Web document clustering using hyperlink structures
Technical Report · May 7, 2001 · OSTI ID: 815474

Detecting Combustion and Flow Features In Situ Using Principal Component Analysis
Technical Report · February 28, 2009 · OSTI ID: 1324759