U.S. Department of Energy
Office of Scientific and Technical Information

Bipartite graph partitioning and data clustering

Technical Report · DOI: https://doi.org/10.2172/816202 · OSTI ID: 816202
Many data types arising from data mining applications can be modeled as bipartite graphs; examples include terms and documents in a text corpus, customers and purchased items in market basket analysis, and reviewers and movies in a movie recommender system. In this paper, the authors propose a new data clustering method based on partitioning the underlying bipartite graph. The partition is constructed by minimizing a normalized sum of edge weights between unmatched pairs of vertices of the bipartite graph. They show that an approximate solution to the minimization problem can be obtained by computing a partial singular value decomposition (SVD) of the associated edge weight matrix of the bipartite graph. They point out the connection of their clustering algorithm to correspondence analysis used in multivariate analysis. They also briefly discuss the issue of assigning data objects to multiple clusters. In the experimental results, they apply their clustering algorithm to the problem of document clustering to illustrate its effectiveness and efficiency.
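The abstract does not spell out the normalization or the rule for turning singular vectors into clusters, so the following is only a minimal sketch of one common way such an SVD-based bipartition can be set up, not necessarily the report's exact procedure. It assumes the edge weight matrix is scaled by the inverse square roots of the row and column degrees, that the second singular vector pair carries the partition, and that vertices are split at zero. The function name `spectral_bipartition` and the toy matrix are hypothetical.

```python
import numpy as np


def spectral_bipartition(W):
    """Split rows and columns of a nonnegative edge weight matrix into two groups.

    Assumption (not taken from the abstract): W is scaled by the inverse
    square roots of its row and column sums, and the sign of the second
    singular vector pair decides the side of each vertex.
    """
    W = np.asarray(W, dtype=float)
    row_deg = W.sum(axis=1)  # weighted degree of each row vertex
    col_deg = W.sum(axis=0)  # weighted degree of each column vertex
    if np.any(row_deg == 0) or np.any(col_deg == 0):
        raise ValueError("every vertex needs at least one weighted edge")

    row_scale = 1.0 / np.sqrt(row_deg)
    col_scale = 1.0 / np.sqrt(col_deg)
    W_scaled = row_scale[:, None] * W * col_scale[None, :]

    # A full SVD is fine for a toy matrix; a large sparse matrix would
    # call for a partial SVD that computes only the leading vectors.
    U, s, Vt = np.linalg.svd(W_scaled, full_matrices=False)

    # The leading pair is trivial (square roots of the degrees), so the
    # second pair is used and mapped back through the same scaling.
    x = row_scale * U[:, 1]
    y = col_scale * Vt[1, :]
    return x >= 0, y >= 0


# Toy matrix (hypothetical): rows 0-1 mostly link to columns 0-1,
# rows 2-3 mostly link to columns 2-3, with weak cross edges.
W_example = np.array([
    [5.0, 4.0, 0.1, 0.0],
    [4.0, 5.0, 0.0, 0.1],
    [0.1, 0.0, 5.0, 4.0],
    [0.0, 0.1, 4.0, 5.0],
])
row_side, col_side = spectral_bipartition(W_example)
print(row_side, col_side)
```

In a document clustering setting, rows could stand for documents and columns for terms, so that each side of the split pairs a group of documents with the terms that characterize it. Which side is labeled `True` is arbitrary, because singular vectors are defined only up to a common sign.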
Research Organization:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Organization:
USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division (US)
DOE Contract Number:
AC03-76SF00098
OSTI ID:
816202
Report Number(s):
LBNL--47970
Country of Publication:
United States
Language:
English

Similar Records

Evolving bipartite authentication graph partitions
Journal Article · Sun Jan 15 23:00:00 EST 2017 · IEEE Transactions on Dependable and Secure Computing · OSTI ID:1351186

Genetic algorithms for graph partitioning and incremental graph partitioning
Book · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:87649

Efficient multiple-way graph partitioning algorithms
Conference · Thu Nov 30 23:00:00 EST 1995 · OSTI ID:125591