SciTech Connect

Title: Multiresolution persistent homology for excessively large biomolecular datasets

Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large-scale datasets with appropriate resolution. We utilize the flexibility-rigidity index to assess the topological connectivity of the dataset and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated on a hexagonal fractal image that has three distinct scales. We further demonstrate the proposed method by extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed, which would otherwise be inaccessible to the normal point cloud method and unreliable using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is, to our knowledge, the first time that persistent homology has been used for practical protein domain analysis. The proposed multiresolution topological method has potential applications in arbitrary datasets, such as social networks, biological networks, and graphs.
Authors:
;  [1] ;  [1] ;  [2] ;  [2]
  1. Department of Mathematics, Michigan State University, East Lansing, Michigan 48824 (United States)
  2. (United States)
Publication Date:
OSTI Identifier:
22489667
Resource Type:
Journal Article
Resource Relation:
Journal Name: Journal of Chemical Physics; Journal Volume: 143; Journal Issue: 13; Other Information: (c) 2015 AIP Publishing LLC; Country of input: International Atomic Energy Agency (IAEA)
Country of Publication:
United States
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL AND ANALYTICAL CHEMISTRY; ATOMS; DATASETS; DENSITY; DNA; MOLECULES; PROTEINS