OSTI.GOV · U.S. Department of Energy, Office of Scientific and Technical Information

Title: Understanding the Hierarchy of Dense Subgraphs in Stationary and Temporally Varying Setting

Technical Report · DOI: https://doi.org/10.2172/1527314 · OSTI ID: 1527314
 [1];  [2]
  1. Univ. of Buffalo, NY (United States)
  2. Sandia National Lab. (SNL-CA), Livermore, CA (United States). Cyber Analytics and Data Science Dept.

Graphs are used to model relationships in a wide variety of domains, such as sociology, bioinformatics, infrastructure, and the WWW. One of the key observations is that while real-world graphs are often globally sparse, they are locally dense. In other words, the average degree is often quite small (say at most 10 in a million-vertex graph), but vertex neighborhoods are often dense. Finding dense subgraphs is a critical aspect of graph mining. It has been used for finding communities and spam link farms in web graphs, graph visualization, real-time story identification, DNA motif detection in biological networks, finding correlated genes, epilepsy prediction, finding price value motifs in financial data, graph compression, distance query indexing, and increasing the throughput of social networking site servers. However, most standard formulations of this problem (like clique, quasi-clique, k-densest subgraph) are NP-hard. Furthermore, current dense subgraph finding algorithms usually optimize some objective and find only a few such subgraphs without providing any structural relations among them. In practice, the goal is rarely to find the "true optimum," but rather to identify many (if not all) dense substructures, understand their distribution in the graph, and ideally determine the relationships among them. In this project, we first aim to devise algorithms and provide 3 implementations with nice visualizations to find the hierarchy between dense subgraphs, and then to understand the structure of the hierarchy to gain more insight into the hidden patterns in real-world networks. Another important aspect of graph analysis is the temporal nature of networks. Networks evolve over time, and in many applications data arrives at a high velocity, so it is important to design algorithms that can process data efficiently. We report three main results towards identifying dense structures in large evolving graphs.
First, we show how the hierarchical connectedness structure can be maintained efficiently, where connectedness is defined by increasing levels of connectivity strength. Next, we show how dense structures can be identified in bipartite graphs without building projection graphs. Finally, we present a new method for peeling algorithms. This new approach avoids the sequential nature of peeling algorithms and is amenable to parallelization, which is crucial for processing high-velocity data.
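
For readers unfamiliar with the term, the sequential nature of peeling can be illustrated with classic k-core decomposition. The abstract does not specify which peeling algorithms the report covers, so the sketch below is only an assumed, textbook example and not the report's method; the function name `core_numbers` and the edge-list input format are hypothetical choices made for illustration.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected graph given as an edge list.

    Illustrative textbook peeling (assumption: not the report's algorithm).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Each step depends on the degrees left by all previous removals,
        # which is what makes plain peeling inherently sequential.
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


# A 4-clique (vertices 0-3) with a short tail 3-4-5 attached.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
example_cores = core_numbers(example_edges)
```

In this toy graph the clique vertices receive a higher core number than the tail vertices, which gives a small-scale picture of a graph that is sparse overall yet contains a dense region. The linear scan for the minimum-degree vertex is kept for readability; a bucket-based queue is the usual choice for large graphs.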

Research Organization:
Sandia National Lab. (SNL-CA), Livermore, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC04-94AL85000; NA0003525
OSTI ID:
1527314
Report Number(s):
SAND-2017-9707R; 663025
Country of Publication:
United States
Language:
English

Similar Records

Finding Hierarchical and Overlapping Dense Subgraphs using Nucleus Decompositions
Technical Report · Sat Nov 01 00:00:00 EDT 2014 · OSTI ID:1527314

Incremental k-core decomposition: Algorithms and evaluation
Journal Article · Mon Feb 01 00:00:00 EST 2016 · The VLDB Journal · OSTI ID:1527314

Large-Scale Continuous Subgraph Queries on Streams
Conference · Wed Nov 30 00:00:00 EST 2011 · OSTI ID:1527314
