
Title: Continuous Time Group Discovery in Dynamic Graphs

Conference · OSTI ID: 1016298

With the rise in availability and importance of graphs and networks, it has become increasingly important to have good models to describe their behavior. While much work has focused on modeling static graphs, we focus on group discovery in dynamic graphs. We adapt a dynamic extension of Latent Dirichlet Allocation to this task and demonstrate good performance on two datasets.

Modeling relational data has become increasingly important in recent years. Much work has focused on static graphs, that is, fixed graphs at a single point in time. Here we focus on the problem of modeling dynamic (i.e., time-evolving) graphs. We propose a scalable Bayesian approach for community discovery in dynamic graphs. Our approach is based on extensions of Latent Dirichlet Allocation (LDA). LDA is a latent variable model for topic modeling in text corpora. It was extended to deal with topic changes in discrete time and later in continuous time. These models were referred to as the discrete Dynamic Topic Model (dDTM) and the continuous Dynamic Topic Model (cDTM), respectively. When adapting these models to graphs, we take our inspiration from LDA-G and SSN-LDA, applications of LDA to static graphs that have been shown to effectively factor out community structure to explain link patterns in graphs.

In this paper, we demonstrate how to adapt and apply the cDTM to the task of finding communities in dynamic networks. We use link prediction to measure the quality of the discovered community structure and apply it to two different relational datasets: DBLP author-keyword and CAIDA autonomous systems relationships. We also discuss a parallel implementation of this approach using Hadoop.

In Section 2, we review LDA and LDA-G. In Section 3, we review the cDTM and introduce cDTM-G, its adaptation to modeling dynamic graphs. We discuss inference for the cDTM-G and details of our parallel implementation in Section 4 and present its performance on two datasets in Section 5 before concluding in Section 6.

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
1016298
Report Number(s):
LLNL-CONF-461724; TRN: US201112%%288
Resource Relation:
Conference: Presented at the NIPS Workshop on Analyzing Networks and Learning with Graphs, Whistler, Canada, Dec 11, 2009
Country of Publication:
United States
Language:
English
