
Title: The Network Completion Problem: Inferring Missing Nodes and Edges in Networks

Conference

Network structures, such as social networks, web graphs, and networks from systems biology, play important roles in many areas of science and in our everyday lives. To study these networks, one must first collect reliable large-scale network data. While social and information networks have become ubiquitous, the challenge of collecting complete network data persists. Collected network data is often incomplete, with nodes and edges missing. Commonly, only a part of the network can be observed, and we would like to infer the unobserved part. We address this issue by studying the Network Completion Problem: given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework: we use the observed part of the network to fit a model of network structure, then estimate the missing part of the network using the model, then re-estimate the parameters, and so on. We combine EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as inference about the missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state-of-the-art stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.
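
As a rough illustration of the Kronecker graphs model mentioned in the abstract, the sketch below builds edge probabilities as a repeated Kronecker product of a small initiator matrix and then samples one graph from them. It is only a minimal sketch of the generative model and does not cover the EM loop or the Metropolized Gibbs sampler. The function names and the 2x2 initiator values are hypothetical and chosen for illustration, not taken from the paper.

```python
import random

def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def kronecker_edge_probs(theta, k):
    """Return the k-th Kronecker power of the initiator matrix theta (k >= 1)."""
    probs = theta
    for _ in range(k - 1):
        probs = kron(probs, theta)
    return probs

def sample_graph(probs, rng):
    """Draw a directed graph: edge (u, v) is present with probability probs[u][v]."""
    n = len(probs)
    return [[1 if rng.random() < probs[u][v] else 0 for v in range(n)]
            for u in range(n)]

# Hypothetical initiator matrix; the values are illustrative assumptions.
theta = [[0.9, 0.5], [0.5, 0.2]]
probs = kronecker_edge_probs(theta, 3)       # 8 x 8 matrix of edge probabilities
adj = sample_graph(probs, random.Random(0))  # one sampled adjacency matrix
```

In this formulation, the probability of an edge between two nodes is the product of initiator entries selected by the digits of the two node indices, so a handful of parameters describes a graph of any size that is a power of the initiator dimension.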

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
1035599
Report Number(s):
LLNL-CONF-513839; TRN: US201205%%164
Resource Relation:
Conference: Presented at: SIAM International Conference on Data Mining (SDM), Mesa, AZ, United States, Apr 28 - Apr 30, 2011
Country of Publication:
United States
Language:
English

Similar Records

Accurate Characterization of Real Networks from Inaccurate Measurements
Technical Report · Sep 1, 2017 · OSTI ID:1035599

Topology Inference of Unknown Networks Based on Robust Virtual Coordinate Systems
Journal Article · Feb 1, 2019 · IEEE/ACM Transactions on Networking · OSTI ID:1035599

Making social networks more human: A topological approach
Journal Article · Jul 24, 2019 · Statistical Analysis and Data Mining · OSTI ID:1035599