SciTech Connect

Title: Wedge sampling for computing clustering coefficients and triangle counts on large graphs

Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Despite the importance of these triadic measures, algorithms to compute them can be extremely expensive. We discuss the method of wedge sampling. This versatile technique allows for the fast and accurate approximation of various types of clustering coefficients and triangle counts. Furthermore, these techniques are extensible to counting directed triangles in digraphs. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state of the art, while providing nearly the accuracy of full enumeration.
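
As an illustration of the wedge sampling idea described in the abstract, the sketch below estimates the global clustering coefficient (the fraction of wedges, i.e. paths of length two, that are closed into triangles) and a triangle count for an undirected simple graph. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `wedge_sample_estimates`, the adjacency-dict input format, and the `seed` parameter are hypothetical choices made for this example.

```python
import random


def wedge_sample_estimates(adj, num_samples, seed=0):
    """Estimate (global clustering coefficient, triangle count).

    adj: dict mapping each vertex to the set of its neighbors
         (assumed undirected, no self-loops, no multi-edges).
    num_samples: number of wedges to sample.
    """
    rng = random.Random(seed)

    # Only vertices of degree >= 2 can be the center of a wedge.
    centers = [v for v in adj if len(adj[v]) >= 2]
    # A vertex of degree d is the center of d*(d-1)/2 wedges.
    wedge_counts = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(wedge_counts)
    if total_wedges == 0:
        return 0.0, 0.0

    # Neighbor lists so that two distinct neighbors can be drawn.
    nbrs = {v: sorted(adj[v]) for v in centers}

    # Choosing a center with probability proportional to its wedge count,
    # then a uniform pair of its neighbors, yields a uniform random wedge.
    sampled_centers = rng.choices(centers, weights=wedge_counts, k=num_samples)
    closed = 0
    for v in sampled_centers:
        a, b = rng.sample(nbrs[v], 2)
        if b in adj[a]:  # the wedge a-v-b is closed if edge (a, b) exists
            closed += 1

    # Fraction of closed wedges estimates the clustering coefficient.
    kappa = closed / num_samples
    # Each triangle contains exactly three closed wedges.
    triangles = kappa * total_wedges / 3
    return kappa, triangles
```

The cost of the sampling loop depends on the number of samples rather than on the number of triangles in the graph, which is the source of the time-approximation tradeoff: more samples give a tighter estimate at proportionally higher cost.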
 [1] ;  [1] ;  [1]
  1. Sandia National Lab. (SNL-CA), Livermore, CA (United States)
Publication Date:
OSTI Identifier:
Report Number(s):
Journal ID: ISSN 1932-1864; 473877
Grant/Contract Number:
Accepted Manuscript
Journal Name:
Statistical Analysis and Data Mining
Additional Journal Information:
Journal Volume: 7; Journal Issue: 4; Journal ID: ISSN 1932-1864
Research Org:
Sandia National Laboratories (SNL-CA), Livermore, CA (United States)
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA)
Country of Publication:
United States
Subject:
97 MATHEMATICS AND COMPUTING; triangle counting; clustering coefficients; directed triangles; triangle characteristics; wedge sampling