### Wedge sampling for computing clustering coefficients and triangle counts on large graphs

Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Despite the importance of these triadic measures, algorithms to compute them can be extremely expensive. We discuss the method of wedge sampling. This versatile technique allows for the fast and accurate approximation of various types of clustering coefficients and triangle counts. Furthermore, these techniques are extensible to counting directed triangles in digraphs. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state of the art, while providing nearly the accuracy of full enumeration.
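The abstract does not spell out the procedure, so the following is only a minimal sketch of the general wedge-sampling idea for an undirected simple graph, not the authors' implementation. A wedge is a path of length two. The sketch picks a center vertex with probability proportional to its number of wedges, picks two distinct neighbors uniformly at random, and checks whether they are adjacent. The closed fraction estimates the global clustering coefficient, and multiplying it by the total wedge count and dividing by three estimates the triangle count. The function name `wedge_sample_clustering`, the adjacency-dict input format, and the `seed` parameter are assumptions made for illustration.

```python
import random


def wedge_sample_clustering(adj, k, seed=0):
    """Estimate (global clustering coefficient, triangle count) from k sampled wedges.

    adj: dict mapping each node to the set of its neighbours (undirected, no
         self-loops). Nodes are assumed sortable so that sampling is reproducible.
    k:   number of wedges to sample.
    """
    rng = random.Random(seed)

    # A vertex of degree d is the center of d*(d-1)/2 wedges.
    centers = [v for v in adj if len(adj[v]) >= 2]
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)
    if total_wedges == 0:
        return 0.0, 0.0

    closed = 0
    # Choosing centers in proportion to their wedge counts, then a uniform
    # pair of neighbours, yields a uniformly random wedge.
    for v in rng.choices(centers, weights=weights, k=k):
        a, b = rng.sample(sorted(adj[v]), 2)
        if b in adj[a]:  # the wedge a-v-b is closed, i.e. part of a triangle
            closed += 1

    kappa = closed / k
    # Each triangle contains exactly three closed wedges.
    return kappa, kappa * total_wedges / 3


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant vertex 3 attached to 0: 5 wedges, 3 closed.
    graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    print(wedge_sample_clustering(graph, k=10_000))
```

Because each sampled wedge is an independent 0/1 outcome, Hoeffding's inequality gives one standard way to choose the sample size: about ln(2/δ) / (2ε²) wedges suffice for an additive error of at most ε with probability at least 1 − δ, independent of the size of the graph.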

- Publication Date:

- Report Number(s):
- SAND2013-7623J; Journal ID: ISSN 1932-1864; 473877

- Grant/Contract Number:
- AC04-94AL85000

- Type:
- Accepted Manuscript

- Journal Name:
- Statistical Analysis and Data Mining

- Additional Journal Information:
- Journal Volume: 7; Journal Issue: 4; Journal ID: ISSN 1932-1864

- Publisher:
- Wiley

- Research Org:
- Sandia National Lab. (SNL-CA), Livermore, CA (United States)

- Sponsoring Org:
- USDOE National Nuclear Security Administration (NNSA)

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 97 MATHEMATICS AND COMPUTING; triangle counting; clustering coefficients; directed triangles; triangle characteristics; wedge sampling

- OSTI Identifier:
- 1110379

```
Seshadhri, C., Pinar, Ali, and Kolda, Tamara G. Wedge sampling for computing clustering coefficients and triangle counts on large graphs. United States: N. p.,
Web. doi:10.1002/sam.11224.
```

```
Seshadhri, C., Pinar, Ali, & Kolda, Tamara G. Wedge sampling for computing clustering coefficients and triangle counts on large graphs. United States. doi:10.1002/sam.11224.
```

```
Seshadhri, C., Pinar, Ali, and Kolda, Tamara G. 2014.
"Wedge sampling for computing clustering coefficients and triangle counts on large graphs". United States.
doi:10.1002/sam.11224. https://www.osti.gov/servlets/purl/1110379.
```

```
@article{osti_1110379,
  title = {Wedge sampling for computing clustering coefficients and triangle counts on large graphs},
  author = {Seshadhri, C. and Pinar, Ali and Kolda, Tamara G.},
  abstractNote = {Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Despite the importance of these triadic measures, algorithms to compute them can be extremely expensive. We discuss the method of wedge sampling. This versatile technique allows for the fast and accurate approximation of various types of clustering coefficients and triangle counts. Furthermore, these techniques are extensible to counting directed triangles in digraphs. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state of the art, while providing nearly the accuracy of full enumeration.},
  doi = {10.1002/sam.11224},
  journal = {Statistical Analysis and Data Mining},
  number = 4,
  volume = 7,
  place = {United States},
  year = {2014},
  month = {5}
}
```