Streaming Statistics

RESOURCE

Abstract

In the context of a larger effort for in situ data analytics, there is a need to calculate basic statistical metrics (e.g., count, mean, median) online, as new data points become available. The code for such online (streaming) statistics was originally part of the TALASS (Topological Analysis of Large-Scale Simulations) library. We isolated the relevant code and created a standalone library from it called Streaming Statistics. We also added the ability to serialize and deserialize the statistics objects so that the library can be used in distributed, task-based processing. To use the Streaming Statistics library, the user chooses a statistic, constructs an object for it, and then "adds" values to it; each added value updates the statistic.
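
The usage pattern described in the abstract (construct a statistic object, add values to it, serialize it for transfer between tasks) can be illustrated with a minimal sketch. This is not the Streaming Statistics API: the class `StreamingMean`, its methods `add`, `serialize`, and `deserialize`, and the JSON wire format are all hypothetical names chosen for illustration, and the sketch is written in Python for brevity.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StreamingMean:
    """Hypothetical streaming statistic (not the library's class).

    Tracks the count and running mean in constant memory.
    """
    count: int = 0
    mean: float = 0.0

    def add(self, value: float) -> None:
        # Incremental update: mean_n = mean_{n-1} + (x - mean_{n-1}) / n
        self.count += 1
        self.mean += (value - self.mean) / self.count

    def serialize(self) -> str:
        # Assumed wire format: a JSON object holding the internal state.
        return json.dumps(asdict(self))

    @classmethod
    def deserialize(cls, payload: str) -> "StreamingMean":
        return cls(**json.loads(payload))


# One task accumulates a few values as they arrive...
stat = StreamingMean()
for x in [2.0, 4.0, 6.0]:
    stat.add(x)

# ...then the state is shipped to another task, which keeps adding values.
restored = StreamingMean.deserialize(stat.serialize())
restored.add(8.0)
print(restored.count, restored.mean)
```

Because only the count and the running mean are stored, the serialized state stays small no matter how many values have been added, which is what makes this kind of object convenient to pass between tasks.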
Developers:
Shudler, Sergei [1]; Bremer, Peer-Timo [1]
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Release Date:
2022-03-01
Project Type:
Open Source, Publicly Available Repository
Software Type:
Scientific
Version:
1
Licenses:
BSD 3-clause "New" or "Revised" License
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
Code ID:
72100
Site Accession Number:
LLNL-CODE-833061
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Country of Origin:
United States

Citation Formats

Shudler, Sergei, and Bremer, Peer-Timo. Streaming Statistics. Computer Software. https://github.com/LLNL/STREAMSTAT. USDOE National Nuclear Security Administration (NNSA). 01 Mar. 2022. Web. doi:10.11578/dc.20220324.4.
Shudler, Sergei, & Bremer, Peer-Timo. (2022, March 01). Streaming Statistics. [Computer software]. https://github.com/LLNL/STREAMSTAT. https://doi.org/10.11578/dc.20220324.4.
Shudler, Sergei, and Bremer, Peer-Timo. "Streaming Statistics." Computer software. March 01, 2022. https://github.com/LLNL/STREAMSTAT. https://doi.org/10.11578/dc.20220324.4.
@misc{ doecode_72100,
title = {Streaming Statistics},
author = {Shudler, Sergei and Bremer, Peer-Timo},
abstractNote = {In the context of a larger effort for in situ data analytics, there is a need to calculate basic statistics metrics (e.g., count, mean, median) online as new data points become available. Originally, the code for such online, or in other words streaming, statistics was part of the TALASS (Topological Analysis of Large-Scale Simulations) library. We isolated the relevant code and created a standalone library from it called Streaming Statistics. We also added an ability to serialize and deserialize the statistics objects so that the library can be used in distributed, task-based processing. To use the Streaming Statistics library, the user chooses a statistic, constructs an object for it, and then "adds" values to it, which means the statistic is augmented.},
doi = {10.11578/dc.20220324.4},
url = {https://doi.org/10.11578/dc.20220324.4},
howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20220324.4}},
year = {2022},
month = {mar}
}