Simulation INsight and Analysis

RESOURCE

Abstract

Sina is a tool set for modern scientific data management that provides flexible, light-weight support of non-bulk data capture for retention in and queries against SQL and noSQL data stores. HPC simulations traditionally maintain their data in files. Extracting data of interest for subsequent analysis then requires the time-consuming process of traversing directories and scraping data from files in a variety of formats. Sina facilitates capturing relevant data during execution or post-processing of simulation runs for retention in and queries from a modern data store. The tools are sufficiently general to allow for the inclusion of new fields as scientists learn more about their data. Libraries, currently in C++ and Python, and a command line interface (CLI) are provided. Sina's flexibility starts with a general schema, in JSON, for the collection of non-bulk simulation data. JSON provides a flexible, human-readable representation of the data of interest. Sina currently has a C++ library for simulations to write data to and read from a schema-compliant file for subsequent ingestion into one of the supported data stores. However, applications are free to write their data directly into a schema-compliant file. Python packages provide data ingestion, management, query, and export capabilities. A command line interface (CLI) provides simplified access to these features. A common application programming interface (API) is used to maintain and query data in any of the supported data stores, which are currently limited to SQL and Apache Cassandra (a column store). Tutorials, demonstrations, and examples illustrate aspects of the process using scripts and Jupyter notebooks.
Developers:
Pauli, Esteban [1]; Aschwanden, Pascal [1]; Laney, Daiel [1]; Dahlgren, Tamara [1]; Semler, Jessica [1]; Di Natale, Francesco [1]; Greco, Nathan [1]; Eklund, Joseph [1]; Haluska, Rebecca [1]
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Release Date:
2018-11-19
Project Type:
Open Source, Publicly Available Repository
Software Type:
Scientific
Licenses:
MIT License
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
Code ID:
27829
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Country of Origin:
United States

Citation Formats

Pauli, Esteban T., Aschwanden, Pascal D., Laney, Daiel E., Dahlgren, Tamara, Semler, Jessica A., Di Natale, Francesco, Greco, Nathan S., Eklund, Joseph L., and Haluska, Rebecca M. Simulation INsight and Analysis. Computer Software. https://github.com/LLNL/Sina. USDOE National Nuclear Security Administration (NNSA). 19 Nov. 2018. Web. doi:10.11578/dc.20190715.10.
Pauli, Esteban T., Aschwanden, Pascal D., Laney, Daiel E., Dahlgren, Tamara, Semler, Jessica A., Di Natale, Francesco, Greco, Nathan S., Eklund, Joseph L., & Haluska, Rebecca M. (2018, November 19). Simulation INsight and Analysis. [Computer software]. https://github.com/LLNL/Sina. https://doi.org/10.11578/dc.20190715.10.
Pauli, Esteban T., Aschwanden, Pascal D., Laney, Daiel E., Dahlgren, Tamara, Semler, Jessica A., Di Natale, Francesco, Greco, Nathan S., Eklund, Joseph L., and Haluska, Rebecca M. "Simulation INsight and Analysis." Computer software. November 19, 2018. https://github.com/LLNL/Sina. https://doi.org/10.11578/dc.20190715.10.
@misc{ doecode_27829,
title = {Simulation INsight and Analysis},
author = {Pauli, Esteban T. and Aschwanden, Pascal D. and Laney, Daiel E. and Dahlgren, Tamara and Semler, Jessica A. and Di Natale, Francesco and Greco, Nathan S. and Eklund, Joseph L. and Haluska, Rebecca M.},
abstractNote = {Sina is a tool set for modern scientific data management that provides flexible, light-weight support of non-bulk data capture for retention in and queries against SQL and noSQL data stores. HPC simulations traditionally maintain their data in files. Extracting data of interest for subsequent analysis then requires the time-consuming process of traversing directories and scraping data from files in a variety of formats. Sina facilitates capturing relevant data during execution or post-processing of simulation runs for retention in and queries from a modern data store. The tools are sufficiently general to allow for the inclusion of new fields as scientists learn more about their data. Libraries, currently in C++ and Python, and a command line interface (CLI) are provided. Sina's flexibility starts with a general schema, in JSON, for the collection of non-bulk simulation data. JSON provides a flexible, human-readable representation of the data of interest. Sina currently has a C++ library for simulations to write data to and read from a schema-compliant file for subsequent ingestion into one of the supported data stores. However, applications are free to write their data directly into a schema-compliant file. Python packages provide data ingestion, management, query, and export capabilities. A command line interface (CLI) provides simplified access to these features. A common application programming interface (API) is used to maintain and query data in any of the supported data stores, which are currently limited to SQL and Apache Cassandra (a column store). Tutorials, demonstrations, and examples illustrate aspects of the process using scripts and Jupyter notebooks.},
doi = {10.11578/dc.20190715.10},
url = {https://doi.org/10.11578/dc.20190715.10},
howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20190715.10}},
year = {2018},
month = {nov}
}