OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Choosing the best partition of the output from a large-scale simulation

Abstract

Data partitioning becomes necessary when a large-scale simulation produces more data than can feasibly be stored. The goal is to partition the data, typically so that every element belongs to exactly one partition, and to store summary information about each partition: either a representative value plus an estimate of the error, or a distribution. Once the partitions are determined and the summary information is stored, the raw data is discarded. This process can be performed in situ, meaning while the simulation is running. When creating the partitions, researchers must make many decisions: for instance, how to determine when an adequate number of partitions has been created, how to divide the data to form the partitions, and how many variables to consider simultaneously. In addition, decisions must be made about how to summarize the information within each partition. Because the number of possible ways to partition and summarize the data is combinatorial, a method for comparing the different possibilities will help guide researchers in choosing a good partitioning and summarization scheme for their application.
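The idea of storing "a representative value plus an estimate of the error" for each partition can be illustrated with a minimal sketch. This is not the method of the report. It assumes a one-dimensional data series, contiguous partitions of nearly equal size, the mean as the representative value, and the root-mean-square deviation as the error estimate. The function names `summarize_partitions` and `total_squared_error` are hypothetical.

```python
import math


def summarize_partitions(data, n_partitions):
    """Split `data` into contiguous partitions and summarize each one.

    Assumption for illustration: each partition is summarized by
    (size, mean, RMS deviation from the mean). The raw values are not kept.
    """
    n = len(data)
    summaries = []
    for i in range(n_partitions):
        # Integer arithmetic gives contiguous, nearly equal-sized blocks.
        start = i * n // n_partitions
        stop = (i + 1) * n // n_partitions
        block = data[start:stop]
        if not block:
            continue  # more partitions requested than data points
        rep = sum(block) / len(block)
        err = math.sqrt(sum((x - rep) ** 2 for x in block) / len(block))
        summaries.append((len(block), rep, err))
    return summaries


def total_squared_error(summaries):
    """Squared error incurred by replacing every value with its partition mean."""
    return sum(size * err ** 2 for size, _, err in summaries)


if __name__ == "__main__":
    # Synthetic stand-in for simulation output (an assumption, not real data).
    signal = [math.sin(2 * math.pi * i / 1000) for i in range(1000)]
    for k in (4, 16, 64):
        summaries = summarize_partitions(signal, k)
        # Three stored numbers per partition versus the total information lost.
        print(k, 3 * len(summaries), total_squared_error(summaries))
```

Printing the storage cost next to the total squared error for several partition counts is one simple way to compare candidate schemes. Other partitioning rules or summaries, such as a stored distribution, would need a different error measure.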

Authors:
Challacombe, Chelsea Jordan [1]; Casleton, Emily Michele [1]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
2017-09-26
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE Office of Science (SC). Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1396090
Report Number(s):
LA-UR-17-28730
DOE Contract Number:  
AC52-06NA25396
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Challacombe, Chelsea Jordan, and Casleton, Emily Michele. Choosing the best partition of the output from a large-scale simulation. United States: N. p., 2017. Web. doi:10.2172/1396090.
Challacombe, Chelsea Jordan, & Casleton, Emily Michele. (2017). Choosing the best partition of the output from a large-scale simulation. United States. doi:10.2172/1396090.
Challacombe, Chelsea Jordan, and Casleton, Emily Michele. 2017. "Choosing the best partition of the output from a large-scale simulation". United States. doi:10.2172/1396090. https://www.osti.gov/servlets/purl/1396090.
@article{osti_1396090,
title = {Choosing the best partition of the output from a large-scale simulation},
author = {Challacombe, Chelsea Jordan and Casleton, Emily Michele},
abstractNote = {Data partitioning becomes necessary when a large-scale simulation produces more data than can feasibly be stored. The goal is to partition the data, typically so that every element belongs to exactly one partition, and to store summary information about each partition: either a representative value plus an estimate of the error, or a distribution. Once the partitions are determined and the summary information is stored, the raw data is discarded. This process can be performed in situ, meaning while the simulation is running. When creating the partitions, researchers must make many decisions: for instance, how to determine when an adequate number of partitions has been created, how to divide the data to form the partitions, and how many variables to consider simultaneously. In addition, decisions must be made about how to summarize the information within each partition. Because the number of possible ways to partition and summarize the data is combinatorial, a method for comparing the different possibilities will help guide researchers in choosing a good partitioning and summarization scheme for their application.},
doi = {10.2172/1396090},
place = {United States},
year = {2017},
month = {Sep}
}
