SciTech Connect

Title: Nationwide Buildings Energy Research enabled through an integrated Data Intensive Scientific Workflow and Advanced Analysis Environment

Modern workflow systems enable scientists to run ensemble simulations at unprecedented scales and levels of complexity, allowing them to study system sizes that were previously out of reach because of the resource requirements of the modeling work. However, as a result of these new capabilities, science teams also face unprecedented data volumes that they cannot analyze in a timely fashion with their existing tools and methodologies. In this paper we describe the ongoing development of an integrated data intensive scientific workflow and analysis environment that lets researchers easily create and execute complex simulation studies and provides them with different scalable methods to analyze the resulting data volumes. Integrating the simulation and analysis environments is not only a question of ease of use; it also supports fundamental functions in the correlated analysis of simulation input, execution details and derived results for multi-variant, complex studies. To this end the team extended and integrated the existing capabilities of the Velo data management and analysis infrastructure, the MeDICi data intensive workflow system and RHIPE, the R for Hadoop version of the well-known statistics package, and developed a new visual analytics interface for the exploitation of results by multi-domain users. The capabilities of the new environment are demonstrated on a use case that focuses on the Pacific Northwest National Laboratory (PNNL) building energy team, showing how they were able to take their previously local scale simulations to a nationwide level by utilizing data intensive computing techniques not only for their modeling work, but also for the subsequent analysis of their modeling results.
As part of the PNNL research initiative PRIMA (Platform for Regional Integrated Modeling and Analysis), the team performed an initial 3-year study of building energy demands for the US Eastern Interconnect domain, which they now plan to extend to predict the demand for the complete century. The initial study raised their data demands from a few GB to 400 GB for the 3-year study, with tens of TB expected for the full century.
 [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [2] ;  [3]
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  2. Carnegie Mellon Univ., Pittsburgh, PA (United States)
  3. Concordia Univ., Montreal, QC (Canada)
Publication Date:
OSTI Identifier:
Report Number(s):
Journal ID: ISSN 1996-3599
DOE Contract Number:
Resource Type:
Journal Article
Resource Relation:
Journal Name: Building Simulation; Journal Volume: 7; Journal Issue: 4
Research Org:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Org:
Country of Publication:
United States
99 GENERAL AND MISCELLANEOUS; Building energy; data analysis; scientific workflow; data intensive; data management; Hadoop; RHIPE