Title: Building Simulation Modelers are we big-data ready?

Conference
OSTI ID:1157101

Recent advances in computing and sensor technologies have pushed the amount of data we collect or generate to limits previously unheard of. Sub-minute resolution data from dozens of channels is becoming increasingly common, and its volume is expected to grow as non-intrusive load monitoring becomes more prevalent. Experts are running larger building simulation experiments and face increasingly complex data sets from which they must derive meaningful insight. This paper focuses on the data management challenges that building modeling experts may face with data collected from a large array of sensors or generated by running a large number of building energy/performance simulations. The paper highlights the technical difficulties that were encountered and overcome in order to run 3.5 million EnergyPlus simulations on supercomputers and generate over 200 TB of simulation output. This extreme case involved the development of technologies and insights that will be beneficial to modelers in the immediate future. The paper discusses different database technologies (including relational databases, columnar storage, and schema-less Hadoop) and contrasts the advantages and disadvantages of employing each for storage of EnergyPlus output. Scalability, analysis requirements, and the adaptability of these database technologies are discussed. Additionally, the paper highlights unique attributes of EnergyPlus output that make data entry non-trivial for multiple simulations. Practical experience regarding cost-effective strategies for big-data storage is provided. The paper also discusses network performance issues that arise when transferring large amounts of data across a network to different computing devices. Practical issues involving lag, bandwidth, and methods for synchronizing or transferring logical portions of the data are presented. A cornerstone of big data is its use for analytics; data is useless unless information can be meaningfully derived from it.
In addition to the technical aspects of managing big data, the paper details the design of experiments in anticipation of large volumes of data. The cost of re-reading output into an analysis program is elaborated, and techniques that perform analysis in situ, as the simulations are run, are discussed. The paper concludes with an example and elaboration of the tipping point at which it becomes more expensive to store the output than to re-run a set of simulations.
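The tipping point mentioned in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's method; it assumes a simple linear cost model in which storage is billed per terabyte per month and a re-run is billed per core-hour. Only the simulation count (3.5 million) and output volume (200 TB) come from the abstract. The prices, the core-hours per simulation, and all function names are hypothetical values chosen for illustration.

```python
def rerun_cost(n_sims, core_hours_per_sim, usd_per_core_hour):
    """Cost of regenerating the output by re-running every simulation."""
    return n_sims * core_hours_per_sim * usd_per_core_hour


def storage_cost(terabytes, months, usd_per_tb_month):
    """Cost of keeping the output on disk for a given number of months."""
    return terabytes * months * usd_per_tb_month


def break_even_months(terabytes, usd_per_tb_month,
                      n_sims, core_hours_per_sim, usd_per_core_hour):
    """Months of retention after which storing costs more than one re-run."""
    monthly = terabytes * usd_per_tb_month
    if monthly <= 0:
        raise ValueError("monthly storage cost must be positive")
    return rerun_cost(n_sims, core_hours_per_sim, usd_per_core_hour) / monthly


if __name__ == "__main__":
    # From the abstract: 3.5 million simulations, roughly 200 TB of output.
    # Hypothetical assumptions: 0.1 core-hour per simulation,
    # 0.05 USD per core-hour, 20 USD per TB per month.
    months = break_even_months(
        terabytes=200, usd_per_tb_month=20.0,
        n_sims=3_500_000, core_hours_per_sim=0.1, usd_per_core_hour=0.05,
    )
    print(f"Storage overtakes a single re-run after about {months:.1f} months")
```

Under these assumed prices, keeping the output longer than the break-even period costs more than regenerating it once. A fuller model would also account for how often the data is re-read, transfer costs, and the time researchers spend waiting for a re-run.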

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Energy Efficiency and Renewable Energy (EERE)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1157101
Resource Relation:
Conference: Proceedings of the ASHRAE/IBPSA-USA Building Simulation Conference, Atlanta, GA, USA, September 10-12, 2014
Country of Publication:
United States
Language:
English
