skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: A scalable messaging system for accelerating discovery from large scale scientific simulations

Conference ·
OSTI ID:1096347

Emerging scientific and engineering simulations running at scale on leadership-class High End Computing (HEC) environments are producing large volumes of data, which has to be transported and analyzed before any insights can result from these simulations. The complexity and cost (in terms of time and energy) associated with managing and analyzing this data have become significant challenges, and are limiting the impact of these simulations. Recently, data-staging approaches along with in-situ and in-transit analytics have been proposed to address these challenges by offloading I/O and/or moving data processing closer to the data. However, scientists continue to be overwhelmed by the large data volumes and data rates. In this paper we address this latter challenge. Specifically, we propose a highly scalable and low-overhead associative messaging framework that runs on the data staging resources within the HEC platform, and builds on the staging-based online in-situ/in- transit analytics to provide publish/subscribe/notification-type messaging patterns to the scientist. Rather than having to ingest and inspect the data volumes, this messaging system allows scientists to (1) dynamically subscribe to data events of interest, e.g., simple data values or a complex function or simple reduction (max()/min()/avg()) of the data values in a certain region of the application domain is greater/less than a threshold value, or certain spatial/temporal data features or data patterns are detected; (2) define customized in-situ/in-transit actions that are triggered based on the events, such as data visualization or transformation; and (3) get notified when these events occur. The key contribution of this paper is a design and implementation that can support such a messaging abstraction at scale on high- end computing (HEC) systems with minimal overheads. We have implemented and deployed the messaging system on the Jaguar Cray XK6 machines at Oak Ridge National Laboratory and the Lonestar system at the Texas Advanced Computing Center (TACC), and we present the experimental performance evaluation using these HEC platforms in the paper.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1096347
Resource Relation:
Conference: High Performance Computing (HiPC), 2012 19th International Conference on, Pune, India, 20121218, 20121218
Country of Publication:
United States
Language:
English

Similar Records

Related Subjects