Title: R&D100: Lightweight Distributed Metric Service

On today's high-performance computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS collects, transports, and stores data from extreme-scale systems at fidelities and timescales sufficient to provide understanding of application and system performance, with no statistically significant impact on application performance.
Authors:
; ; ;
Publication Date:
OSTI Identifier:
1328737
Resource Type:
Other
Research Org:
SNL (Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States))
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; HIGH PERFORMANCE COMPUTING SYSTEMS; DATA; COMPUTER RESOURCES; LDMS; NCSA BLUE WATERS SUPERCOMPUTER