OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Lustre Distributed Name Space (DNE) Evaluation at the Oak Ridge Leadership Computing Facility (OLCF)

Abstract

This document describes the Lustre Distributed Name Space (DNE) evaluation carried out at the Oak Ridge Leadership Computing Facility (OLCF) between 2014 and 2015. DNE is a development project funded by OpenSFS to improve Lustre metadata performance and scalability. The development effort was split into two phases: the first phase (DNE P1) provided support for remote directories over remote Lustre Metadata Server (MDS) nodes and Metadata Target (MDT) devices, while the second phase (DNE P2) addressed split directories over multiple remote MDS nodes and MDT devices. The OLCF has been actively evaluating the performance, reliability, and functionality of both DNE phases. Internal OLCF testbeds were used for these tests. Results are promising, and the OLCF is planning a full DNE deployment on production systems by mid-2016.

Authors:
Simmons, James S. [1]; Leverman, Dustin B. [1]; Hanley, Jesse A. [1]; Oral, Sarp [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Center for Computational Sciences
Publication Date:
August 2016
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1302888
Report Number(s):
ORNL/TM-2015/608
DOE Contract Number:
AC05-00OR22725
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Simmons, James S., Leverman, Dustin B., Hanley, Jesse A., and Oral, Sarp. Lustre Distributed Name Space (DNE) Evaluation at the Oak Ridge Leadership Computing Facility (OLCF). United States: N. p., 2016. Web. doi:10.2172/1302888.
Simmons, James S., Leverman, Dustin B., Hanley, Jesse A., & Oral, Sarp. Lustre Distributed Name Space (DNE) Evaluation at the Oak Ridge Leadership Computing Facility (OLCF). United States. doi:10.2172/1302888.
Simmons, James S., Leverman, Dustin B., Hanley, Jesse A., and Oral, Sarp. 2016. "Lustre Distributed Name Space (DNE) Evaluation at the Oak Ridge Leadership Computing Facility (OLCF)". United States. doi:10.2172/1302888. https://www.osti.gov/servlets/purl/1302888.
@techreport{osti_1302888,
title = {Lustre Distributed Name Space (DNE) Evaluation at the Oak Ridge Leadership Computing Facility (OLCF)},
author = {Simmons, James S. and Leverman, Dustin B. and Hanley, Jesse A. and Oral, Sarp},
abstractNote = {This document describes the Lustre Distributed Name Space (DNE) evaluation carried out at the Oak Ridge Leadership Computing Facility (OLCF) between 2014 and 2015. DNE is a development project funded by OpenSFS to improve Lustre metadata performance and scalability. The development effort was split into two phases: the first phase (DNE P1) provided support for remote directories over remote Lustre Metadata Server (MDS) nodes and Metadata Target (MDT) devices, while the second phase (DNE P2) addressed split directories over multiple remote MDS nodes and MDT devices. The OLCF has been actively evaluating the performance, reliability, and functionality of both DNE phases. Internal OLCF testbeds were used for these tests. Results are promising, and the OLCF is planning a full DNE deployment on production systems by mid-2016.},
doi = {10.2172/1302888},
institution = {Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)},
number = {ORNL/TM-2015/608},
place = {United States},
year = 2016,
month = 8
}

  • Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools and resources for next-generation systems.
  • Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.5 billion core hours in calendar year (CY) 2010 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Scientific achievements by OLCF users range from collaboration with university experimentalists to produce a working supercapacitor that uses atom-thick sheets of carbon materials to finely determining the resolution requirements for simulations of coal gasifiers and their components, thus laying the foundation for development of commercial-scale gasifiers. OLCF users are pushing the boundaries with software applications sustaining more than one petaflop of performance in the quest to illuminate the fundamental nature of electronic devices. Other teams of researchers are working to resolve predictive capabilities of climate models, to refine and validate genome sequencing, and to explore the most fundamental materials in nature - quarks and gluons - and their unique properties. Details of these scientific endeavors - not possible without access to leadership-class computing resources - are detailed in Section 4 of this report and in the INCITE in Review. Effective operations of the OLCF play a key role in the scientific missions and accomplishments of its users. This Operational Assessment Report (OAR) will delineate the policies, procedures, and innovations implemented by the OLCF to continue delivering a petaflop-scale resource for cutting-edge research. The 2010 operational assessment of the OLCF yielded recommendations that have been addressed (Reference Section 1) and where appropriate, changes in Center metrics were introduced. This report covers CY 2010 and CY 2011 Year to Date (YTD) that unless otherwise specified, denotes January 1, 2011 through June 30, 2011. User Support remains an important element of the OLCF operations, with the philosophy 'whatever it takes' to enable successful research. Impact of this center-wide activity is reflected by the user survey results that show users are 'very satisfied.' The OLCF continues to aggressively pursue outreach and training activities to promote awareness - and effective use - of U.S. leadership-class resources (Reference Section 2). The OLCF continues to meet and in many cases exceed DOE metrics for capability usage (35% target in CY 2010, delivered 39%; 40% target in CY 2011, 54% January 1, 2011 through June 30, 2011). The Schedule Availability (SA) and Overall Availability (OA) for Jaguar were exceeded in CY2010. Given the solution to the VRM problem the SA and OA for Jaguar in CY 2011 are expected to exceed the target metrics of 95% and 90%, respectively (Reference Section 3). Numerous and wide-ranging research accomplishments, scientific support, and technological innovations are more fully described in Sections 4 and 6 and reflect OLCF leadership in enabling high-impact science solutions and vision in creating an exascale-ready center. Financial Management (Section 5) and Risk Management (Section 7) are carried out using best practices approved of by DOE. The OLCF has a valid cyber security plan and Authority to Operate (Section 8). The proposed metrics for 2012 are reflected in Section 9.
  • Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.4 billion core hours in calendar year (CY) 2011 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Users reported more than 670 publications this year arising from their use of OLCF resources. Of these we report the 300 in this review that are consistent with guidance provided. Scientific achievements by OLCF users cut across all range scales from atomic to molecular to large-scale structures. At the atomic scale, researchers discovered that the anomalously long half-life of Carbon-14 can be explained by calculating, for the first time, the very complex three-body interactions between all the neutrons and protons in the nucleus. At the molecular scale, researchers combined experimental results from LBL's light source and simulations on Jaguar to discover how DNA replication continues past a damaged site so a mutation can be repaired later. Other researchers combined experimental results from ORNL's Spallation Neutron Source and simulations on Jaguar to reveal the molecular structure of ligno-cellulosic material used in bioethanol production. This year, Jaguar has been used to do billion-cell CFD calculations to develop shock wave compression turbo machinery as a means to meet DOE goals for reducing carbon sequestration costs. General Electric used Jaguar to calculate the unsteady flow through turbo machinery to learn what efficiencies the traditional steady flow assumption is hiding from designers. Even a 1% improvement in turbine design can save the nation billions of gallons of fuel.