OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: CloudBB: Scalable I/O Accelerator for Shared Cloud Storage

Authors:
Xu, T.; Sato, K.; Matsuoka, S.
Publication Date:
2016-07-07
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1342057
Report Number(s):
LLNL-CONF-696937
DOE Contract Number:
AC52-07NA27344
Resource Type:
Conference
Resource Relation:
Conference: Presented at the 22nd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2016), Wuhan, China, December 13-16, 2016
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Xu, T., Sato, K., and Matsuoka, S. CloudBB: Scalable I/O Accelerator for Shared Cloud Storage. United States: N. p., 2016. Web. doi:10.1109/ICPADS.2016.0074.
Xu, T., Sato, K., & Matsuoka, S. CloudBB: Scalable I/O Accelerator for Shared Cloud Storage. United States. doi:10.1109/ICPADS.2016.0074.
Xu, T., Sato, K., and Matsuoka, S. 2016. "CloudBB: Scalable I/O Accelerator for Shared Cloud Storage". United States. doi:10.1109/ICPADS.2016.0074. https://www.osti.gov/servlets/purl/1342057.
@article{osti_1342057,
  title = {CloudBB: Scalable I/O Accelerator for Shared Cloud Storage},
  author = {Xu, T. and Sato, K. and Matsuoka, S.},
  doi = {10.1109/ICPADS.2016.0074},
  place = {United States},
  year = {2016},
  month = {7}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

  • The growth in the computational capability of modern supercomputing systems has been accompanied by corresponding increases in CPU count, total RAM, and total storage capacity. Indeed, systems such as BlueGene/L [3], BlueGene/P, Ranger, and the Cray XT series have grown to more than 100k processors with 100 TeraBytes of RAM, and they are attached to multi-PetaByte storage systems. However, as part of this design evolution, large supercomputers have lost node-local storage elements, such as disks. While this decision was motivated by important considerations like overall system reliability, it also resulted in these systems losing a key level in their memory hierarchy, with nothing to fill the gap between local RAM and the parallel file system. While today's large supercomputers are typically attached to fast parallel file systems, which provide tens of GB/s of I/O bandwidth, the computational capacity has grown much faster than the storage bandwidth capacity. As such, these machines are now provided with much less than 1 GB/s of I/O bandwidth per TeraFlop of compute power, which is below the generally accepted limit required for a well-balanced system [8] [16]. The result is that today's limited I/O bandwidth is choking the capabilities of modern supercomputers, specifically by limiting their working sets and making fault tolerance techniques, such as checkpointing, prohibitively expensive. This paper presents an alternative system design oriented around using node-local storage to improve aggregate system I/O bandwidth. We focus on the checkpointing use case and present an experimental evaluation of SCR, a new checkpointing library that makes use of node-local storage to significantly improve the performance of checkpointing on large-scale supercomputers (a minimal sketch of this two-level write path appears after this list). Experiments show that SCR achieves unprecedented write speeds, reaching 700 GB/s on 8,752 processors. Our results scale such that we expect a similarly structured system of 12,500 processors to achieve an aggregate I/O bandwidth of 1 TB/s.
  • I/O-intensive applications are becoming increasingly common on today's high-performance computing systems. While the performance of compute-bound applications can be effectively guaranteed with techniques such as space sharing or QoS-aware process scheduling, it remains a challenge to meet QoS requirements for end users of I/O-intensive applications on shared storage systems, because it is difficult to differentiate I/O services for different applications with individual quality requirements. Furthermore, it is difficult for end users to accurately specify performance goals to the storage system using I/O-related metrics such as request latency or throughput. As access patterns, request rates, and the system workload change over time, a fixed I/O performance goal, such as a bound on throughput or latency, can be expensive to achieve and may not lead to a meaningful performance guarantee such as bounded program execution time. We propose a scheme supporting end users' QoS goals, specified in terms of program execution time, in shared storage environments. We automatically translate the users' performance goals into instantaneous I/O throughput bounds using a machine learning technique, and use dynamically determined service time windows to efficiently meet the throughput bounds (a simplified sketch of this translation appears after this list). We have implemented this scheme in the PVFS2 parallel file system and have conducted an extensive evaluation. Our results show that the scheme can satisfy realistic end-user QoS requirements while making highly efficient use of the I/O resources. The scheme balances programs' attainment of their QoS requirements and saves as much of the remaining I/O capacity as possible for best-effort programs.
  • Inter-application I/O contention and performance interference have been recognized as severe problems. In this work, we demonstrate, through measurements from Titan (the world's No. 3 supercomputer), that high I/O variance co-exists with the fact that individual storage units remain under-utilized for the majority of the time. This motivates us to propose AID, a system that performs automatic application I/O characterization and I/O-aware job scheduling. AID analyzes existing I/O traffic and batch job history logs, without any prior knowledge of applications or any user/developer involvement. It identifies the small set of I/O-intensive candidates among all applications running on a supercomputer and subsequently mines their I/O patterns using more detailed per-I/O-node traffic logs. Based on such auto-extracted information, AID provides online I/O-aware scheduling recommendations to steer I/O-intensive applications away from heavy ongoing I/O activities (see the scheduling sketch after this list). We evaluate AID on Titan, using both real applications (with extracted I/O patterns validated by contacting users) and our own pseudo-applications. Our results confirm that AID is able to (1) identify I/O-intensive applications and their detailed I/O characteristics, and (2) significantly reduce these applications' I/O performance degradation and variance by jointly evaluating outstanding applications' I/O patterns and the real-time system I/O load.
  • Our project consists of bleeding-edge research into replacing traditional storage archives with a parallel, cloud-based storage solution. It used OpenStack's Swift object store software and benchmarked Swift for write speed and scalability. Our project is unique because Swift is typically used for reads, whereas we are mostly concerned with write speeds (a sketch of such a write benchmark follows this list). Cloud storage is a viable archive solution because: (1) container management for larger parallel archives might ease the migration workload; (2) many tools written for cloud storage could be used with a local archive; and (3) current large-scale cloud storage practices in industry could be applied to manage a scalable archive solution.
  • In the race to PetaFLOP-speed supercomputing systems, the increase in computational capability has been accompanied by corresponding increases in CPU count, total RAM, and storage capacity. However, a proportional increase in storage bandwidth has lagged behind. In order to improve system reliability and to reduce maintenance effort for modern large-scale systems, system designers have opted to remove node-local storage from the compute nodes. Today's multi-TeraFLOP supercomputers are typically attached to parallel file systems that provide only tens of GB/s of I/O bandwidth. As a result, such machines have access to much less than 1 GB/s of I/O bandwidth per TeraFLOP of compute power, which is below the generally accepted limit required for a well-balanced system. In many ways, the current I/O bottleneck limits the capabilities of modern supercomputers, specifically by limiting their working sets and restricting fault tolerance techniques, which become critical on systems consisting of tens of thousands of components. This paper resolves the dilemma between high performance and high reliability by presenting an alternative system design that makes use of node-local storage to improve aggregate system I/O bandwidth. In this work, we focus on the checkpointing use case and present an experimental evaluation of the Scalable Checkpoint/Restart (SCR) library, a new adaptive checkpointing library that uses node-local storage to significantly improve the checkpointing performance of large-scale supercomputers. Experiments show that SCR achieves unprecedented write speeds, reaching a measured 700 GB/s of aggregate bandwidth on 8,752 processors and an estimated 1 TB/s for a similarly structured machine of 12,500 processors. This corresponds to a speedup of over 70x compared to the bandwidth provided by the 10 GB/s parallel file system the cluster uses (the arithmetic behind these figures is checked in the last sketch after this list). Further, SCR can adapt to an environment in which there is wide variation in performance or capacity among the individual node-local storage elements.
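
The two SCR abstracts above describe writing checkpoints quickly to node-local storage and only occasionally draining a copy to the parallel file system. The following is a minimal Python sketch of that two-level write path, not SCR itself (SCR is a C library with its own API); the directory names, drain interval, and checkpoint size are illustrative assumptions.

import os
import shutil
import time

# These relative directories stand in for fast node-local storage (e.g. an SSD
# or RAM disk on the compute node) and the much slower shared parallel file
# system; on a real machine they would be mount points such as /tmp and /lustre.
NODE_LOCAL_DIR = "node_local_ckpt"
PARALLEL_FS_DIR = "shared_pfs_ckpt"
DRAIN_EVERY = 10  # copy only every Nth checkpoint to the parallel file system

def write_checkpoint(step: int, state: bytes) -> str:
    """Write application state to node-local storage at near-local-device speed."""
    os.makedirs(NODE_LOCAL_DIR, exist_ok=True)
    path = os.path.join(NODE_LOCAL_DIR, f"ckpt_{step:06d}.bin")
    with open(path, "wb") as f:
        f.write(state)
        f.flush()
        os.fsync(f.fileno())  # make the local copy durable before reporting success
    return path

def maybe_drain(step: int, local_path: str) -> None:
    """Drain occasional checkpoints to the parallel FS, the scarce bandwidth resource."""
    if step % DRAIN_EVERY == 0:
        os.makedirs(PARALLEL_FS_DIR, exist_ok=True)
        shutil.copy(local_path, PARALLEL_FS_DIR)

if __name__ == "__main__":
    for step in range(1, 31):
        state = os.urandom(1 << 20)  # stand-in for 1 MiB of application state
        t0 = time.perf_counter()
        path = write_checkpoint(step, state)
        maybe_drain(step, path)
        print(f"step {step}: checkpoint written in {time.perf_counter() - t0:.4f}s")

Because most checkpoints never touch the parallel file system, the aggregate write rate is governed by the node-local devices, which is the effect the SCR measurements quantify.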
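
The PVFS2 QoS abstract translates a user's execution-time goal into an instantaneous I/O throughput bound using a machine-learning model. The sketch below substitutes a much simpler assumed model, time = a + b / throughput, fitted by least squares to past runs; the sample history and the deadline are invented for illustration and are not from the paper.

def fit_time_model(samples):
    """Least-squares fit of time = a + b * (1 / throughput)."""
    xs = [1.0 / tput for tput, _ in samples]
    ys = [t for _, t in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b  # a behaves like compute-only time, b like the I/O volume to move

def required_throughput(target_time, a, b):
    """Invert the model: the smallest throughput bound that still meets the deadline."""
    if target_time <= a:
        raise ValueError("goal is tighter than the compute-only time; infeasible")
    return b / (target_time - a)

if __name__ == "__main__":
    # (observed throughput in MB/s, observed runtime in seconds) from past runs
    history = [(50.0, 260.0), (100.0, 160.0), (200.0, 110.0), (400.0, 85.0)]
    a, b = fit_time_model(history)
    goal = 120.0  # the user asks to finish within two minutes
    print(f"grant roughly {required_throughput(goal, a, b):.1f} MB/s to this program")

The storage system would then enforce the computed bound over its service time windows, leaving the remaining capacity for best-effort programs.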
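
The AID abstract describes steering I/O-intensive jobs away from heavy ongoing I/O. The sketch below illustrates that kind of recommendation with toy data structures; the per-I/O-node capacity, load readings, and job pattern are assumptions, not AID's actual log-mining pipeline on Titan.

from dataclasses import dataclass

@dataclass
class IOPattern:
    peak_gbps: float       # peak aggregate bandwidth the job has been observed to drive
    burst_period_s: float  # how often it bursts, e.g. its checkpoint interval

def recommend_launch(candidate: IOPattern,
                     io_node_load_gbps: list,
                     io_node_capacity_gbps: float = 2.0) -> str:
    """Recommend launching now only if the job's burst fits in current I/O headroom."""
    headroom = sum(max(io_node_capacity_gbps - load, 0.0)
                   for load in io_node_load_gbps)
    if candidate.peak_gbps <= headroom:
        return "launch now: expected burst fits within current I/O headroom"
    return "defer or reroute: ongoing I/O leaves too little headroom"

if __name__ == "__main__":
    current_load = [1.8, 1.9, 0.4, 0.6]  # GB/s currently observed at each I/O node
    job = IOPattern(peak_gbps=1.0, burst_period_s=3600.0)
    print(recommend_launch(job, current_load))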
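
The Swift archiving abstract benchmarks write speed against OpenStack Swift. A minimal sketch of such a benchmark is given below, assuming the python-swiftclient package and a v1 auth endpoint; the URL, credentials, container name, object size, and concurrency are placeholders, not the project's actual harness.

import time
from concurrent.futures import ThreadPoolExecutor

from swiftclient.client import Connection  # pip install python-swiftclient

AUTH_URL = "http://swift.example.com:8080/auth/v1.0"  # placeholder endpoint
USER, KEY = "test:tester", "testing"                  # placeholder credentials
CONTAINER = "write-benchmark"
OBJECT_SIZE = 64 * 1024 * 1024                        # 64 MiB per object
NUM_OBJECTS = 32
CONCURRENCY = 8

def put_one(index: int) -> None:
    # One connection per worker, since Connection objects are not shared across threads.
    conn = Connection(authurl=AUTH_URL, user=USER, key=KEY, auth_version="1")
    conn.put_object(CONTAINER, f"obj-{index:04d}", contents=b"\0" * OBJECT_SIZE)

if __name__ == "__main__":
    setup = Connection(authurl=AUTH_URL, user=USER, key=KEY, auth_version="1")
    setup.put_container(CONTAINER)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        list(pool.map(put_one, range(NUM_OBJECTS)))
    elapsed = time.perf_counter() - start
    total_mib = NUM_OBJECTS * OBJECT_SIZE / (1024 * 1024)
    print(f"wrote {total_mib:.0f} MiB in {elapsed:.1f} s "
          f"({total_mib / elapsed:.1f} MiB/s aggregate)")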
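
Finally, the scaling figures quoted in the last abstract follow from simple arithmetic on the numbers it reports, reproduced below.

measured_gbps = 700.0   # GB/s measured on 8,752 processors
pfs_gbps = 10.0         # GB/s delivered by the cluster's parallel file system
procs_measured = 8_752
procs_projected = 12_500

speedup = measured_gbps / pfs_gbps                              # 70x over the parallel FS
projected_gbps = measured_gbps * procs_projected / procs_measured
print(f"speedup over the parallel file system: {speedup:.0f}x")
print(f"projected bandwidth at {procs_projected} processors: "
      f"{projected_gbps:.0f} GB/s (about 1 TB/s)")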