OSTI.GOV | U.S. Department of Energy
Office of Scientific and Technical Information

Title: BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems

Abstract

This work was part of a larger project, Blackcomb2, centered at Oak Ridge National Laboratory (Jeff Vetter, PI), to investigate opportunities for replacing or supplementing DRAM main memory with non-volatile memory (NV memory) in exascale memory systems. The goal was to reduce the energy consumed by future supercomputer memory systems and to improve their resilience. Building on the accomplishments of the original Blackcomb project, funded in 2010, the goal of Blackcomb2 was to identify, evaluate, and optimize the most promising emerging memory, architecture, and software technologies, which are essential to providing the necessary memory capacity, performance, resilience, and energy efficiency in exascale systems. Capacity and energy are the key drivers.

Authors:
Mudge, Trevor [1]
  1. Univ. of Michigan, Ann Arbor, MI (United States)
Publication Date:
December 15, 2017
Research Org.:
Univ. of Michigan, Ann Arbor, MI (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
Contributing Org.:
Oak Ridge National Labs; Hewlett Packard Labs
OSTI Identifier:
1413470
Report Number(s):
DOE-Michigan-12295
DOE Contract Number:
SC0012295
Resource Type:
Technical Report
Resource Relation:
Related Information: 1. Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing, Sandunmalee Nilmini Abeyratne, Ph.D. Thesis, The University of Michigan, 2017. 2. N. Abeyratne, H.-M. Chen, B. Oh, R. Dreslinski, C. Chakrabarti, T. Mudge. Checkpointing Exascale Memory Systems with Existing Memory Technologies. MEMSYS, Washington, DC, October 2016, pp. 18-28.
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; High Performance Computing; Memory Systems; Non-volatile memory; Exascale

Citation Formats

Mudge, Trevor. BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems. United States: N. p., 2017. Web. doi:10.2172/1413470.
Mudge, Trevor. BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems. United States. doi:10.2172/1413470.
Mudge, Trevor. December 15, 2017. "BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems". United States. doi:10.2172/1413470. https://www.osti.gov/servlets/purl/1413470.
@article{osti_1413470,
title = {BLACKCOMB2: Hardware-software co-design for non-volatile memory in exascale systems},
author = {Mudge, Trevor},
abstractNote = {This work was part of a larger project, Blackcomb2, centered at Oak Ridge National Laboratory (Jeff Vetter, PI), to investigate opportunities for replacing or supplementing DRAM main memory with non-volatile memory (NV memory) in exascale memory systems. The goal was to reduce the energy consumed by future supercomputer memory systems and to improve their resilience. Building on the accomplishments of the original Blackcomb project, funded in 2010, the goal of Blackcomb2 was to identify, evaluate, and optimize the most promising emerging memory, architecture, and software technologies, which are essential to providing the necessary memory capacity, performance, resilience, and energy efficiency in exascale systems. Capacity and energy are the key drivers.},
doi = {10.2172/1413470},
place = {United States},
year = {2017},
month = {Dec}
}

Technical Report:

  • Summary of technical results of Blackcomb Memory Devices. We explored several memory technologies (STT-RAM, PCRAM, FeRAM, and ReRAM). The progress can be classified into three categories, described below. Modeling and Tool Releases: Various modeling tools have been developed over the last decade to help in the design of SRAM- or DRAM-based memory hierarchies. To explore the new design opportunities that NVM technologies offer designers, we developed similar high-level models for NVM, including PCRAMsim [Dong 2009], NVSim [Dong 2012], and NVMain [Poremba 2012]. NVSim is a circuit-level model for NVM performance, energy, and area estimation that supports various NVM technologies, including STT-RAM, PCRAM, ReRAM, and legacy NAND Flash. NVSim has been validated against industrial NVM prototypes and is expected to help boost architecture-level NVM studies. NVMain, in turn, is a cycle-accurate main memory simulator designed to model emerging non-volatile memories at the architectural level. We have released these models as open-source tools and provided continuous support for them. We also proposed PS3-RAM, a fast, portable, and scalable statistical STT-RAM reliability analysis model [Wen 2012]. Design Space Exploration and Optimization: With the support of these models, we explored different device/circuit optimization techniques. For example, in [Niu 2012a] we studied power-reduction techniques for applying ECC in ReRAM designs and proposed using ECC to relax the bit error rate (BER) requirement of a single memory cell, improving write energy consumption and latency for both 1T1R and cross-point ReRAM designs. In [Xu 2011], we proposed a methodology for designing STT-RAM for different optimization goals, such as read performance, write performance, and write energy, by leveraging the trade-off between the write current and write time of the MTJ. We also studied the trade-offs in building a reliable cross-point ReRAM array [Niu 2012b]. We conducted an in-depth analysis of the circuit- and system-level design implications of multi-level cell resistive RAM (MLC ReRAM) from performance, power, and reliability perspectives [Xu 2013]. The objective of this study was to understand the design trade-offs of this technology with respect to MLC phase-change memory (MLC PCM). Our MLC ReRAM design at the circuit and system levels indicates that different resistance allocation schemes, programming strategies, peripheral designs, and material selections profoundly affect the area, latency, power, and reliability of MLC ReRAM. Based on this analysis, we conducted two case studies: first, we compared MLC ReRAM against MLC phase-change memory (PCM) and multi-layer cross-point ReRAM designs and pointed out why multi-level ReRAM is appealing; second, we further explored the design space for MLC ReRAM. Architecture and Application: We explored hybrid checkpointing using phase-change memory for future exascale systems [Dong 2011] and showed that using non-volatile memory for local checkpointing significantly increases the number of faults covered by local checkpoints and reduces the probability of a global failure in the middle of a global checkpoint to less than 1% (a toy two-level checkpoint/restart sketch appears after this list). We also proposed a technique called i2WAP to mitigate write variation in NVM-based last-level caches and improve NVM lifetime [Wang 2013].
Our wear-leveling technique works around the limits of write endurance by arranging data accesses so that write operations are distributed evenly across all the storage cells (a generic wear-leveling sketch appears after this list). During our research on fault-tolerant NVM design, we found that ECC cannot effectively tolerate hard errors arising from limited write endurance and process imperfection. Therefore, we devised a novel Point and Discard (PAD) architecture in [ 2012] as a hard-error-tolerant architecture for ReRAM-based last-level caches. PAD improves the lifetime of ReRAM caches by 1.6X-440X under different process variations, without performance overhead in the system's early life. We have also investigated the applicability of NVM to persistent memory design [Zhao 2013]. New byte-addressable NVM enables fast persistent memory that allows in-memory persistent data objects to be updated with much higher throughput. Despite this significant improvement, the performance of these designs is only 50% of that of a native system with no persistence support, because of the logging or copy-on-write mechanisms used to update persistent memory. A challenge in this approach is therefore how to efficiently enable atomic, consistent, and durable updates that ensure data persistence surviving application and/or system failures. We designed a persistent memory system, called Kiln, that provides performance close to that of the native system. The Kiln design adopts a non-volatile cache and a non-volatile main memory to construct a multi-versioned durable memory system, enabling atomic updates without logging or copy-on-write (a simplified version-switching analogy appears after this list). Our evaluation shows that the proposed Kiln mechanism can achieve up to 2X performance improvement over NVRAM-based persistent memory employing write-ahead logging. In addition, our design has several practical advantages: a simple and intuitive abstract interface, microarchitecture-level optimizations, fast recovery from failures, and no redundant writes to slow non-volatile storage media. The work was published in MICRO 2013 and received a Best Paper Honorable Mention award.
  • In this grant, we enhanced the Palacios virtual machine monitor to increase its scalability and suitability for addressing exascale system software design issues. This included a wide range of research on core Palacios features, large-scale system emulation, fault injection, performance monitoring, and VMM extensibility. This research resulted in a large number of high-impact publications in well-known venues, the support of a number of students, and the graduation of two Ph.D. students and one M.S. student. In addition, our enhanced version of the Palacios virtual machine monitor has been adopted as a core element of the Hobbes operating system under active DOE-funded research and development.
  • This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems software for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. For effort (3), although a prototype connecting Palacios with the GEM5 architectural simulator was demonstrated, our conclusion was that such a platform was less useful for design space exploration than anticipated, due to the inherent complexity of the connection between the instruction set architecture level and the microarchitectural level. For effort (4), we found that a code injection approach proved to be more fruitful. The results of our efforts are publicly available in the open source Palacios codebase and published papers, all of which are available from the project web site, v3vee.org. Palacios is currently one of the two codebases (the other being Sandia’s Kitten lightweight kernel) that underlie the node operating system for the DOE Hobbes Project, one of two projects tasked with building a systems software prototype for the national exascale computing effort.
  • Recently, advances in processor architecture have become the driving force for new programming models in the computing industry, as ever newer multicore processor designs with an increasing number of cores are introduced on schedules regimented by marketing demands. As a result, collaborative parallel (rather than simply concurrent) implementations of important applications, programming languages, models, and even algorithms have been forced to adapt to these architectures to exploit the available raw performance. We believe that this optimization regime is flawed. In this paper, we present an alternate approach that, rather than starting with an existing hardware/software solution laced with hidden assumptions, defines the computational problems of interest and invites architects, researchers, and programmers to implement novel hardware/software co-designed solutions. Our work builds on the previous ideas of computational dwarfs, motifs, and parallel patterns by selecting a representative set of essential problems for which we provide: an algorithmic description, a scalable problem definition, illustrative reference implementations, and verification schemes. This testbed will enable comparative research in areas such as parallel programming models, languages, auto-tuning, and hardware/software co-design. For simplicity, we focus initially on the computational problems of interest to the scientific computing community, but we proclaim the methodology (and perhaps a subset of the problems) as applicable to other communities. We intend to broaden the coverage of this problem space through stronger community involvement.
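
The hybrid local/global checkpointing result summarized in the first bullet above can be illustrated with a minimal sketch, written here in C. This is not the model or PCM-based design from [Dong 2011]; it is a toy simulation in which frequent checkpoints are assumed to land in fast node-local non-volatile memory and a rarer, more expensive global checkpoint serves as the fallback. All fault rates and checkpoint intervals are invented purely for illustration.

/* Toy two-level checkpoint/restart loop (illustrative only; not the
 * model from [Dong 2011]). Frequent, cheap checkpoints go to fast
 * node-local NVM and cover most faults; rare system-wide faults fall
 * back to an older, more expensive global checkpoint. All intervals
 * and fault rates below are made-up numbers. */
#include <stdio.h>
#include <stdlib.h>

enum fault { NONE, LOCAL_FAULT, GLOBAL_FAULT };

/* Hypothetical per-step fault model. */
static enum fault inject_fault(void)
{
    int r = rand() % 10000;
    if (r < 100) return LOCAL_FAULT;   /* 1.0%: recoverable from local NVM */
    if (r < 101) return GLOBAL_FAULT;  /* 0.01%: needs the global copy     */
    return NONE;
}

int main(void)
{
    long progress = 0;                 /* steps of useful work completed   */
    long local_ckpt = 0, global_ckpt = 0;
    long local_restores = 0, global_restores = 0;

    srand(1);
    for (long step = 1; step <= 100000; ++step) {
        ++progress;
        if (step % 10 == 0)   local_ckpt  = progress;   /* cheap, frequent */
        if (step % 1000 == 0) global_ckpt = progress;   /* costly, rare    */

        switch (inject_fault()) {
        case LOCAL_FAULT:              /* roll back to the local NVM copy  */
            progress = local_ckpt;
            ++local_restores;
            break;
        case GLOBAL_FAULT:             /* roll back to the global copy     */
            progress = global_ckpt;
            local_ckpt = global_ckpt;  /* local copy is lost with the node */
            ++global_restores;
            break;
        case NONE:
            break;
        }
    }
    printf("restores from local NVM checkpoints: %ld\n", local_restores);
    printf("restores from the global checkpoint: %ld\n", global_restores);
    printf("work retained after rollbacks: %ld of 100000 steps\n", progress);
    return 0;
}

Because almost all injected faults are covered by the cheap local copy, the expensive global rollback is exercised only a handful of times, which is the qualitative point the abstract makes about coverage.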
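The wear-leveling sentence in the first bullet states the general goal: arrange accesses so that writes spread evenly over all cells. The sketch below is a generic, table-based rotation scheme written only to make that idea concrete; it is not the i2WAP technique of [Wang 2013], and the array sizes, rotation period, and round-robin victim choice are all assumptions.

/* Minimal sketch of table-based wear leveling (illustrative only; NOT
 * the i2WAP scheme from [Wang 2013]). One spare physical line plus a
 * periodic remap step spreads writes across all cells even when a
 * single logical line is written constantly. */
#include <stdio.h>

#define N_LOGICAL    4                 /* logical lines (assumed size)    */
#define N_PHYSICAL   (N_LOGICAL + 1)   /* one spare physical line         */
#define ROTATE_EVERY 8                 /* writes between remap steps      */

static int      map[N_LOGICAL];        /* logical -> physical mapping     */
static int      spare;                 /* currently unmapped physical line*/
static unsigned wear[N_PHYSICAL];      /* per-line write counts           */
static unsigned writes;

static void init(void)
{
    for (int i = 0; i < N_LOGICAL; ++i) map[i] = i;
    spare = N_LOGICAL;
}

/* Every ROTATE_EVERY writes, move one logical line into the spare slot
 * so that hot logical lines do not stay pinned to one physical line.    */
static void maybe_rotate(void)
{
    if (writes % ROTATE_EVERY != 0) return;
    int victim = (writes / ROTATE_EVERY) % N_LOGICAL;  /* round-robin     */
    int old = map[victim];
    /* (in real hardware the line's data would be copied here)            */
    map[victim] = spare;
    spare = old;
}

static void write_line(int logical)
{
    ++writes;
    ++wear[map[logical]];
    maybe_rotate();
}

int main(void)
{
    init();
    for (int i = 0; i < 1000; ++i)
        write_line(0);                 /* worst case: one hot line        */
    for (int p = 0; p < N_PHYSICAL; ++p)
        printf("physical line %d: %u writes\n", p, wear[p]);
    return 0;
}

Even though every write targets logical line 0, the printed per-line wear counts come out roughly equal, which is the effect the abstract describes.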
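The Kiln description in the first bullet argues that atomic, durable updates can come from keeping multiple versions of data rather than a write-ahead log. The C sketch below conveys only that version-switching idea at user level: the new version is written into an inactive slot and published with one atomic index flip, so a crash never exposes a half-written object. Kiln itself realizes multi-versioning in hardware, using a non-volatile cache over non-volatile main memory, and avoids the explicit copy shown here; nothing in this sketch reflects its actual microarchitecture.

/* User-level analogy of multi-versioned atomic updates (NOT the Kiln
 * hardware design from [Zhao 2013]). A new version of an object is
 * written into an inactive slot, then a single atomic index flip makes
 * it current, so readers and crash recovery only ever see a complete
 * version and no write-ahead log is needed. */
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

struct record { char payload[64]; };

static struct record slots[2];         /* two versions of the object      */
static _Atomic int   current = 0;      /* index of the live version       */

/* Atomically install a new version of the record. */
static void durable_update(const char *new_payload)
{
    int next = 1 - atomic_load(&current);
    /* 1. write the new version into the inactive slot                    */
    strncpy(slots[next].payload, new_payload, sizeof slots[next].payload - 1);
    slots[next].payload[sizeof slots[next].payload - 1] = '\0';
    /* 2. (on real NVM: flush and fence the slot here so it is durable)   */
    /* 3. a single atomic flip publishes the complete version             */
    atomic_store(&current, next);
}

int main(void)
{
    durable_update("version A");
    durable_update("version B");
    printf("live version: %s\n", slots[atomic_load(&current)].payload);
    return 0;
}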