Title: Exploiting Internal Parallelism for Address Translation in Solid-State Drives

Abstract

Solid-state Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism (parallel access to multiple internal flash memory chips) and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. Finally, we also found that Parallel-DFTL achieves reasonable performance with a very small cache size and that it provides the greatest benefit for workloads with large request sizes or high write ratios.
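
The queue separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain LRU mapping cache rather than the Parallel-LRU algorithm, and the names `MappingCache`, `split_requests`, and `flash_table` are hypothetical, introduced only for illustration. The sketch shows how a demand-based cache miss creates an extra address translation operation, and how those operations can be placed on their own queue instead of being interleaved with data accesses.

```python
from collections import OrderedDict, deque


class MappingCache:
    """Toy demand-based cache of logical-to-physical page mappings.

    Assumption: plain LRU eviction, not the paper's Parallel-LRU.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # logical page number -> physical page number

    def lookup(self, lpn):
        """Return True on a hit and mark the entry as most recently used."""
        if lpn in self.entries:
            self.entries.move_to_end(lpn)
            return True
        return False

    def insert(self, lpn, ppn):
        """Cache a mapping, evicting the least recently used entry if full."""
        self.entries[lpn] = ppn
        self.entries.move_to_end(lpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def split_requests(lpns, cache, flash_table):
    """Place mapping fetches and data accesses on separate queues.

    flash_table stands in for the full mapping table stored on flash
    (hypothetical: a dict from logical to physical page numbers).
    """
    translation_queue = deque()  # mapping fetches caused by cache misses
    data_queue = deque()         # the data accesses themselves
    for lpn in lpns:
        if not cache.lookup(lpn):
            translation_queue.append(lpn)
            cache.insert(lpn, flash_table[lpn])
        data_queue.append(lpn)
    return translation_queue, data_queue


flash_table = {i: 100 + i for i in range(8)}
cache = MappingCache(capacity=2)
tq, dq = split_requests([0, 1, 0, 2, 1], cache, flash_table)
print(list(tq), list(dq))
```

With the two queues kept apart, a scheduler is free to issue the entries of each queue to different flash chips concurrently; how that scheduling is done is outside the scope of this sketch.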

Authors:
Xie, Wei [1]; Chen, Yong [1]; Roth, Philip C. [2]
  1. Texas Tech Univ., Lubbock, TX (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
2018-12-15
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1490593
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
ACM Transactions on Storage
Additional Journal Information:
Journal Volume: 14; Journal Issue: 4; Journal ID: ISSN 1553-3077
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Flash translation layer; SSD; parallelism; DFTL; address translation

Citation Formats

Xie, Wei, Chen, Yong, and Roth, Philip C. Exploiting Internal Parallelism for Address Translation in Solid-State Drives. United States: N. p., 2018. Web. doi:10.1145/3239564.
Xie, Wei, Chen, Yong, & Roth, Philip C. (2018). Exploiting Internal Parallelism for Address Translation in Solid-State Drives. United States. https://doi.org/10.1145/3239564
Xie, Wei, Chen, Yong, and Roth, Philip C. 2018. "Exploiting Internal Parallelism for Address Translation in Solid-State Drives". United States. https://doi.org/10.1145/3239564. https://www.osti.gov/servlets/purl/1490593.
@article{osti_1490593,
title = {Exploiting Internal Parallelism for Address Translation in Solid-State Drives},
author = {Xie, Wei and Chen, Yong and Roth, Philip C.},
abstractNote = {Solid-state Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism (parallel access to multiple internal flash memory chips) and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. Finally, we also found that Parallel-DFTL achieves reasonable performance with a very small cache size and that it provides the greatest benefit for workloads with large request sizes or high write ratios.},
doi = {10.1145/3239564},
url = {https://www.osti.gov/biblio/1490593},
journal = {ACM Transactions on Storage},
issn = {1553-3077},
number = 4,
volume = 14,
place = {United States},
year = {2018},
month = dec
}

Citation Metrics:
Cited by: 5 works
Citation information provided by
Web of Science

Works referenced in this record:

Achieving page-mapping FTL performance at block-mapping FTL cost by hiding address translation
conference, May 2010


Hot data identification for flash-based storage systems using multiple bloom filters
conference, May 2011


FlashSim: A Simulator for NAND Flash-Based Solid-State Drives
conference, September 2009


A mean field model for a class of garbage collection algorithms in flash-based solid state drives
conference, January 2013


LazyFTL: a page-level flash translation layer optimized for NAND flash memory
conference, January 2011


Efficient identification of hot data for flash memory storage systems
journal, February 2006


Sprinkler: Maximizing resource utilization in many-chip solid state disks
conference, February 2014


Analytic modeling of SSD write performance
conference, January 2012


Performance impact and interplay of SSD parallelism through advanced commands, allocation strategy and data granularity
conference, January 2011


A space-efficient flash translation layer for CompactFlash systems
journal, May 2002


Hydra: A Block-Mapped Parallel Flash Memory Solid-State Disk Architecture
journal, July 2010


CBM: A cooperative buffer management for SSD
conference, June 2014


FASTer FTL for Enterprise-Class Flash Memory SSDs
conference, May 2010


Ozone (O3): An Out-of-Order Flash Memory Controller Architecture
journal, May 2011


A log buffer-based flash translation layer using fully-associative sector translation
journal, July 2007


On the role of burst buffers in leadership-class storage systems
conference, April 2012


The performance of PC solid-state disks (SSDs) as a function of bandwidth, concurrency, device architecture, and system organization
conference, January 2009


Two-mode data distribution scheme for heterogeneous storage in data centers
conference, October 2015


Elastic Consistent Hashing for Distributed Storage Systems
conference, May 2017


Hystor: making the best use of solid state drives in high performance storage systems
conference, January 2011


ASA-FTL: An adaptive separation aware flash translation layer for solid state drives
journal, January 2017


DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings
conference, January 2009

  • Gupta, Aayush; Kim, Youngjae; Urgaonkar, Bhuvan
  • Proceeding of the 14th international conference on Architectural support for programming languages and operating systems - ASPLOS '09
  • https://doi.org/10.1145/1508244.1508271

Hot/cold clustering for page mapping in NAND flash memory
journal, November 2011


Performance of greedy garbage collection in flash-based solid-state drives
journal, November 2010


Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing
conference, February 2011


Revealing applications' access pattern in collective I/O for cache management
conference, January 2014


Using data clustering to improve cleaning performance for flash memory
journal, March 1999


Exploiting Internal Parallelism of Flash-based SSDs
journal, January 2010


Multi-Channel Architecture-Based FTL for Reliable and High-Performance SSD
journal, December 2014


Cleaning policies in mobile computers using flash memory
journal, November 1999


Write amplification analysis in flash-based solid state drives
conference, January 2009


Parallel-DFTL: A Flash Translation Layer That Exploits Internal Parallelism in Solid State Drives
conference, August 2016


Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems
conference, November 2014


PUD-LRU: An Erase-Efficient Write Buffer Management Algorithm for Flash Memory SSD
conference, August 2010

  • Hu, Jian; Jiang, Hong; Tian, Lei
  • 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)
  • https://doi.org/10.1109/MASCOTS.2010.16

Locality-driven high-level I/O aggregation for processing scientific datasets
conference, October 2013


ADAPT: Efficient workload-sensitive flash management based on adaptation, prediction and aggregation
conference, April 2012


A mean field model for a class of garbage collection algorithms in flash-based solid state drives
journal, June 2013