Exploiting Internal Parallelism for Address Translation in Solid-State Drives
Abstract
Solid-state drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism (parallel access to multiple internal flash memory chips) and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. To address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL's efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. Finally, we also found that Parallel-DFTL is able to achieve reasonable performance with a very small cache size and that it provides the best benefit for workloads with large request sizes or high write ratios.
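The decoupling idea described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' implementation or their performance model: every flash operation is assumed to take one time unit, each request touches one data page and (on a cache miss) one mapping page, and the decoupled variant simply drains a translation queue and then a data queue, with different chips working concurrently. The names `Request`, `coupled_time`, `queue_time`, and `decoupled_time` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One page-sized I/O in this toy model (all fields are hypothetical)."""
    data_chip: int    # chip that holds the data page
    map_chip: int     # chip that holds the mapping (translation) page
    cache_hit: bool   # True if the mapping is already cached in RAM


def coupled_time(requests, op_time=1):
    """Assumed coupled behavior: each request performs its translation read
    and then its data access, and requests are served one after another."""
    total = 0
    for r in requests:
        if not r.cache_hit:
            total += op_time  # flash read of the mapping page
        total += op_time      # flash access for the data page
    return total


def queue_time(chips, op_time=1):
    """Time to drain one queue when different chips work concurrently:
    the busiest chip determines the completion time."""
    load = Counter(chips)
    return max(load.values(), default=0) * op_time


def decoupled_time(requests, op_time=1):
    """Assumed decoupled behavior: translation and data operations go to
    separate queues, and each queue is spread across the chips."""
    translation_queue = [r.map_chip for r in requests if not r.cache_hit]
    data_queue = [r.data_chip for r in requests]
    return queue_time(translation_queue, op_time) + queue_time(data_queue, op_time)


# Four cache-missing requests spread over four chips.
requests = [Request(data_chip=i, map_chip=(i + 1) % 4, cache_hit=False)
            for i in range(4)]
print(coupled_time(requests))    # 8
print(decoupled_time(requests))  # 2
```

In this simplified setting the gap between the two numbers comes entirely from letting independent chips serve translation reads (and then data accesses) at the same time; real devices add channel contention, write-back of dirty mappings, and garbage collection, none of which are modeled here.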
- Authors: Xie, Wei; Chen, Yong; Roth, Philip C.
- Author Affiliations: Texas Tech Univ., Lubbock, TX (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Publication Date: 2018-12-15
- Research Org.: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sponsoring Org.: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- OSTI Identifier: 1490593
- Grant/Contract Number: AC05-00OR22725
- Resource Type: Accepted Manuscript
- Journal Name: ACM Transactions on Storage
- Additional Journal Information: Journal Volume: 14; Journal Issue: 4; Journal ID: ISSN 1553-3077
- Publisher: Association for Computing Machinery (ACM)
- Country of Publication: United States
- Language: English
- Subject: 97 MATHEMATICS AND COMPUTING; Flash translation layer; SSD; parallelism; DFTL; address translation
Citation Formats
Xie, Wei, Chen, Yong, and Roth, Philip C. Exploiting Internal Parallelism for Address Translation in Solid-State Drives. United States: N. p., 2018. Web. doi:10.1145/3239564.
Xie, Wei, Chen, Yong, & Roth, Philip C. Exploiting Internal Parallelism for Address Translation in Solid-State Drives. United States. https://doi.org/10.1145/3239564
Xie, Wei, Chen, Yong, and Roth, Philip C. 2018. "Exploiting Internal Parallelism for Address Translation in Solid-State Drives". United States. https://doi.org/10.1145/3239564. https://www.osti.gov/servlets/purl/1490593.
@article{osti_1490593,
title = {Exploiting Internal Parallelism for Address Translation in Solid-State Drives},
author = {Xie, Wei and Chen, Yong and Roth, Philip C.},
abstractNote = {Solid-state Drives (SSDs) have changed the landscape of storage systems and present a promising storage solution for data-intensive applications due to their low latency, high bandwidth, and low power consumption compared to traditional hard disk drives. SSDs achieve these desirable characteristics using internal parallelism—parallel access to multiple internal flash memory chips—and a Flash Translation Layer (FTL) that determines where data are stored on those chips so that they do not wear out prematurely. However, current state-of-the-art cache-based FTLs like the Demand-based Flash Translation Layer (DFTL) do not allow IO schedulers to take full advantage of internal parallelism, because they impose a tight coupling between the logical-to-physical address translation and the data access. In this study to address this limitation, we introduce a new FTL design called Parallel-DFTL that works with the DFTL to decouple address translation operations from data accesses. Parallel-DFTL separates address translation and data access operations into different queues, allowing the SSD to use concurrent flash accesses for both types of operations. We also present a Parallel-LRU cache replacement algorithm to improve the concurrency of address translation operations. To compare Parallel-DFTL against existing FTL approaches, we present a Parallel-DFTL performance model and compare its predictions against those for DFTL and an ideal page-mapping approach. We also implemented the Parallel-DFTL approach in an SSD simulator using real device parameters, and used trace-driven simulation to evaluate Parallel-DFTL’s efficacy. Our evaluation results show that Parallel-DFTL improved the overall performance by up to 32% for the real IO workloads we tested, and by up to two orders of magnitude with synthetic test workloads. 
Finally, we also found that Parallel-DFTL is able to achieve reasonable performance with a very small cache size and that it provides the best benefit for those workloads with large request size or with high write ratio.},
doi = {10.1145/3239564},
journal = {ACM Transactions on Storage},
number = 4,
volume = 14,
place = {United States},
year = {2018},
month = {12}
}