Enabling Scalable and Extensible Memory-mapped Datastores in Userspace
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)
Exascale workloads are expected to incorporate data-intensive processing in close coordination with traditional physics simulations. These emerging scientific, data-analytics and machine learning applications need to access a wide variety of datastores in flat files and structured databases. Programmer productivity is greatly enhanced by mapping datastores into the application process's virtual memory space to provide a unified “in-memory” interface. Currently, memory mapping is provided by system software primarily designed for generality and reliability. However, scalability at high concurrency is a formidable challenge on exascale systems. Also, there is a need for extensibility to support new datastores potentially requiring HPC data transfer services. In this article, we present UMap, a scalable and extensible userspace service for memory-mapping datastores. Furthermore, through decoupled queue management, concurrency aware adaptation, and dynamic load balancing, UMap enables application performance to scale even at high concurrency. We evaluate UMap in data-intensive applications, including sorting, graph traversal, database operations, and metagenomic analytics. Our results show that UMap as a userspace service outperforms an optimized kernel-based service across a wide range of intra-node concurrency by 1.22-1.9×. We performed two case studies to demonstrate UMap's extensibility. First, a new datastore residing in remote memory is incorporated into UMap as an application-specific plugin. Second, we present a persistent memory allocator Metall built atop UMap for unified storage/memory.
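To make the "in-memory" interface mentioned in the abstract concrete, the sketch below maps a flat file of integers into the process's address space and sorts it in place through ordinary indexing. It uses Python's standard `mmap` module, that is, the kernel-based mapping path the abstract contrasts with, and does not use or represent UMap's own API. The helper names `sort_file_in_place` and `demo` and the file layout (native 64-bit signed integers) are assumptions made for illustration only.

```python
import mmap
import os
import struct
import tempfile


def sort_file_in_place(path):
    """Sort a flat file of native 64-bit signed integers via a memory mapping.

    Illustrative only: this relies on the operating system's mmap service,
    not on UMap. The file length is assumed to be a non-zero multiple of 8.
    """
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping is shared and writable.
        with mmap.mmap(f.fileno(), 0) as mapped:
            # Reinterpret the mapped bytes as an array of int64 values.
            with memoryview(mapped) as raw, raw.cast("q") as view:
                values = sorted(view)
                for i, v in enumerate(values):
                    view[i] = v  # plain indexing writes through to the mapping
            mapped.flush()  # ask the OS to write dirty pages back to the file


def demo():
    """Create a small temporary datastore, sort it, and return its contents."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("5q", 42, 7, 19, 3, 28))
        sort_file_in_place(path)
        with open(path, "rb") as f:
            return list(struct.unpack("5q", f.read()))
    finally:
        os.remove(path)
```

The application code never issues explicit reads or writes: page faults on the mapped region trigger the data movement, which is the step a userspace service such as the one described would handle instead of the kernel.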
- Research Organization:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC52-07NA27344
- OSTI ID:
- 1829975
- Report Number(s):
- LLNL-JRNL-819817; 1030958
- Journal Information:
- IEEE Transactions on Parallel and Distributed Systems, Vol. 33, Issue 4; ISSN 1045-9219
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Exploiting Internal Parallelism for Address Translation in Solid-State Drives
Characteristics of workload on ASCI Blue-Pacific at Lawrence Livermore National Laboratory