Parallel sort with a ranged, partitioned key-value store in a high performance computing environment
Abstract
Improved sorting techniques are provided that perform a parallel sort using a ranged, partitioned key-value store in a high performance computing (HPC) environment. A plurality of input data files comprising unsorted key-value data in a partitioned key-value store are sorted. The partitioned key-value store comprises a range server for each of a plurality of ranges. Each input data file has an associated reader thread. Each reader thread reads the unsorted key-value data in the corresponding input data file and performs a local sort of the unsorted key-value data to generate sorted key-value data. A plurality of sorted, ranged subsets of each of the sorted key-value data are generated based on the plurality of ranges. Each sorted, ranged subset corresponds to a given one of the ranges and is provided to one of the range servers corresponding to the range of the sorted, ranged subset. Each range server sorts the received sorted, ranged subsets and provides a sorted range. A plurality of the sorted ranges are concatenated to obtain a globally sorted result.
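The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative, single-process approximation and not the patented implementation: in-memory lists stand in for the input data files, pool threads stand in for the reader threads and range servers, and the names `local_sort_and_split`, `range_server_sort`, `parallel_sort`, and `boundaries` are hypothetical. It also assumes each range server can merge its already-sorted subsets rather than re-sorting them from scratch.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def key_of(record):
    """Records are (key, value) pairs; ordering uses the key only."""
    return record[0]


def local_sort_and_split(records, boundaries):
    """Reader step: locally sort one file's records, then cut them into
    one sorted subset per range. Range i holds keys in
    [boundaries[i-1], boundaries[i]); boundaries must be ascending."""
    ordered = sorted(records, key=key_of)
    keys = [key_of(r) for r in ordered]
    cuts = [0] + [bisect_left(keys, b) for b in boundaries] + [len(ordered)]
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(len(cuts) - 1)]


def range_server_sort(subsets):
    """Range-server step (assumed here to be a k-way merge of the
    already-sorted subsets received for a single range)."""
    return list(merge(*subsets, key=key_of))


def parallel_sort(input_files, boundaries):
    """Sort all files: local sort and split, per-range merge, concatenate."""
    num_ranges = len(boundaries) + 1
    with ThreadPoolExecutor() as pool:
        # One task per input file, mirroring one reader thread per file.
        split = list(pool.map(
            lambda recs: local_sort_and_split(recs, boundaries), input_files))
        # Route subset r of every file to range server r.
        per_range = [[subsets[r] for subsets in split]
                     for r in range(num_ranges)]
        sorted_ranges = list(pool.map(range_server_sort, per_range))
    # Ranges are ordered, so concatenation yields a globally sorted result.
    return [record for rng in sorted_ranges for record in rng]


if __name__ == "__main__":
    files = [[(5, "e"), (1, "a"), (9, "i")], [(3, "c"), (7, "g"), (2, "b")]]
    print(parallel_sort(files, boundaries=[4, 8]))
```

Because every key in range i is smaller than every key in range i+1, the final concatenation needs no further comparison across ranges.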
- Inventors:
- Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron; Poole, Stephen W.
- Issue Date:
- 2016 Jan 26
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1262640
- Patent Number(s):
- 9245048
- Application Number:
- 14/143,771
- Assignee:
- EMC Corporation (Hopkinton, MA); Los Alamos National Security, LLC (Los Alamos, NM)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- AC52-06NA25396
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2013 Dec 30
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Bent, John M., Faibish, Sorin, Grider, Gary, Torres, Aaron, and Poole, Stephen W. Parallel sort with a ranged, partitioned key-value store in a high performance computing environment. United States: N. p., 2016. Web.
Bent, John M., Faibish, Sorin, Grider, Gary, Torres, Aaron, & Poole, Stephen W. Parallel sort with a ranged, partitioned key-value store in a high performance computing environment. United States.
Bent, John M., Faibish, Sorin, Grider, Gary, Torres, Aaron, and Poole, Stephen W. 2016. "Parallel sort with a ranged, partitioned key-value store in a high performance computing environment". United States. https://www.osti.gov/servlets/purl/1262640.
@article{osti_1262640,
title = {Parallel sort with a ranged, partitioned key-value store in a high performance computing environment},
author = {Bent, John M. and Faibish, Sorin and Grider, Gary and Torres, Aaron and Poole, Stephen W.},
abstractNote = {Improved sorting techniques are provided that perform a parallel sort using a ranged, partitioned key-value store in a high performance computing (HPC) environment. A plurality of input data files comprising unsorted key-value data in a partitioned key-value store are sorted. The partitioned key-value store comprises a range server for each of a plurality of ranges. Each input data file has an associated reader thread. Each reader thread reads the unsorted key-value data in the corresponding input data file and performs a local sort of the unsorted key-value data to generate sorted key-value data. A plurality of sorted, ranged subsets of each of the sorted key-value data are generated based on the plurality of ranges. Each sorted, ranged subset corresponds to a given one of the ranges and is provided to one of the range servers corresponding to the range of the sorted, ranged subset. Each range server sorts the received sorted, ranged subsets and provides a sorted range. A plurality of the sorted ranges are concatenated to obtain a globally sorted result.},
place = {United States},
year = {2016},
month = {1},
day = {26}
}