Data shuffling with hierarchical tuple spaces
Abstract
Methods and systems for shuffling data to generate a dataset are described. A first map module may generate first pair data, and a second map module may generate second pair data, from source data. The first map module may insert the first pair data into a first local tuple space accessible to the first map module. The second map module may insert the second pair data into a second local tuple space accessible to the second map module. A shuffle module may request pair data that includes a particular key. The first and second pair data may be inserted into a global tuple space accessible by the first and second map modules. The shuffle module may identify the requested pair data in the global tuple space, and may fetch the identified pair data from a memory. The shuffle module may shuffle the fetched pair data to generate the dataset.
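The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading of the abstract: each map step writes key-value pairs to its own local tuple space, a global tuple space records which local spaces hold each key, and a shuffle step uses that record to fetch and group every value for a requested key. All class and function names (`LocalTupleSpace`, `GlobalTupleSpace`, `map_words`, `shuffle`, `run_example`) are hypothetical, and the word-count input is an assumed example.

```python
from collections import defaultdict


class LocalTupleSpace:
    """Hypothetical per-mapper store of (key, value) pairs."""

    def __init__(self, owner):
        self.owner = owner
        self.pairs = defaultdict(list)

    def insert(self, key, value):
        self.pairs[key].append(value)


class GlobalTupleSpace:
    """Hypothetical shared view: maps each key to the local spaces holding it.

    Assumption: the global space stores locations, and the pair data itself
    is fetched from the memory of the owning local space.
    """

    def __init__(self):
        self.locations = defaultdict(list)

    def publish(self, local):
        # Make every key of a local space visible globally.
        for key in local.pairs:
            self.locations[key].append(local)

    def fetch(self, key):
        # Identify where the key lives, then collect its values.
        values = []
        for local in self.locations.get(key, []):
            values.extend(local.pairs[key])
        return values


def map_words(text, local):
    """Map step: emit (word, 1) pairs into the mapper's local space."""
    for word in text.split():
        local.insert(word, 1)


def shuffle(global_space, key):
    """Shuffle step: gather all values for one requested key."""
    return key, global_space.fetch(key)


def run_example():
    local_a = LocalTupleSpace("map-1")
    local_b = LocalTupleSpace("map-2")
    map_words("a b a", local_a)
    map_words("b c a", local_b)

    global_space = GlobalTupleSpace()
    global_space.publish(local_a)
    global_space.publish(local_b)

    return {key: shuffle(global_space, key)[1] for key in ("a", "b", "c")}
```

Calling `run_example()` groups the values emitted by both mappers under each key, which is the dataset a downstream reduce step would consume in this simplified model.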
- Inventors:
- Kayi, Abdullah; Andrade Costa, Carlos Henrique; Park, Yoonho; Johns, Charles Ray
- Issue Date:
- Research Org.:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1805382
- Patent Number(s):
- 10891274
- Application Number:
- 15/851,511
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- DOE Contract Number:
- AC02-05CH11231
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 12/21/2017
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Kayi, Abdullah, Andrade Costa, Carlos Henrique, Park, Yoonho, and Johns, Charles Ray. Data shuffling with hierarchical tuple spaces. United States: N. p., 2021. Web.
Kayi, Abdullah, Andrade Costa, Carlos Henrique, Park, Yoonho, & Johns, Charles Ray. Data shuffling with hierarchical tuple spaces. United States.
Kayi, Abdullah, Andrade Costa, Carlos Henrique, Park, Yoonho, and Johns, Charles Ray. 2021. "Data shuffling with hierarchical tuple spaces". United States. https://www.osti.gov/servlets/purl/1805382.
@article{osti_1805382,
title = {Data shuffling with hierarchical tuple spaces},
author = {Kayi, Abdullah and Andrade Costa, Carlos Henrique and Park, Yoonho and Johns, Charles Ray},
abstractNote = {Methods and systems for shuffling data to generate a dataset are described. A first map module may generate first pair data, and a second map module may generate second pair data, from source data. The first map module may insert the first pair data into a first local tuple space accessible to the first map module. The second map module may insert the second pair data into a second local tuple space accessible to the second map module. A shuffle module may request pair data that includes a particular key. The first and second pair data may be inserted into a global tuple space accessible by the first and second map modules. The shuffle module may identify the requested pair data in the global tuple space, and may fetch the identified pair data from a memory. The shuffle module may shuffle the fetched pair data to generate the dataset.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2021},
month = {1}
}