Data shuffling with hierarchical tuple spaces
Abstract
Methods and systems for shuffling data are described. A processor may generate pair data from source data. The processor may insert the pair data into local tuple spaces. In response to a request for a particular key, the processor may determine a presence of the requested key in a global tuple space. The processor may, in response to a presence of the requested key in the global tuple space, update the global tuple space. The update may be based on the pair data among the local tuple spaces including the existing key. The processor may, in response to an absence of the requested key in the global tuple space, insert pair data including the missing key from the local tuple spaces into the global tuple space. The processor may fetch the requested pair data, and may shuffle the fetched data to generate a dataset.
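The flow in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation. The class `HierarchicalTupleSpace`, the helpers `generate_pairs` and `shuffle`, and the word-count style `(key, 1)` pairs are hypothetical names and choices made for illustration. Draining local entries once they are promoted to the global space is also an assumption, made here so that repeated requests do not count the same pair twice.

```python
from collections import defaultdict


class HierarchicalTupleSpace:
    """Toy two-level store: several local tuple spaces plus one global space."""

    def __init__(self, num_local):
        # One key -> list-of-values mapping per local tuple space.
        self.local_spaces = [defaultdict(list) for _ in range(num_local)]
        # The global tuple space is filled lazily, when a key is requested.
        self.global_space = {}

    def insert_local(self, space_id, key, value):
        """Insert one (key, value) pair into a local tuple space."""
        self.local_spaces[space_id][key].append(value)

    def _collect(self, key):
        """Gather and remove all values for `key` from the local spaces.

        Removal is an assumption of this sketch, not a detail from the record.
        """
        values = []
        for space in self.local_spaces:
            values.extend(space.pop(key, []))
        return values

    def fetch(self, key):
        """Serve a request for `key`, updating or populating the global space."""
        pending = self._collect(key)
        if key in self.global_space:
            # Key present globally: update it with the local pair data.
            self.global_space[key].extend(pending)
        elif pending:
            # Key absent globally: insert the local pair data.
            self.global_space[key] = pending
        return list(self.global_space.get(key, []))


def generate_pairs(records):
    """Generate pair data from source data (hypothetical word-count pairs)."""
    return [(record, 1) for record in records]


def shuffle(spaces, keys):
    """Fetch each requested key and group its values into one dataset."""
    return {key: spaces.fetch(key) for key in keys}


# Two source partitions, each feeding its own local tuple space.
source = [["a", "b", "a"], ["b", "c"]]
hts = HierarchicalTupleSpace(num_local=2)
for space_id, partition in enumerate(source):
    for key, value in generate_pairs(partition):
        hts.insert_local(space_id, key, value)

dataset = shuffle(hts, ["a", "b", "c"])
print(dataset)
```

In this sketch the first request for a key takes the insert path, and any later request for the same key takes the update path, merging whatever pairs have arrived in the local spaces since then.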
- Inventors:
- Andrade Costa, Carlos Henrique; Kayi, Abdullah; Park, Yoonho; Johns, Charles Ray
- Issue Date:
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1805660
- Patent Number(s):
- 10956125
- Application Number:
- 15/851,480
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- AC02-05CH11231
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 12/21/2017
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Andrade Costa, Carlos Henrique, Kayi, Abdullah, Park, Yoonho, and Johns, Charles Ray. Data shuffling with hierarchical tuple spaces. United States: N. p., 2021. Web.
Andrade Costa, Carlos Henrique, Kayi, Abdullah, Park, Yoonho, & Johns, Charles Ray. (2021). Data shuffling with hierarchical tuple spaces. United States.
Andrade Costa, Carlos Henrique, Kayi, Abdullah, Park, Yoonho, and Johns, Charles Ray. 2021. "Data shuffling with hierarchical tuple spaces". United States. https://www.osti.gov/servlets/purl/1805660.
@article{osti_1805660,
title = {Data shuffling with hierarchical tuple spaces},
author = {Andrade Costa, Carlos Henrique and Kayi, Abdullah and Park, Yoonho and Johns, Charles Ray},
abstractNote = {Methods and systems for shuffling data are described. A processor may generate pair data from source data. The processor may insert the pair data into local tuple spaces. In response to a request for a particular key, the processor may determine a presence of the requested key in a global tuple space. The processor may, in response to a presence of the requested key in the global tuple space, update the global tuple space. The update may be based on the pair data among the local tuple spaces including the existing key. The processor may, in response to an absence of the requested key in the global tuple space, insert pair data including the missing key from the local tuple spaces into the global tuple space. The processor may fetch the requested pair data, and may shuffle the fetched data to generate a dataset.},
place = {United States},
year = {2021},
month = {3}
}