Performing an allreduce operation using shared memory
Abstract
Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
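The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify data structures or a reduction operator, so everything named here is a hypothetical stand-in: `JobStatus`, `establish_job`, `claim_next`, and `shared_memory_allreduce` are illustrative names, Python threads play the role of processing cores, a plain list plays the role of the shared memory region, and the reduction is assumed to be an element-wise sum.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class JobStatus:
    """Hypothetical job status object: lists the work units and tracks the next unclaimed one."""
    work_units: list                      # each unit is a (start, stop) index range
    next_unit: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)

    def claim_next(self):
        """Return the next unclaimed work unit, or None when all have been claimed."""
        with self.lock:
            if self.next_unit >= len(self.work_units):
                return None
            unit = self.work_units[self.next_unit]
            self.next_unit += 1
            return unit


def establish_job(num_elements, chunk):
    """Split the element range into work units that together cover the whole reduction."""
    units = [(s, min(s + chunk, num_elements)) for s in range(0, num_elements, chunk)]
    return JobStatus(work_units=units)


def worker(job, contributions, result):
    """An available 'core' repeatedly determines and performs the next work unit."""
    while (unit := job.claim_next()) is not None:
        start, stop = unit
        for i in range(start, stop):
            # Assumed reduction operator: element-wise sum over all contributions.
            result[i] = sum(buf[i] for buf in contributions)


def shared_memory_allreduce(contributions, num_cores=4, chunk=2):
    """Reduce equal-length, non-empty contribution buffers into one shared result buffer."""
    n = len(contributions[0])
    result = [0] * n                      # stands in for memory visible to every core
    job = establish_job(n, chunk)
    threads = [
        threading.Thread(target=worker, args=(job, contributions, result))
        for _ in range(num_cores)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Because each work unit covers a disjoint index range and is claimed under a lock, no two workers write the same element. Every participant then reads the same `result` buffer, which is what makes this an allreduce rather than a plain reduce in this sketch.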
- Inventors:
- Archer, Charles J; Dozsa, Gabor; Ratterman, Joseph D; Smith, Brian E
- Issue Date:
- Research Org.:
- International Business Machines Corporation, Armonk, NY (USA)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1134015
- Patent Number(s):
- 8752051
- Application Number:
- 13/427,057
- Assignee:
- International Business Machines Corporation
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Archer, Charles J, Dozsa, Gabor, Ratterman, Joseph D, and Smith, Brian E. Performing an allreduce operation using shared memory. United States: N. p., 2014. Web.
Archer, Charles J, Dozsa, Gabor, Ratterman, Joseph D, & Smith, Brian E. Performing an allreduce operation using shared memory. United States.
Archer, Charles J, Dozsa, Gabor, Ratterman, Joseph D, and Smith, Brian E. 2014. "Performing an allreduce operation using shared memory". United States. https://www.osti.gov/servlets/purl/1134015.
@article{osti_1134015,
title = {Performing an allreduce operation using shared memory},
author = {Archer, Charles J and Dozsa, Gabor and Ratterman, Joseph D and Smith, Brian E},
abstractNote = {Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.},
place = {United States},
year = {2014},
month = {6}
}