Performing a local reduction operation on a parallel computer
Abstract
A local reduction operation is performed on a parallel computer that includes compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, with each processing core assigned an input buffer. The operation includes: copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing, in parallel by the reduction processing cores, the contents of the reduction processing core's input buffer, every other interleaved chunk of the interleaved buffer, the copied contents of the network write processing core's input buffer, and the copied contents of the network read processing core's input buffer.
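The steps in the abstract can be illustrated with a minimal, sequential Python sketch. It is not the patented implementation. It rests on several assumptions that the record does not state: the reduction operator is element-wise addition, all four input buffers have the same length, even-numbered chunks of the interleaved buffer come from reduction core 0 and odd-numbered chunks from reduction core 1, and each reduction core reduces the chunks that the other core contributed. The names `local_reduce`, `chunk_ranges`, and `CHUNK` are hypothetical, and the two cores are simulated in a single loop rather than run in parallel.

```python
from typing import Iterator, List, Tuple

CHUNK = 2  # hypothetical chunk size, in elements


def chunk_ranges(n: int, size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (chunk_index, start, stop) triples that cover range(n)."""
    for k, start in enumerate(range(0, n, size)):
        yield k, start, min(start + size, n)


def local_reduce(r0: List[int], r1: List[int],
                 nw: List[int], nr: List[int],
                 chunk: int = CHUNK) -> List[int]:
    """Simulate the local reduction for one compute node.

    r0, r1: input buffers of the two reduction cores
    nw, nr: input buffers of the network write and network read cores
    """
    n = len(r0)
    assert len(r1) == len(nw) == len(nr) == n

    # Step 1: both reduction cores copy into one interleaved buffer.
    # Assumption: even chunks come from core 0, odd chunks from core 1.
    interleaved = [0] * n
    for k, a, b in chunk_ranges(n, chunk):
        src = r0 if k % 2 == 0 else r1
        interleaved[a:b] = src[a:b]

    # Step 2: one reduction core copies the network write core's buffer.
    nw_copy = list(nw)
    # Step 3: the other reduction core copies the network read core's buffer.
    nr_copy = list(nr)

    # Step 4: each reduction core reduces every other chunk. The core that
    # did NOT write chunk k combines its own input with the interleaved
    # chunk and the two copied buffers (assumed operator: addition).
    result = [0] * n
    for k, a, b in chunk_ranges(n, chunk):
        own = r1 if k % 2 == 0 else r0
        for i in range(a, b):
            result[i] = own[i] + interleaved[i] + nw_copy[i] + nr_copy[i]
    return result
```

Under these assumptions every output element combines exactly one value from each of the four input buffers, so the result matches a plain element-wise sum while each simulated core touches only half of the chunks.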
- Inventors:
- Blocksome, Michael A; Faraj, Daniel A
- Issue Date:
- June 2013
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1084349
- Patent Number(s):
- 8458244
- Application Number:
- 13/585,993
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Blocksome, Michael A, and Faraj, Daniel A. Performing a local reduction operation on a parallel computer. United States: N. p., 2013. Web.
Blocksome, Michael A, & Faraj, Daniel A. Performing a local reduction operation on a parallel computer. United States.
Blocksome, Michael A, and Faraj, Daniel A. 2013. "Performing a local reduction operation on a parallel computer". United States. https://www.osti.gov/servlets/purl/1084349.
@article{osti_1084349,
title = {Performing a local reduction operation on a parallel computer},
author = {Blocksome, Michael A and Faraj, Daniel A},
abstractNote = {A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2013},
month = {6}
}