Performing a local reduction operation on a parallel computer

Blocksome, Michael A.; Faraj, Daniel A.

Patent · December 11, 2012

OSTI ID: 1082353

A parallel computer includes compute nodes, each of which includes two reduction processing cores, a network write processing core, and a network read processing core, with each processing core assigned an input buffer. A local reduction operation on such a compute node includes: copying, in interleaved chunks by the reduction processing cores, the contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, the contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, the contents of the network read processing core's input buffer to shared memory; and locally reducing, in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
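
The steps in the abstract can be illustrated with a small sequential simulation. This is only a sketch of one possible reading of the abstract, not the patented implementation: the function and variable names, the chunk size, the default addition operator, and in particular the way the work is split between the two reduction cores (even-numbered chunks to one core, odd-numbered chunks to the other) are assumptions made for illustration. Real cores would run concurrently; here they are simulated one after the other.

```python
from operator import add


def chunk_spans(n, size):
    """Return (start, stop) index pairs covering range(n) in steps of size."""
    return [(s, min(s + size, n)) for s in range(0, n, size)]


def local_reduce(red0, red1, net_write, net_read, chunk=2, op=add):
    """Simulate the local reduction for one compute node (hypothetical sketch).

    red0, red1  -- input buffers of the two reduction processing cores
    net_write   -- input buffer of the network write processing core
    net_read    -- input buffer of the network read processing core
    """
    n = len(red0)
    assert len(red1) == len(net_write) == len(net_read) == n
    spans = chunk_spans(n, chunk)

    # Step 1: both reduction cores copy their input buffers into one shared
    # buffer in interleaved chunks: [red0 c0, red1 c0, red0 c1, red1 c1, ...].
    interleaved = []
    for start, stop in spans:
        interleaved.append(red0[start:stop])  # slot 2k, written by core 0
        interleaved.append(red1[start:stop])  # slot 2k + 1, written by core 1

    # Step 2: one reduction core copies the network write core's buffer.
    shared_write = list(net_write)
    # Step 3: another reduction core copies the network read core's buffer.
    shared_read = list(net_read)

    # Step 4: each reduction core combines its own input buffer, the peer's
    # chunks in the interleaved buffer (every other slot), and both copies.
    # ASSUMPTION: core 0 handles even chunk indices and core 1 odd ones.
    inputs = (red0, red1)
    result = [None] * n
    for core in (0, 1):
        peer = 1 - core
        for k, (start, stop) in enumerate(spans):
            if k % 2 != core:
                continue
            peer_chunk = interleaved[2 * k + peer]
            for i in range(start, stop):
                acc = op(inputs[core][i], peer_chunk[i - start])
                acc = op(acc, shared_write[i])
                result[i] = op(acc, shared_read[i])
    return result
```

For any associative and commutative operator, the result should match an element-wise reduction across all four input buffers, regardless of the chunk size chosen.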

Research Organization: International Business Machines Corporation, Armonk, NY (United States)

Sponsoring Organization: USDOE

Assignee: International Business Machines Corporation (Armonk, NY)

Patent Number(s): 8,332,460

Application Number: 12/760,020

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING