Ultrascalable petaflop parallel supercomputer
Abstract
A massively parallel supercomputer of petaOPS scale includes node architectures based upon System-on-a-Chip technology, where each processing node comprises a single Application-Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that maximize the throughput of packet communications between nodes while minimizing latency. These networks may include three high-speed networks for parallel-algorithm message passing: a torus network, a collective network, and a global asynchronous network that provides global barrier and notification functions. The networks may be used collaboratively or independently, according to the needs or phases of an algorithm, to optimize processing performance. A DMA engine facilitates message passing among the nodes without expending processing resources at the node.
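To make the architecture concrete, the sketch below models the two ideas the abstract highlights: an algorithm phase picks one of several independent networks (torus for point-to-point traffic, collective for reductions and broadcasts, a global asynchronous network for barriers and notifications), and a per-node DMA engine drains a queue of injection descriptors so the processor does not spend cycles copying message data. This is a minimal toy model written for illustration, not the patented hardware or its programming interface; every name in it (network_t, dma_descriptor_t, dma_post, dma_drain) is hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

#define INJ_FIFO_DEPTH 8

/* Which independent network a message should travel on. */
typedef enum { NET_TORUS, NET_COLLECTIVE, NET_GLOBAL_BARRIER } network_t;

/* A DMA injection descriptor: where the payload lives and where it goes. */
typedef struct {
    network_t   network;    /* target network for this message */
    int         dest_node;  /* destination node (real hardware would use torus coordinates) */
    const void *payload;    /* source buffer, read directly by the DMA */
    size_t      length;     /* bytes to move */
} dma_descriptor_t;

/* A node's DMA injection FIFO: the CPU enqueues descriptors and the DMA
 * engine drains them asynchronously, so no CPU cycles go into copying. */
typedef struct {
    dma_descriptor_t fifo[INJ_FIFO_DEPTH];
    int head, tail;
} dma_engine_t;

/* CPU side: post a send and return immediately. */
static int dma_post(dma_engine_t *dma, const dma_descriptor_t *d) {
    int next = (dma->tail + 1) % INJ_FIFO_DEPTH;
    if (next == dma->head)
        return -1;                      /* injection FIFO is full */
    dma->fifo[dma->tail] = *d;
    dma->tail = next;
    return 0;
}

/* DMA side: drain the FIFO; printing stands in for the hardware's work. */
static void dma_drain(dma_engine_t *dma) {
    static const char *net_name[] = { "torus", "collective", "global-barrier" };
    while (dma->head != dma->tail) {
        const dma_descriptor_t *d = &dma->fifo[dma->head];
        printf("DMA: %zu bytes -> node %d over %s network\n",
               d->length, d->dest_node, net_name[d->network]);
        dma->head = (dma->head + 1) % INJ_FIFO_DEPTH;
    }
}

int main(void) {
    dma_engine_t dma = { .head = 0, .tail = 0 };
    const char halo_block[] = "halo exchange block";

    /* Phase 1: point-to-point neighbour exchange rides the torus network. */
    dma_descriptor_t halo = { NET_TORUS, 42, halo_block, sizeof halo_block };
    if (dma_post(&dma, &halo) != 0)
        fprintf(stderr, "injection FIFO full\n");

    /* Phase 2: a zero-payload notification on the global asynchronous
     * network marks the end of the phase; the collective network would
     * carry reductions and broadcasts the same way. */
    dma_descriptor_t barrier = { NET_GLOBAL_BARRIER, -1, NULL, 0 };
    if (dma_post(&dma, &barrier) != 0)
        fprintf(stderr, "injection FIFO full\n");

    dma_drain(&dma);    /* stands in for the asynchronous DMA hardware */
    return 0;
}
```

The design point the toy mirrors is that the processor's per-message cost is only filling in and enqueuing a small descriptor; moving the payload and driving the chosen network is left to the (here simulated) DMA engine.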
- Inventors:
- Blumrich, Matthias A; Chen, Dong; Chiu, George; Cipolla, Thomas M; Coteus, Paul W; Gara, Alan G; Giampapa, Mark E; Hall, Shawn; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd
- Inventor locations: Ridgefield, CT; Croton On Hudson, NY; Cross River, NY; Katonah, NY; Yorktown Heights, NY; Mount Kisco, NY; Irvington, NY; Pleasantville, NY; Cortlandt Manor, NY; Chappaqua, NY; Mahopac, NY; Brewster, NY
- Issue Date:
- July 2010
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 993067
- Patent Number(s):
- 7761687
- Application Number:
- 11/768,905
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS > G06 - COMPUTING > G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States: N. p., 2010. Web.
Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, & Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States.
Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. 2010. "Ultrascalable petaflop parallel supercomputer". United States. https://www.osti.gov/servlets/purl/993067.
@article{osti_993067,
  title = {Ultrascalable petaflop parallel supercomputer},
  author = {Blumrich, Matthias A and Chen, Dong and Chiu, George and Cipolla, Thomas M and Coteus, Paul W and Gara, Alan G and Giampapa, Mark E and Hall, Shawn and Haring, Rudolf A and Heidelberger, Philip and Kopcsay, Gerard V and Ohmacht, Martin and Salapura, Valentina and Sugavanam, Krishnan and Takken, Todd},
  abstractNote = {A massively parallel supercomputer of petaOPS scale includes node architectures based upon System-on-a-Chip technology, where each processing node comprises a single Application-Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that maximize the throughput of packet communications between nodes while minimizing latency. These networks may include three high-speed networks for parallel-algorithm message passing: a torus network, a collective network, and a global asynchronous network that provides global barrier and notification functions. The networks may be used collaboratively or independently, according to the needs or phases of an algorithm, to optimize processing performance. A DMA engine facilitates message passing among the nodes without expending processing resources at the node.},
  place = {United States},
  year = {2010},
  month = {7}
}
Works referenced in this record:
- Pande, P. P.; Grecu, C.; Jones, M. "Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures." IEEE Transactions on Computers, Vol. 54, Issue 8 (August 2005).