DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Ultrascalable petaflop parallel supercomputer

Abstract

A massively parallel supercomputer of petaOPS scale includes node architectures based on System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing: a Torus network, a collective network, and a Global Asynchronous network that provides global barrier and notification functions. These independent networks may be used collaboratively or independently, according to the needs or phases of an algorithm, to optimize processing performance. A DMA engine is provided to facilitate message passing among the nodes without expending processing resources at the node.
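
As a rough illustration of the torus interconnect named in the abstract, the sketch below computes a node's nearest neighbors and the minimum hop count between two nodes on a torus with wraparound links. It is not taken from the patent: the function names `torus_neighbors` and `torus_hops`, the coordinate tuples, and the three-dimensional 4 x 4 x 4 size used in the usage lines are all assumptions chosen for illustration, since the abstract does not state the torus dimensionality or size.

```python
def torus_neighbors(coord, dims):
    """Return the nearest neighbors of `coord` on a torus with side lengths `dims`.

    Each axis is a ring, so stepping past either end wraps around (modulo size).
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size
            neighbors.append(tuple(nxt))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum hop count from a to b, taking the shorter way around each ring."""
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


# Hypothetical 4 x 4 x 4 torus (size is an assumption, not from the record).
dims = (4, 4, 4)
corner_neighbors = torus_neighbors((0, 0, 0), dims)
hops = torus_hops((0, 0, 0), (3, 2, 1), dims)
```

In this sketch the corner node (0, 0, 0) still has six distinct neighbors because of the wraparound links, which is the property that gives a torus uniform connectivity at every node.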

Inventors:
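 Blumrich, Matthias A [1]; Chen, Dong [2]; Chiu, George [3]; Cipolla, Thomas M [4]; Coteus, Paul W [5]; Gara, Alan G [6]; Giampapa, Mark E [7]; Hall, Shawn [8]; Haring, Rudolf A [9]; Heidelberger, Philip [9]; Kopcsay, Gerard V [5]; Ohmacht, Martin [5]; Salapura, Valentina [10]; Sugavanam, Krishnan [11]; Takken, Todd [12]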
  1. Ridgefield, CT
  2. Croton On Hudson, NY
  3. Cross River, NY
  4. Katonah, NY
  5. Yorktown Heights, NY
  6. Mount Kisco, NY
  7. Irvington, NY
  8. Pleasantville, NY
  9. Cortlandt Manor, NY
  10. Chappaqua, NY
  11. Mahopac, NY
  12. Brewster, NY
Issue Date:
2010-07-20
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
993067
Patent Number(s):
7761687
Application Number:
11/768,905
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B554331
Resource Type:
Patent
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States: N. p., 2010. Web.
Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, & Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States.
Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. 2010. "Ultrascalable petaflop parallel supercomputer". United States. https://www.osti.gov/servlets/purl/993067.
@article{osti_993067,
title = {Ultrascalable petaflop parallel supercomputer},
author = {Blumrich, Matthias A and Chen, Dong and Chiu, George and Cipolla, Thomas M and Coteus, Paul W and Gara, Alan G and Giampapa, Mark E and Hall, Shawn and Haring, Rudolf A and Heidelberger, Philip and Kopcsay, Gerard V and Ohmacht, Martin and Salapura, Valentina and Sugavanam, Krishnan and Takken, Todd},
abstractNote = {A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.},
doi = {},
journal = {},
place = {United States},
year = {2010},
month = {7}
}
