Ultrascalable petaflop parallel supercomputer

Blumrich, Matthias A; Chen, Dong; Chiu, George; Cipolla, Thomas M; Coteus, Paul W; Gara, Alan G; Giampapa, Mark E; Hall, Shawn; Haring, Rudolf A; Heidelberger, Philip; Kopcsay, Gerard V; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd


Title: Ultrascalable petaflop parallel supercomputer

Abstract

A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
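As a rough illustration of the torus interconnect named in the abstract, the sketch below computes a node's nearest neighbors and the minimal hop count between two nodes when every dimension wraps around. The three-dimensional 4 x 4 x 4 extents and the function names `torus_neighbors` and `torus_hops` are assumptions made for this example; the abstract does not state the dimensionality or size of the network, nor how routing is performed.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the nearest neighbors of `node` on a torus with extents `dims`.

    Each axis contributes two neighbors (one step down, one step up),
    and the modulo makes the edges wrap around.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            shifted = list(node)
            shifted[axis] = (node[axis] + step) % size
            neighbors.append(tuple(shifted))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimal hop count between `a` and `b`, taking the shorter way around each axis."""
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical extents, chosen only for illustration
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

The wraparound links are what keep the worst-case distance along each axis to half the extent, rather than the full extent as in a plain mesh.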

Inventors:

Blumrich, Matthias A [Ridgefield, CT]; Chen, Dong [Croton On Hudson, NY]; Chiu, George [Cross River, NY]; Cipolla, Thomas M [Katonah, NY]; Coteus, Paul W [Yorktown Heights, NY]; Gara, Alan G [Mount Kisco, NY]; Giampapa, Mark E [Irvington, NY]; Hall, Shawn [Pleasantville, NY]; Haring, Rudolf A [Cortlandt Manor, NY]; Heidelberger, Philip [Cortlandt Manor, NY]; Kopcsay, Gerard V [Yorktown Heights, NY]; Ohmacht, Martin [Yorktown Heights, NY]; Salapura, Valentina [Chappaqua, NY]; Sugavanam, Krishnan [Mahopac, NY]; Takken, Todd [Brewster, NY]

Issue Date: July 20, 2010

Research Org.: International Business Machines Corp., Armonk, NY (United States)

Sponsoring Org.: USDOE

OSTI Identifier: 993067

Patent Number(s): 7761687

Application Number: 11/768,905

Assignee: International Business Machines Corporation (Armonk, NY)

Patent Classifications (CPCs): G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING; G06F15/17337 - Direct connection machines, e.g. completely connected computers, point to point communication networks

DOE Contract Number: B554331

Resource Type: Patent

Country of Publication: United States

Language: English

Subject: 97 MATHEMATICS AND COMPUTING

Citation Formats


Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States: N. p., 2010. Web.

Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, & Takken, Todd. Ultrascalable petaflop parallel supercomputer. United States.

Blumrich, Matthias A, Chen, Dong, Chiu, George, Cipolla, Thomas M, Coteus, Paul W, Gara, Alan G, Giampapa, Mark E, Hall, Shawn, Haring, Rudolf A, Heidelberger, Philip, Kopcsay, Gerard V, Ohmacht, Martin, Salapura, Valentina, Sugavanam, Krishnan, and Takken, Todd. 2010. "Ultrascalable petaflop parallel supercomputer". United States. https://www.osti.gov/servlets/purl/993067.


                    
@article{osti_993067,
  title        = {Ultrascalable petaflop parallel supercomputer},
  author       = {Blumrich, Matthias A and Chen, Dong and Chiu, George and Cipolla, Thomas M and Coteus, Paul W and Gara, Alan G and Giampapa, Mark E and Hall, Shawn and Haring, Rudolf A and Heidelberger, Philip and Kopcsay, Gerard V and Ohmacht, Martin and Salapura, Valentina and Sugavanam, Krishnan and Takken, Todd},
  abstractNote = {A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.},
  place        = {United States},
  year         = {2010},
  month        = {jul}
}


Works referenced in this record:

Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures
journal, August 2005

Pande, P. P.; Grecu, C.; Jones, M.
IEEE Transactions on Computers, Vol. 54, Issue 8
https://doi.org/10.1109/TC.2005.134

Similar Records in DOE Patents and OSTI.GOV collections:

Multi-petascale highly efficient parallel supercomputer

Patent Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; ...

A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O ...
Multi-petascale highly efficient parallel supercomputer

Patent Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; ...

A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and ...
Petaflops router

Patent Baker, Zachary Kent; Power, John Fredrick; Tripp, Justin Leonard; ...

Disclosed is a method and system for performing operations on at least one input data vector in order to produce at least one output vector to permit easy, scalable and fast programming of a petascale equivalent supercomputer. A PetaFlops Router may comprise one or more PetaFlops Nodes, which may be connected to each other and/or external data provider/consumers via a programmable crossbar switch external to the PetaFlops Node. Each PetaFlops Node has a FPGA and a programmable intra-FPGA crossbar switch that permits input and output variables to be configurably connected to various physical operators contained in the FPGA as desired ...
Fault tolerance in a supercomputer through dynamic repartitioning

Patent Chen, Dong [Croton On Hudson, NY]; Coteus, Paul W [Yorktown Heights, NY]; Gara, Alan G [Mount Kisco, NY]; ...

A multiprocessor, parallel computer is made tolerant to hardware failures by providing extra groups of redundant standby processors and by designing the system so that these extra groups of processors can be swapped with any group which experiences a hardware failure. This swapping can be under software control, thereby permitting the entire computer to sustain a hardware failure but, after swapping in the standby processors, to still appear to software as a pristine, fully functioning system.
Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

Patent Archer, Charles J; Blocksome, Michael E; Ratterman, Joseph D; ...

Endpoint-based parallel data processing in a parallel active messaging interface ('PAMI') of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of ...
