
APEnet+: a 3D Torus network optimized for GPU-based HPC Systems

Abstract

In the supercomputing arena, the strong rise of GPU-accelerated clusters is a matter of fact. Within INFN, we proposed an initiative, the QUonG project, whose aim is to deploy a high performance computing system dedicated to scientific computations, leveraging commodity multi-core processors coupled with latest generation GPUs. The inter-node interconnection system is based on a point-to-point, high performance, low latency 3D torus network which is built in the framework of the APEnet+ project. It takes the form of an FPGA-based PCIe network card exposing six full bidirectional links running at 34 Gbps each that implements the RDMA protocol. In order to enable significant access latency reduction for inter-node data transfer, a direct network-to-GPU interface was built. The specialized hardware blocks, integrated in the APEnet+ board, provide support for GPU-initiated communications using the so-called PCIe peer-to-peer (P2P) transactions. This development is made in close collaboration with the GPU vendor NVIDIA. The final shape of a complete QUonG deployment is an assembly of standard 42U racks, each one capable of 80 TFLOPS of peak performance, at a cost of 5 k€/TFLOPS and for an estimated power consumption of 25 kW/rack. In this paper we report on the status of final rack deployment and on the R&D activities for 2012 that will focus on performance enhancement of the APEnet+ hardware through the adoption of new generation 28 nm FPGAs allowing the implementation of PCIe Gen3 host interface and the addition of new fault tolerance-oriented capabilities.
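
As a rough illustration of the topology the abstract describes, the six links of each card correspond to the two nearest neighbors along each of the three torus axes, with wrap-around at the lattice edges. The sketch below is only a minimal model of that addressing scheme: the function name `torus_neighbors` and the 4x4x4 lattice size are hypothetical choices for the example, since the record does not state lattice dimensions or any software interface.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbors of `node` on a 3D torus of size `dims`.

    Neighbors are listed axis by axis (x, y, z), first the +1 step and then
    the -1 step; the modulo implements the torus wrap-around.
    """
    neighbors = []
    for axis in range(3):
        for step in (1, -1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


# Hypothetical 4x4x4 lattice, chosen only for illustration.
DIMS = (4, 4, 4)
corner_neighbors = torus_neighbors((0, 0, 0), DIMS)
```

On a lattice with at least three nodes per axis, every node has six distinct neighbors, one per link; a corner node such as (0, 0, 0) reaches the opposite face through the wrap-around.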
Authors:
Ammendola, R [1]; Biagioni, A; Frezza, O; Lo Cicero, F; Lonardo, A; Paolucci, P S; Rossetti, D; Simula, F; Tosoratto, L; Vicini, P [2]
  1. INFN Tor Vergata (Italy)
  2. INFN Roma (Italy)
Publication Date:
Dec 13, 2012
Product Type:
Journal Article
Resource Relation:
Journal Name: Journal of Physics. Conference Series (Online); Journal Volume: 396; Journal Issue: 4; Conference: CHEP2012: International conference on computing in high energy and nuclear physics 2012, New York, NY (United States), 21-25 May 2012; Other Information: Country of input: International Atomic Energy Agency (IAEA)
Subject:
97 MATHEMATICAL METHODS AND COMPUTING; CALCULATION METHODS; COMPUTER CALCULATIONS; COMPUTER CODES; COMPUTER NETWORKS; DATA ACQUISITION; DATA ACQUISITION SYSTEMS; DATA TRANSMISSION; DISTRIBUTED DATA PROCESSING; PARALLEL PROCESSING; PERFORMANCE; SUPERCOMPUTERS
OSTI ID:
22079391
Country of Origin:
United Kingdom
Language:
English
Other Identifying Numbers:
Journal ID: ISSN 1742-6596; TRN: GB13O2519038186
Availability:
Available from http://dx.doi.org/10.1088/1742-6596/396/4/042059
Submitting Site:
INIS
Size:
[10 page(s)]
Announcement Date:
Apr 04, 2013

Citation Formats

Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P. APEnet+: a 3D Torus network optimized for GPU-based HPC Systems. United Kingdom: N. p., 2012. Web. doi:10.1088/1742-6596/396/4/042059.
Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, & Vicini, P. APEnet+: a 3D Torus network optimized for GPU-based HPC Systems. United Kingdom. https://doi.org/10.1088/1742-6596/396/4/042059
Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P. 2012. "APEnet+: a 3D Torus network optimized for GPU-based HPC Systems." United Kingdom. https://doi.org/10.1088/1742-6596/396/4/042059.
@article{etde_22079391,
title = {APEnet+: a 3D Torus network optimized for GPU-based HPC Systems},
author = {Ammendola, R and Biagioni, A and Frezza, O and Lo Cicero, F and Lonardo, A and Paolucci, P S and Rossetti, D and Simula, F and Tosoratto, L and Vicini, P},
abstractNote = {In the supercomputing arena, the strong rise of GPU-accelerated clusters is a matter of fact. Within INFN, we proposed an initiative, the QUonG project, whose aim is to deploy a high performance computing system dedicated to scientific computations, leveraging commodity multi-core processors coupled with latest generation GPUs. The inter-node interconnection system is based on a point-to-point, high performance, low latency 3D torus network which is built in the framework of the APEnet+ project. It takes the form of an FPGA-based PCIe network card exposing six full bidirectional links running at 34 Gbps each that implements the RDMA protocol. In order to enable significant access latency reduction for inter-node data transfer, a direct network-to-GPU interface was built. The specialized hardware blocks, integrated in the APEnet+ board, provide support for GPU-initiated communications using the so-called PCIe peer-to-peer (P2P) transactions. This development is made in close collaboration with the GPU vendor NVIDIA. The final shape of a complete QUonG deployment is an assembly of standard 42U racks, each one capable of 80 TFLOPS of peak performance, at a cost of 5 kEUR/TFLOPS and for an estimated power consumption of 25 kW/rack. In this paper we report on the status of final rack deployment and on the R\&D activities for 2012 that will focus on performance enhancement of the APEnet+ hardware through the adoption of new generation 28 nm FPGAs allowing the implementation of PCIe Gen3 host interface and the addition of new fault tolerance-oriented capabilities.},
doi = {10.1088/1742-6596/396/4/042059},
journal = {Journal of Physics. Conference Series (Online)},
issue = {4},
volume = {396},
journaltype = {AC},
place = {United Kingdom},
year = {2012},
month = {Dec}
}