Abstract
In the supercomputing arena, the strong rise of GPU-accelerated clusters is a matter of fact. Within INFN, we proposed an initiative, the QUonG project, whose aim is to deploy a high performance computing system dedicated to scientific computations, leveraging commodity multi-core processors coupled with latest-generation GPUs. The inter-node interconnection system is based on a point-to-point, high performance, low latency 3D torus network built in the framework of the APEnet+ project. It takes the form of an FPGA-based PCIe network card that exposes six fully bidirectional links running at 34 Gbps each and implements the RDMA protocol. To significantly reduce access latency for inter-node data transfers, a direct network-to-GPU interface was built. The specialized hardware blocks, integrated in the APEnet+ board, provide support for GPU-initiated communications using so-called PCIe peer-to-peer (P2P) transactions. This development is carried out in close collaboration with the GPU vendor NVIDIA. The final shape of a complete QUonG deployment is an assembly of standard 42U racks, each capable of 80 TFLOPS of peak performance, at a cost of 5 k€/TFLOPS and with an estimated power consumption of 25 kW/rack. In this paper we report on the status of the final rack deployment and on the R&D activities for 2012, which will focus on enhancing the performance of the APEnet+ hardware through the adoption of new-generation 28 nm FPGAs, allowing the implementation of a PCIe Gen3 host interface and the addition of new fault tolerance-oriented capabilities.
Ammendola, R [1]; Biagioni, A; Frezza, O; Lo Cicero, F; Lonardo, A; Paolucci, P S; Rossetti, D; Simula, F; Tosoratto, L; Vicini, P [2]
[1] INFN Tor Vergata (Italy)
[2] INFN Roma (Italy)
Citation Formats
Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P.
APEnet+: a 3D Torus network optimized for GPU-based HPC Systems.
United Kingdom: N. p.,
2012.
Web.
doi:10.1088/1742-6596/396/4/042059.
Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, & Vicini, P.
APEnet+: a 3D Torus network optimized for GPU-based HPC Systems.
United Kingdom.
https://doi.org/10.1088/1742-6596/396/4/042059
Ammendola, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P.
2012.
"APEnet+: a 3D Torus network optimized for GPU-based HPC Systems."
United Kingdom.
https://doi.org/10.1088/1742-6596/396/4/042059.
@misc{etde_22079391,
title = {APEnet+: a 3D Torus network optimized for GPU-based HPC Systems},
author = {Ammendola, R and Biagioni, A and Frezza, O and Lo Cicero, F and Lonardo, A and Paolucci, P S and Rossetti, D and Simula, F and Tosoratto, L and Vicini, P},
abstractNote = {In the supercomputing arena, the strong rise of GPU-accelerated clusters is a matter of fact. Within INFN, we proposed an initiative, the QUonG project, whose aim is to deploy a high performance computing system dedicated to scientific computations, leveraging commodity multi-core processors coupled with latest-generation GPUs. The inter-node interconnection system is based on a point-to-point, high performance, low latency 3D torus network built in the framework of the APEnet+ project. It takes the form of an FPGA-based PCIe network card that exposes six fully bidirectional links running at 34 Gbps each and implements the RDMA protocol. To significantly reduce access latency for inter-node data transfers, a direct network-to-GPU interface was built. The specialized hardware blocks, integrated in the APEnet+ board, provide support for GPU-initiated communications using so-called PCIe peer-to-peer (P2P) transactions. This development is carried out in close collaboration with the GPU vendor NVIDIA. The final shape of a complete QUonG deployment is an assembly of standard 42U racks, each capable of 80 TFLOPS of peak performance, at a cost of 5 k€/TFLOPS and with an estimated power consumption of 25 kW/rack. In this paper we report on the status of the final rack deployment and on the R\&D activities for 2012, which will focus on enhancing the performance of the APEnet+ hardware through the adoption of new-generation 28 nm FPGAs, allowing the implementation of a PCIe Gen3 host interface and the addition of new fault tolerance-oriented capabilities.},
doi = {10.1088/1742-6596/396/4/042059},
journal = {Journal of Physics: Conference Series},
issue = {4},
volume = {396},
journaltype = {AC},
place = {United Kingdom},
year = {2012},
month = {Dec}
}