Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network

Ammendola A, R; Biagioni, A; Frezza, O; Lo Cicero, F; Lonardo, A; Paolucci, P S; Rossetti, D; Simula, F; Tosoratto, L; Vicini, P

doi:10.1088/1742-6596/523/1/012013

Made available by
U.S. Department of Energy
Office of Scientific and Technical Information

ETDEWeb

Abstract

APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-dimensional torus interconnect network optimized for hybrid CPU-GPU clusters dedicated to high-performance scientific computing. The APEnet+ interconnect fabric is built on an FPGA-based PCI Express board with 6 bidirectional off-board links with 34 Gbps of raw bandwidth per direction, and leverages the peer-to-peer capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain true zero-copy, low-latency GPU-to-GPU transfers. APEnet+ transfer latency is minimized by adopting an RDMA protocol implemented in the FPGA, with specialized hardware blocks tightly coupled to an embedded microprocessor. This architecture provides a high-performance, low-latency offload engine for both the transmit and receive sides of data transactions. Preliminary results are encouraging, showing a 50% bandwidth increase for transfers with large packet sizes. In this paper we describe the APEnet+ architecture, detail the hardware implementation, and discuss the impact of this specialized RDMA hardware on host interface latency and bandwidth.

Authors:

Ammendola A, R [1]; Biagioni, A; Frezza, O; Lo Cicero, F; Lonardo, A; Paolucci, P S; Rossetti, D; Simula, F; Tosoratto, L; Vicini, P [2]

[1] INFN Roma II, Via della Ricerca Scientifica 1 – 00133 Roma (Italy)
[2] INFN Roma I, P.le Aldo Moro 2 – 00185 Roma (Italy)

Publication Date:

Jun 06, 2014

DOI:

https://doi.org/10.1088/1742-6596/523/1/012013

Product Type:

Journal Article

Resource Relation:

Journal Name: Journal of Physics. Conference Series (Online); Journal Volume: 523; Journal Issue: 1; Conference: ACAT2013: 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, Beijing (China), 16-21 May 2013; Other Information: Country of input: International Atomic Energy Agency (IAEA)

Subject:

97 MATHEMATICAL METHODS AND COMPUTING; COMPUTER ARCHITECTURE; COMPUTER NETWORKS; DISTRIBUTED DATA PROCESSING; HOST; IMPLEMENTATION; INTERFACES; MICROPROCESSORS; PERFORMANCE; THREE-DIMENSIONAL CALCULATIONS

OSTI ID:

22377881

Country of Origin:

United Kingdom

Language:

English

Other Identifying Numbers:

Journal ID: ISSN 1742-6596; TRN: GB15P9858083343

Availability:

Available from http://dx.doi.org/10.1088/1742-6596/523/1/012013

Submitting Site:

INIS

Size:

[8 page(s)]

Announcement Date:

Aug 13, 2015

Citation Formats

Ammendola A, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P. Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network. United Kingdom: N. p., 2014. Web. doi:10.1088/1742-6596/523/1/012013.

Ammendola A, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, & Vicini, P. Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network. United Kingdom. https://doi.org/10.1088/1742-6596/523/1/012013

Ammendola A, R, Biagioni, A, Frezza, O, Lo Cicero, F, Lonardo, A, Paolucci, P S, Rossetti, D, Simula, F, Tosoratto, L, and Vicini, P. 2014. "Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network." United Kingdom. https://doi.org/10.1088/1742-6596/523/1/012013.

@misc{etde_22377881,
title = {Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network},
author = {Ammendola A, R and Biagioni, A and Frezza, O and Lo Cicero, F and Lonardo, A and Paolucci, P S and Rossetti, D and Simula, F and Tosoratto, L and Vicini, P},
abstractNote = {APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-dimensional torus interconnect network optimized for hybrid CPU-GPU clusters dedicated to high-performance scientific computing. The APEnet+ interconnect fabric is built on an FPGA-based PCI Express board with 6 bidirectional off-board links with 34 Gbps of raw bandwidth per direction, and leverages the peer-to-peer capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain true zero-copy, low-latency GPU-to-GPU transfers. APEnet+ transfer latency is minimized by adopting an RDMA protocol implemented in the FPGA, with specialized hardware blocks tightly coupled to an embedded microprocessor. This architecture provides a high-performance, low-latency offload engine for both the transmit and receive sides of data transactions. Preliminary results are encouraging, showing a 50% bandwidth increase for transfers with large packet sizes. In this paper we describe the APEnet+ architecture, detail the hardware implementation, and discuss the impact of this specialized RDMA hardware on host interface latency and bandwidth.},
doi = {10.1088/1742-6596/523/1/012013},
journal = {Journal of Physics. Conference Series (Online)},
issue = {1},
volume = {523},
journaltype = {AC},
place = {United Kingdom},
year = {2014},
month = {Jun}
}
