U.S. Department of Energy
Office of Scientific and Technical Information

Studying CPU and memory utilization of applications on Fujitsu A64FX and Nvidia Grace Superchip

Conference

ARM-based manycore CPU architectures are well positioned to meet the rising memory-throughput requirements of modern data-intensive scientific applications in High Performance Computing (HPC). The Fujitsu A64FX CPU is based on the ARMv8.2-A architecture and is the processor of the flagship Japanese supercomputer "Fugaku", which was previously ranked #1 in the world on the Top500 list. The Nvidia Grace Superchip features 144 Neoverse V2 cores based on the ARMv9 architecture with 4x128b SVE2, providing exceptional computational power. The chip supports up to 480GB of memory, making it well suited to AI, machine learning, and scientific computing workloads. In this paper, we conduct a thorough performance exploration of a variety of parallel, bandwidth-sensitive benchmarks and applications. They are compiled with the native Fujitsu compiler on a Fugaku A64FX compute node and with the ARM (LLVM) Compiler on an Nvidia Grace Superchip compute node, engaging all the computational cores per cluster using OpenMP multithreading (assuming the cores can drive the available bandwidth). Our goal is to study the resource utilization of scientific applications and benchmarks on the A64FX and the Grace Superchip, considering graph application scenarios (the GAP Benchmark Suite) and eleven application proxies from the Rodinia heterogeneous benchmark suite (covering domains such as Data Mining, Bioinformatics, Fluid Dynamics, and Pattern Recognition). Through exhaustive performance monitoring, we quantify the resource utilization of diverse OpenMP-based HPC applications on both the Fujitsu A64FX and the Nvidia Grace Superchip platforms.

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
2496226
Report Number(s):
PNNL-SA-203239
Country of Publication:
United States
Language:
English

Similar Records

Early Evaluation of Fugaku A64FX Architecture Using Climate Workloads
Conference · Fri Oct 01 00:00:00 EDT 2021 · OSTI ID:1965278

Ookami: Deployment and Initial Experiences
Conference · Sat Jul 17 00:00:00 EDT 2021 · Practice and Experience in Advanced Research Computing · OSTI ID:1907882

Application Experiences on a GPU-Accelerated Arm-based HPC Testbed
Conference · Tue Jan 31 23:00:00 EST 2023 · OSTI ID:1960691