A Case For Intra-rack Resource Disaggregation in HPC

Michelogiannakis, George; Klenk, Benjamin; Cook, Brandon; Teh, Min Yee; Glick, Madeleine; Dennison, Larry; Bergman, Keren; Shalf, John

doi:10.1145/3514245

Journal Article · Mon Mar 07 00:00:00 EST 2022 · ACM Transactions on Architecture and Code Optimization

DOI: https://doi.org/10.1145/3514245 · OSTI ID: 1878112

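Michelogiannakis, George [1]; Klenk, Benjamin [2]; Cook, Brandon [1]; Teh, Min Yee [3]; Glick, Madeleine [3]; Dennison, Larry [2]; Bergman, Keren [3]; Shalf, John [1]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
[2] NVIDIA, Santa Clara, CA (United States)
[3] Columbia University, New York, NY (United States)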

The expected halt of traditional technology scaling is motivating increased heterogeneity in high-performance computing (HPC) systems with the emergence of numerous specialized accelerators. As heterogeneity increases, so does the risk of underutilizing expensive hardware resources if we preserve today’s rigid node configuration and reservation strategies. This has sparked interest in resource disaggregation to enable finer-grain allocation of hardware resources to applications. However, there is currently no data-driven study of what range of disaggregation is appropriate in HPC. To that end, we perform a detailed analysis of key metrics sampled in NERSC’s Cori, a production HPC system that executes a diverse open-science HPC workload. In addition, we profile a variety of deep-learning applications to represent an emerging workload. We show that for a rack (cabinet) configuration and applications similar to Cori, a central processing unit with intra-rack disaggregation has a 99.5% probability of finding all the resources it requires inside its rack. In addition, ideal intra-rack resource disaggregation in Cori could reduce memory and NIC resources by 5.36% to 69.01% and still satisfy the worst-case average rack utilization.

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Advanced Research Projects Agency - Energy (ARPA-E); USDOE Office of Science (SC)

Grant/Contract Number: AC02-05CH11231

OSTI ID: 1878112

Journal Information: ACM Transactions on Architecture and Code Optimization, Vol. 19, Issue 2; ISSN 1544-3566

Publisher: Association for Computing Machinery (ACM)

Country of Publication: United States

Language: English

Similar Records

Evaluating the potential of disaggregated memory systems for HPC applications

Journal Article · Thu May 30 20:00:00 EDT 2024 · Concurrency and Computation. Practice and Experience · OSTI ID:2369149

Towards understanding HPC users and systems: A NERSC case study

Journal Article · Wed Sep 13 20:00:00 EDT 2017 · Journal of Parallel and Distributed Computing · OSTI ID:1439236

Towards understanding HPC users and systems: A NERSC case study

Journal Article · Sun Dec 31 23:00:00 EST 2017 · Journal of Parallel and Distributed Computing · OSTI ID:1463670

Related Subjects

97 MATHEMATICS AND COMPUTING
HPC
LDMS
computer systems organization
disaggregation
emerging technologies
memory
utilization
