U.S. Department of Energy, Office of Scientific and Technical Information (OSTI.GOV)

Title: Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

Abstract

While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component of large HPC simulations, but also as standalone scientific tools for knowledge discovery. On the path toward exascale, new HPC runtime systems are also emerging that differ from classical distributed computing models. However, system software on the latest extreme-scale DOE supercomputers needs to be enhanced to better support these emerging software ecosystems.

In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer.

We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find that the single-node performance of our KVM solution on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size grows, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.
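The provisioning path described in the abstract, libvirt driving QEMU/KVM directly on each Cray compute node with guests attached to a bridged Ethernet-over-Aries network, can be sketched in a few lines. Below is a minimal, hypothetical example using the libvirt Python bindings; the report itself does not include a provisioning script, and the domain name, vCPU/memory sizes, image path, and bridge name "br0" here are illustrative assumptions, not values from the paper.

# Hypothetical sketch: start one virtual-cluster guest with QEMU/KVM via
# libvirt on a compute node. All names, sizes, and paths are illustrative.
import libvirt

DOMAIN_XML = """
<domain type='kvm'>
  <name>vcluster-node0</name>
  <memory unit='GiB'>16</memory>
  <vcpu>8</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/scratch/images/vcluster-node0.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <!-- Bridged NIC: 'br0' stands in for a host bridge backed by the
         Ethernet-over-Aries emulation device on the compute node. -->
    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""

def start_guest() -> None:
    # Connect to the node-local libvirt daemon that manages QEMU/KVM.
    conn = libvirt.open("qemu:///system")
    try:
        # createXML() defines and boots a transient guest in one step;
        # repeating this across N compute nodes yields an N-node
        # virtual cluster.
        dom = conn.createXML(DOMAIN_XML, 0)
        print(f"started guest: {dom.name()}")
    finally:
        conn.close()

if __name__ == "__main__":
    start_guest()

The bridged virtio NIC is the piece that ties a guest into the emulated Ethernet-over-Aries network; per the abstract, this bridged path is also the source of the up-to-20% overhead observed at larger virtual cluster sizes.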

Authors:
Younge, Andrew J. [1]; Pedretti, Kevin [1]; Grant, Ryan [1]; Brightwell, Ron [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
May 2017
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1367280
Report Number(s):
SAND-2017-5325
653446
DOE Contract Number:  
AC04-94AL85000
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Younge, Andrew J., Pedretti, Kevin, Grant, Ryan, and Brightwell, Ron. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters. United States: N. p., 2017. Web. doi:10.2172/1367280.
Younge, Andrew J., Pedretti, Kevin, Grant, Ryan, & Brightwell, Ron. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters. United States. doi:10.2172/1367280.
Younge, Andrew J., Pedretti, Kevin, Grant, Ryan, and Brightwell, Ron. 2017. "Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters." United States. doi:10.2172/1367280. https://www.osti.gov/servlets/purl/1367280.
@article{osti_1367280,
title = {Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.},
author = {Younge, Andrew J. and Pedretti, Kevin and Grant, Ryan and Brightwell, Ron},
abstractNote = {While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component of large HPC simulations, but also as standalone scientific tools for knowledge discovery. On the path toward exascale, new HPC runtime systems are also emerging that differ from classical distributed computing models. However, system software on the latest extreme-scale DOE supercomputers needs to be enhanced to better support these emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find that the single-node performance of our KVM solution on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size grows, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.},
doi = {10.2172/1367280},
place = {United States},
year = {2017},
month = {5}
}
