STORM: Lightning-Fast Resource Management
 

Eitan Frachtenberg, Fabrizio Petrini, Juan Fernandez, Scott Pakin, Salvador Coll
CCS-3 Modeling, Algorithms, and Informatics Group
Computer and Computational Sciences (CCS) Division
Los Alamos National Laboratory
{eitanf,fabrizio,juanf,pakin,scoll}@lanl.gov
July 26, 2002
Abstract
Although workstation clusters are a common platform for high-performance computing
(HPC), they remain more difficult to manage than sequential systems or even symmetric
multiprocessors. Furthermore, as cluster sizes increase, the quality of the
resource-management subsystem (essentially, all of the code that runs on a cluster other
than the applications) increasingly impacts application efficiency. In this paper, we
present STORM, a resource-management framework designed for scalability and performance.
The key innovation behind STORM is a software architecture that enables resource
management to exploit low-level network features. As a result of this
HPC-application-like design, STORM is orders of magnitude faster than the best reported
results in the literature on two sample resource-management functions: job launching and
process scheduling.

  

Source: Arnau, Salvador Coll - Departamento de Ingeniería Electrónica, Universitat Politècnica de València
Frachtenberg, Eitan - School of Computer Science and Engineering, Hebrew University of Jerusalem
Los Alamos National Laboratory, Computing, Communications, and Networking Division, Performance and Architecture Laboratory
Pakin, Scott - Computing, Communications, and Networking Division, Los Alamos National Laboratory

 

Collections: Computer Technologies and Information Sciences; Engineering