OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Production Experiences with the Cray-Enabled TORQUE Resource Manager

Conference
OSTI ID: 1086656

High-performance computing resources use batch systems to manage the user workload. Cray systems differ from typical clusters because of Cray's Application Level Placement Scheduler (ALPS), which manages binary transfer, job launch and monitoring, and error handling. Batch systems require special support to integrate with ALPS through an XML protocol called BASIL. Previous versions of Adaptive Computing's TORQUE and Moab batch suite integrated with ALPS from within Moab, using Perl scripts to interface with BASIL. This occasionally led to problems when the components became unsynchronized. Version 4.1 of the TORQUE Resource Manager introduced new features that allow it to integrate directly with ALPS using BASIL. This paper describes production experiences at Oak Ridge National Laboratory with the new TORQUE software versions, as well as ongoing and future work to improve TORQUE.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1086656
Resource Relation:
Conference: Cray User Group, Napa Valley, CA, USA, 2013-05-06 to 2013-05-09
Country of Publication:
United States
Language:
English

Similar Records

Certification of Completion of ASC FY08 Level-2 Milestone ID #2933
Technical Report · 2008-06-12 · OSTI ID:1086656

The ASC Sequoia Programming Model
Technical Report · 2008-08-06 · OSTI ID:1086656

Early Experiences with Node-Level Power Capping on the Cray XC40 Platform
Conference · 2015-10-01 · OSTI ID:1086656