OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: The Cray XT4 Quad-core : A First Look

Abstract

The Cray XT4 at Oak Ridge National Laboratory (ORNL), named Jaguar, has recently been upgraded from dual-core to quad-core processors, in addition to other significant changes. Although we have had very limited access to the machine and therefore are not presenting definitive performance results, we can share some meaningful and constructive experiences with the user community, which could be of assistance as they gain access to Jaguar as well as other multi-core processor-based parallel computers. These experiences were gained while porting a broad set of scientific application programs to Jaguar.

Authors:
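 Alam, Sadaf R [1]; Barrett, Richard F [1]; Eisenbach, Markus [1]; Fahey, Mark R [1]; Hartman-Baker, Rebecca J [1]; Kuehn, Jeffery A [1]; Poole, Stephen W [1]; Sankaran, Ramanan [1]; Worley, Patrick H [1]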
  1. ORNL
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Center for Computational Sciences
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1050241
DOE Contract Number:
DE-AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: Cray User Group (CUG) 2008, Helsinki, Finland, May 5-8, 2008
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ORNL; PERFORMANCE; PARALLEL PROCESSING; SUPERCOMPUTERS; parallel programming; scientific computing

Citation Formats

Alam, Sadaf R, Barrett, Richard F, Eisenbach, Markus, Fahey, Mark R, Hartman-Baker, Rebecca J, Kuehn, Jeffery A, Poole, Stephen W, Sankaran, Ramanan, and Worley, Patrick H. The Cray XT4 Quad-core : A First Look. United States: N. p., 2008. Web.
Alam, Sadaf R, Barrett, Richard F, Eisenbach, Markus, Fahey, Mark R, Hartman-Baker, Rebecca J, Kuehn, Jeffery A, Poole, Stephen W, Sankaran, Ramanan, & Worley, Patrick H. The Cray XT4 Quad-core : A First Look. United States.
Alam, Sadaf R, Barrett, Richard F, Eisenbach, Markus, Fahey, Mark R, Hartman-Baker, Rebecca J, Kuehn, Jeffery A, Poole, Stephen W, Sankaran, Ramanan, and Worley, Patrick H. 2008. "The Cray XT4 Quad-core : A First Look". United States.
@article{osti_1050241,
title = {The Cray XT4 Quad-core : A First Look},
author = {Alam, Sadaf R and Barrett, Richard F and Eisenbach, Markus and Fahey, Mark R and Hartman-Baker, Rebecca J and Kuehn, Jeffery A and Poole, Stephen W and Sankaran, Ramanan and Worley, Patrick H},
abstractNote = {The Cray XT4 at Oak Ridge National Laboratory (ORNL), named Jaguar, has recently been upgraded from dual-core to quad-core processors, in addition to other significant changes. Although we have had very limited access to the machine and therefore are not presenting definitive performance results, we can share some meaningful and constructive experiences with the user community, which could be of assistance as they gain access to Jaguar as well as other multi-core processor-based parallel computers. These experiences were gained while porting a broad set of scientific application programs to Jaguar.},
place = {United States},
year = 2008,
month = 1
}


  • An upgrade from dual-core to quad-core AMD processors on the Cray XT system at the Oak Ridge National Laboratory (ORNL) Leadership Computing Facility (LCF) has resulted in significant changes in the hardware and software stack, including a deeper memory hierarchy, SIMD instructions, and a multi-core aware MPI library. In this paper, we evaluate the impact of a subset of these key changes on large-scale scientific applications. We provide insights into the application tuning and optimization process and report on how different strategies yield varying rates of success and failure across different application domains. For instance, we demonstrate that the vectorization instructions (SSE) provide a performance boost of as much as 50% on fusion and combustion applications. Moreover, we reveal how resource contention could limit the achievable performance and provide insights into how applications could exploit the petascale XT5 system's hierarchical parallelism.
  • The XGC1 code is used to model multiscale tokamak plasma turbulence dynamics in realistic diverted magnetic field geometry. In June 2009, XGC1 demonstrated nearly linear weak and strong scaling out to 150,000 cores on a Cray XT5 with 8-core nodes when solving problems of relevance to running experiments on the ITER tokamak. Here we compare performance, and discuss further performance optimizations, when running XGC1 on an XT5 with 12-core nodes on up to 224,000 cores.
  • Multi-core processor-based SMP servers have become building blocks for Linux clusters in recent years because they can deliver better performance for multi-threaded programs through on-chip multi-threading. However, a relatively slow software barrier can hinder the performance of a data-parallel scientific application on a multi-core system. In this paper we study the performance of different software barrier algorithms on a server based on newly introduced AMD quad-core Opteron processors. We study how the memory architecture and the cache coherence protocol of the system influence the performance of barrier algorithms. We present an optimized barrier algorithm derived from the queue-based barrier algorithm. We find that the optimized barrier algorithm achieves a speedup of 1.77 over the original queue-based algorithm and a speedup of 2.39 over the software barrier generated by the Intel OpenMP compiler.