OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture

Abstract

Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved towards doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core and dual-core and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.

Authors:
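Levesque, John; Larkin, Jeff; Foster, Martyn; Glenski, Joe; Geissler, Garry; Whalen, Stephen; Waldecker, Brian; Carter, Jonathan; Skinner, David; He, Helen; Wasserman, Harvey; Shalf, John; Shan, Hongzhang; Strohmaier, Erich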
Publication Date:
2007-03-07
Research Org.:
COLLABORATION - Cray Inc.
Sponsoring Org.:
USDOE
OSTI Identifier:
918496
Report Number(s):
LBNL-62500
R&D Project: KX1310; BnR: KJ0102000; TRN: US200818%%407
DOE Contract Number:  
DE-AC02-05CH11231
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS; COMPUTER ARCHITECTURE; BENCHMARKS; EFFICIENCY; KERNELS; MANUFACTURERS; MICROPROCESSORS; OPTIMIZATION; PERFORMANCE; PROCESSING; SUPERCOMPUTERS; multicore supercomputer processor performance

Citation Formats

Levesque, John, Larkin, Jeff, Foster, Martyn, Glenski, Joe, Geissler, Garry, Whalen, Stephen, Waldecker, Brian, Carter, Jonathan, Skinner, David, He, Helen, Wasserman, Harvey, Shalf, John, Shan, Hongzhang, and Strohmaier, Erich. Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture. United States: N. p., 2007. Web. doi:10.2172/918496.
Levesque, John, Larkin, Jeff, Foster, Martyn, Glenski, Joe, Geissler, Garry, Whalen, Stephen, Waldecker, Brian, Carter, Jonathan, Skinner, David, He, Helen, Wasserman, Harvey, Shalf, John, Shan, Hongzhang, & Strohmaier, Erich. Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture. United States. doi:10.2172/918496.
Levesque, John, Larkin, Jeff, Foster, Martyn, Glenski, Joe, Geissler, Garry, Whalen, Stephen, Waldecker, Brian, Carter, Jonathan, Skinner, David, He, Helen, Wasserman, Harvey, Shalf, John, Shan, Hongzhang, and Strohmaier, Erich. 2007. "Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture". United States. doi:10.2172/918496. https://www.osti.gov/servlets/purl/918496.
@article{osti_918496,
title = {Understanding and Mitigating Multicore Performance Issues on the AMD Opteron Architecture},
author = {Levesque, John and Larkin, Jeff and Foster, Martyn and Glenski, Joe and Geissler, Garry and Whalen, Stephen and Waldecker, Brian and Carter, Jonathan and Skinner, David and He, Helen and Wasserman, Harvey and Shalf, John and Shan, Hongzhang and Strohmaier, Erich},
abstractNote = {Over the past 15 years, microprocessor performance has doubled approximately every 18 months through increased clock rates and processing efficiency. In the past few years, clock frequency growth has stalled, and microprocessor manufacturers such as AMD have moved towards doubling the number of cores every 18 months in order to maintain historical growth rates in chip performance. This document investigates the ramifications of multicore processor technology on the new Cray XT4 systems based on AMD processor technology. We begin by walking through the AMD single-core and dual-core and upcoming quad-core processor architectures. This is followed by a discussion of methods for collecting performance counter data to understand code performance on the Cray XT3 and XT4 systems. We then use the performance counter data to analyze the impact of multicore processors on the performance of microbenchmarks such as STREAM, application kernels such as the NAS Parallel Benchmarks, and full application codes that comprise the NERSC-5 SSP benchmark suite. We explore compiler options and software optimization techniques that can mitigate the memory bandwidth contention that can reduce computing efficiency on multicore processors. The last section provides a case study of applying the dual-core optimizations to the NAS Parallel Benchmarks to dramatically improve their performance.},
doi = {10.2172/918496},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2007},
month = mar
}
