skip to main content

Title: TUNE: Compiler-Directed Automatic Performance Tuning

This project has developed compiler-directed performance tuning technology targeting the Cray XT4 Jaguar system at Oak Ridge, which has multi-core Opteron nodes with SSE-3 SIMD extensions, and the Cray XE6 Hopper system at NERSC. To achieve this goal, we combined compiler technology for model-guided empirical optimization for memory hierarchies with SIMD code generation, which have been developed by the PIs over the past several years. We examined DOE Office of Science applications to identify performance bottlenecks and apply our system to computational kernels that operate on dense arrays. Our goal for this performance-tuning technology has been to yield hand-tuned levels of performance on DOE Office of Science computational kernels, while allowing application programmers to specify their computations at a high level without requiring manual optimization. Overall, we aim to make our technology for SIMD code generation and memory hierarchy optimization a crucial component of high-productivity Petaflops computing through a close collaboration with the scientists in national laboratories.
  1. University of Utah
Publication Date:
OSTI Identifier:
Report Number(s):
DOE Contract Number:
Resource Type:
Technical Report
Research Org:
University of Utah
Sponsoring Org:
Contributing Orgs:
Argonne National Laboratory, USC/ISI
Country of Publication:
United States
97 MATHEMATICS AND COMPUTING; autotuning, compiler, SIMD