Parameterizing loop fusion for automated empirical tuning
Abstract
Traditional compilers are limited in their ability to optimize applications for different architectures because statically modeling the effect of specific optimizations on different hardware implementations is difficult. Recent research has been addressing this issue through the use of empirical tuning, which uses trial executions to determine the optimization parameters that are most effective on a particular hardware platform. In this paper, we investigate empirical tuning of loop fusion, an important transformation for optimizing a significant class of real-world applications. In spite of its usefulness, fusion has attracted little attention from previous empirical tuning research, partially because it is much harder to configure than transformations like loop blocking and unrolling. This paper presents novel compiler techniques that extend conventional fusion algorithms to parameterize their output when optimizing a computation, thus allowing the compiler to formulate the entire configuration space for loop fusion using a sequence of integer parameters. The compiler can then employ an external empirical search engine to find the optimal operating point within the space of legal fusion configurations and generate the final optimized code using a simple code transformation system. We have implemented our approach within our compiler infrastructure and conducted preliminary experiments using a simple empirical search strategy. Our results convey new insights on the interaction of loop fusion with limited hardware resources, such as available registers, while confirming conventional wisdom about the effectiveness of loop fusion in improving application performance.
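The idea of describing fusion choices with a sequence of integers and searching that space by trial execution can be illustrated with a small sketch. This is not the report's algorithm or its compiler infrastructure: the encoding (one 0/1 value per adjacent loop pair), the names `BODIES`, `groups_from_config`, `run`, and `tune`, and the exhaustive search are all assumptions made for illustration. The three toy loop bodies only touch index `i`, so every configuration here is assumed to be legal.

```python
import itertools
import time

# Three toy loop bodies over the same index range (hypothetical example).
def body_a(i, x, y, z):
    y[i] = x[i] * 2.0

def body_b(i, x, y, z):
    z[i] = y[i] + 1.0

def body_c(i, x, y, z):
    x[i] = z[i] - y[i]

BODIES = [body_a, body_b, body_c]

def groups_from_config(config):
    """Map fuse bits to groups of loop indices, e.g. (1, 0) -> [[0, 1], [2]].

    config[k-1] == 1 means loop k is fused with the loop before it.
    """
    groups = [[0]]
    for k, fuse in enumerate(config, start=1):
        if fuse:
            groups[-1].append(k)
        else:
            groups.append([k])
    return groups

def run(config, n):
    """Execute the loops under one fusion configuration and return the arrays."""
    x = [float(i) for i in range(n)]
    y = [0.0] * n
    z = [0.0] * n
    for group in groups_from_config(config):
        for i in range(n):  # one loop per fused group
            for b in group:
                BODIES[b](i, x, y, z)
    return x, y, z

def tune(n=10000, trials=3):
    """Time every configuration and return (fastest config, all timings)."""
    timings = {}
    for config in itertools.product((0, 1), repeat=len(BODIES) - 1):
        best = float("inf")
        for _ in range(trials):
            start = time.perf_counter()
            run(config, n)
            best = min(best, time.perf_counter() - start)
        timings[config] = best
    return min(timings, key=timings.get), timings

if __name__ == "__main__":
    best_config, all_timings = tune()
    print("fastest configuration:", best_config)
```

A real tuner would have to exclude configurations that violate data dependences and would typically use a smarter search than exhaustive enumeration, since the number of configurations grows exponentially with the number of loops.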
- Authors:
- Zhao, Y; Yi, Q; Kennedy, K; Quinlan, D; Vuduc, R
- Publication Date:
- 2005-12-15
- Research Org.:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 890608
- Report Number(s):
- UCRL-TR-217808
TRN: US200620%%749
- DOE Contract Number:
- W-7405-ENG-48
- Resource Type:
- Technical Report
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ALGORITHMS; CONFIGURATION; ENGINES; OPTIMIZATION; PERFORMANCE; SIMULATION; TRANSFORMATIONS; TUNING
Citation Formats
Zhao, Y, Yi, Q, Kennedy, K, Quinlan, D, and Vuduc, R. Parameterizing loop fusion for automated empirical tuning. United States: N. p., 2005. Web. doi:10.2172/890608.
Zhao, Y, Yi, Q, Kennedy, K, Quinlan, D, & Vuduc, R. Parameterizing loop fusion for automated empirical tuning. United States. doi:10.2172/890608.
Zhao, Y, Yi, Q, Kennedy, K, Quinlan, D, and Vuduc, R. 2005. "Parameterizing loop fusion for automated empirical tuning". United States. doi:10.2172/890608. https://www.osti.gov/servlets/purl/890608.
@techreport{osti_890608,
title = {Parameterizing loop fusion for automated empirical tuning},
author = {Zhao, Y and Yi, Q and Kennedy, K and Quinlan, D and Vuduc, R},
abstractNote = {Traditional compilers are limited in their ability to optimize applications for different architectures because statically modeling the effect of specific optimizations on different hardware implementations is difficult. Recent research has been addressing this issue through the use of empirical tuning, which uses trial executions to determine the optimization parameters that are most effective on a particular hardware platform. In this paper, we investigate empirical tuning of loop fusion, an important transformation for optimizing a significant class of real-world applications. In spite of its usefulness, fusion has attracted little attention from previous empirical tuning research, partially because it is much harder to configure than transformations like loop blocking and unrolling. This paper presents novel compiler techniques that extend conventional fusion algorithms to parameterize their output when optimizing a computation, thus allowing the compiler to formulate the entire configuration space for loop fusion using a sequence of integer parameters. The compiler can then employ an external empirical search engine to find the optimal operating point within the space of legal fusion configurations and generate the final optimized code using a simple code transformation system. We have implemented our approach within our compiler infrastructure and conducted preliminary experiments using a simple empirical search strategy. Our results convey new insights on the interaction of loop fusion with limited hardware resources, such as available registers, while confirming conventional wisdom about the effectiveness of loop fusion in improving application performance.},
doi = {10.2172/890608},
institution = {Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)},
number = {UCRL-TR-217808},
place = {United States},
year = {2005},
month = dec
}
-
This report summarizes our effort and results of building an integrated optimization environment to effectively combine the programmable control and the empirical tuning of source-to-source compiler optimizations within the framework of multiple existing languages, specifically C, C++, and Fortran. The environment contains two main components: the ROSE analysis engine, which is based on the ROSE C/C++/Fortran2003 source-to-source compiler developed by Co-PI Dr. Quinlan et al. at DOE/LLNL, and the POET transformation engine, which is based on an interpreted program transformation language developed by Dr. Yi at University of Texas at San Antonio (UTSA). The ROSE analysis engine performs advanced compiler analysis, ...
-
Design, tuning, and performance evaluation of an automated pulmonary nodule detection system. Technical report
Radiologists miss approximately 25-30% of all pulmonary nodules smaller than 1.0 cm in mass screenings. This paper describes a system for the automated detection of pulmonary nodules. It aids the radiologist by indicating the sites in the radiograph most likely to be nodules. Procedurally-driven image experts that respond to specific types of anatomic features are incorporated in a pattern recognizer which uses linear discriminant analysis to classify the candidate nodule sites. Sites not classified as nodules are eliminated from the list of sites presented to the radiologist for inspection. This system has been tested on 43 chest radiographs, and has ...
-
Automated frequency tuning of SRF cavities at CEBAF
An automated cavity tuning procedure has been implemented in the CEBAF control system to tune the superconducting RF (SRF) cavities to their operating frequency of 1497 MHz. The capture range for the coarse tuning algorithm (Burst Mode) is more than 20 cavity bandwidths (5 kHz). The fine tuning algorithm (Sweep Mode) calibrates the phase offset in the detuning angle measurement. This paper describes the implementation of these algorithms and experience of their operation in the CEBAF control system. 3 refs., 5 figs.