OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Project Report on DOE Young Investigator Grant (Contract No. DE-FG02-02ER25525) Dynamic Scheduling and Fusion of Irregular Computation (August 15, 2002 to August 14, 2005)

Other ·
OSTI ID:929550

Computer simulation has become increasingly important in many scientific disciplines, but its performance and scalability are severely limited by the memory throughput of today's computer systems. With the support of this grant, we first designed training-based prediction, which accurately predicts the memory performance of large applications before their execution. Then we developed optimization techniques using dynamic computation fusion and large-scale data transformation. The research work has three major components. The first is modeling and prediction of cache behavior. We have developed a new technique that uses reuse distance information from training inputs to extract a parameterized model of the program's cache miss rates for any input size and for any size of fully associative cache. Using the model, we have built a web-based tool with three-dimensional visualization. The new model can help to build cost-effective computer systems, design better benchmark suites, and improve task scheduling on heterogeneous systems. The second component is global computation for improving cache performance. We have developed an algorithm for dynamic data partitioning using sampling theory and probability distribution. Recent work from a number of groups shows that manual or semi-manual computation fusion has significant benefits in physical, mechanical, and biological simulations as well as information retrieval and machine verification. We have developed an automatic tool that measures the potential of computation fusion. The new system can be used by high-performance application programmers to estimate the potential of locality improvement for a program before trying complex transformations for a specific cache system. The last component studies models of spatial locality and the problem of data layout. In scientific programs, most data are stored in arrays.
Grand challenge problems such as hydrodynamics simulation and data mining may use an enormous number of data elements. To optimize the layout across multiple arrays, we have developed a formal model called reference affinity. We collaborated with the IBM production compiler group and designed an efficient compiler analysis that performs as well as data or code profiling does. Based on these results, the IBM group has filed a patent and is including this technique in their product compiler. A major part of the project is the development of software tools. We have developed web-based visualization for program locality. In addition, we have implemented a prototype of array regrouping in the IBM compiler. The full implementation is expected to come out of IBM in the near future and to benefit scientific applications running on IBM supercomputers. We have also developed a test environment for studying the limit of computation fusion. Finally, our work has directly influenced the design of the Intel Itanium compiler. The project has strengthened the research relationship between the PI's group and groups in DoE labs. The PI was an invited speaker at the Center for Applied Scientific Computing Seminar Series at the early stage of the project. The question the audience was most curious about was the limit of computation fusion, which has been studied in depth in this research. In addition, the seminar directly helped a group at Lawrence Livermore to achieve a fourfold speedup on an important DoE code. The PI helped to organize a number of high-performance computing forums, including the founding of a workshop on memory system performance (MSP). In the past two years, one fourth of the papers in the workshop came from researchers at the Lawrence Livermore, Argonne, Los Alamos, and Lawrence Berkeley national laboratories. The PI lectured frequently on DoE-funded research.
In a broader context, high performance computing is central to America's scientific and economic stature in the world, and addresses many of the most scientifically and socially important problems of our day. This research has improved the programming support for a variety of computational paradigms, including dynamic mesh, hydrodynamics, molecular dynamics, multi-grid methods, matrix algebra, and sequential and parallel sorting. In the process, the PI's group has developed and strengthened relationships with DoE laboratories and major hardware and software vendors.
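
As an illustration of the reuse distance measure mentioned in the abstract, the following is a minimal Python sketch and not the project's tool. It assumes the common definition of reuse distance as the number of distinct addresses touched between two consecutive accesses to the same address, under which an access hits in a fully associative LRU cache of capacity C exactly when its reuse distance is below C. The function names `reuse_distances` and `miss_rate` and the sample trace are hypothetical, and the sketch only measures a single trace; it does not attempt the training-based prediction across input sizes that the report describes.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, List, Optional


def reuse_distances(trace: Iterable[Hashable]) -> List[Optional[int]]:
    """Return one reuse distance per access; None marks a first (cold) access."""
    stack: "OrderedDict[Hashable, bool]" = OrderedDict()  # least to most recently used
    distances: List[Optional[int]] = []
    for addr in trace:
        if addr in stack:
            keys = list(stack.keys())
            # Distinct addresses touched since the previous access to addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def miss_rate(distances: List[Optional[int]], cache_size: int) -> float:
    """Miss rate of a fully associative LRU cache holding cache_size addresses."""
    if not distances:
        return 0.0
    misses = sum(1 for d in distances if d is None or d >= cache_size)
    return misses / len(distances)


if __name__ == "__main__":
    trace = "abcab"  # hypothetical access trace, one letter per address
    dists = reuse_distances(trace)
    for size in (1, 2, 3):
        print(size, miss_rate(dists, size))
```

Because one pass over the trace yields the miss rate for every cache size at once, a histogram of these distances is a natural starting point for the kind of parameterized miss-rate model the abstract refers to. The linear scan per access keeps the sketch short; a tree-based structure would be needed for long traces.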

Research Organization:
Univ. of Rochester, NY (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
FG02-02ER25525
OSTI ID:
929550
Report Number(s):
DE/ER/25525- Final Report
Country of Publication:
United States
Language:
English