
- Petascale Parallelization of the Gyrokinetic Toroidal Code
- Message Passing Vs. Shared Address Space on a Cluster of SMPs. Hongzhang Shan, Jaswinder Pal Singh
- February 22, 2000. For submission to The 6th Workshop on Job Scheduling Strategies for Parallel Processing
- Design Strategies for Irregularly Adapting
- 0743-7315/02 $35.00 © 2002 Elsevier Science (USA)
- A Performance Evaluation of the Cray X1 for Scientific Applications
- A Comparison of Three Programming Models for Adaptive Applications on the Origin2000
- Message passing and shared address space parallelism on an SMP cluster
- Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors
- Silicon Nanophotonic Network-On-Chip Using TDM Arbitration
- Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
- Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4
- Communication Requirements and Interconnect Optimization for High-End Scientific Applications
- An Auto-Tuning Framework for Parallel Multicore Stencil Computations
- Parallel I/O Performance: From Events to Ensembles. Andrew Uselton, Mark Howison, Nicholas J. Wright, David Skinner
- Green Flash Project: The electrical power demands of ultrascale computers threaten to limit the future
- Auto-tuning the 27-point Stencil for Multicore. Kaushik Datta
- Computational Research Division. Multicore Optimization of
- Optimization of a Lattice Boltzmann Computation on State-of-the-Art Multicore Platforms
- Towards Ultra-High Resolution Models of Climate and Weather. To appear in the International Journal of High Performance Computing Applications, 2008.
- Investigation Of Leading HPC I/O Performance Using A Scientific-Application Derived Benchmark
- Reconfigurable Hybrid Interconnection for Static and Dynamic Scientific Applications
- Performance Characteristics of Potential Petascale Scientific Applications. Leonid Oliker, John Shalf, Jonathan Carter, Andrew Canning, Shoaib Kamil, Michael Lijewski
- Performance Evaluation of Scientific Applications
- Scientific Computing Kernels on the Cell Processor. Samuel Williams, John Shalf, Leonid Oliker
- Performance Characteristics of an Adaptive Mesh Refinement Calculation on Scalar and Vector Platforms
- Performance Evaluation of Lattice-Boltzmann Magnetohydrodynamics Simulations on Modern
- Leading Computational Methods on Scalar and Vector HEC Platforms
- Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect
- Integrated Performance Monitoring of a Cosmology Application on Leading HEC Platforms
- SIAM REVIEW, © 2002 Society for Industrial and Applied Mathematics, Vol. 44, No. 3, pp. 373–393
- Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications
- Implicit and Explicit Optimizations for Stencil Computations
- ESP: A System Utilization Benchmark. Adrian T. Wong, Leonid Oliker, William T. C. Kramer, Teresa L. Kaltz and David H. Bailey
- New Computational Methods for the Prediction and Analysis of Helicopter Acoustics
- Journal of the Earth Simulator, Volume 3, April 2005. Performance of Ultra-Scale Applications on
- Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines. Brian R. Gaeke, Parry Husbands, Xiaoye S. Li, Leonid Oliker
- Magnetohydrodynamic Turbulence Simulations on the Earth Simulator Using the Lattice Boltzmann
- Parallelization of a Dynamic Unstructured Algorithm using Three Leading Programming Paradigms
- Algorithms for Automatic Alignment of Arrays. Siddhartha Chatterjee, John R. Gilbert, Leonid Oliker, Robert Schreiber
- Ordering Sparse Matrices for Cache-Based Systems
- HPC Global File System Performance Analysis Using A Scientific-Application Derived Benchmark
- Parallel Dynamic Load Balancing Strategies for Adaptive Irregular Applications
- Optimizing and Tuning the Fast Multipole Method for State-of-the-Art Multicore Architectures
- Job Superscheduler Architecture and Performance in Computational Grid Environments
- Large-scale gyrokinetic particle simulation of
- Hardware/Software Co-design of Global Cloud System Resolving. Michael F. Wehner
- Jack Dongarra, David A. Bader, Jakub Kurzak Scientific Computing with
- Extracting Ultra-Scale Lattice Boltzmann Performance via Hierarchical and Distributed Auto-Tuning
- Performance Characterization for Fusion Co-design Applications
- Gyrokinetic Toroidal Simulations on Leading Multi- and Manycore HPC Systems
- Gyrokinetic Particle-in-Cell Optimization on Emerging Multi- and Manycore Platforms
- David H. Bailey and Robert F. Lucas, Editors Performance Tuning of
- Guest Editorial Emerging Programming Paradigms for Large-Scale Scientific Computing
- Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication
- Cosmic Microwave Background Map-Making at the Petascale and Beyond
- Hardware/Software Co-design for Energy-Efficient Seismic Modeling