
- The PlayStation 3 for High Performance Scientific Computing Jakub Kurzak
- Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems
- Templates for Linear Algebra Problems Zhaojun Bai
- MPI Collective Algorithm Selection and Quadtree Encoding Jelena Pjesivac-Grbovic, Graham E. Fagg,
- Trace-based Performance Analysis for the Petascale
- Heterogeneous MPI Application Interoperation and Process Management under PVMPI
- Review of Performance Analysis Tools for MPI Parallel Programs
- Vita for Jack Dongarra April 1, 1999
- X. Parallel and Distributed Scientific A Numerical Linear Algebra Problem Solving Environment
- Self Adapting Numerical Software (SANS) Effort George Bosilca, Zizhong Chen, Jack Dongarra, Victor Eijkhout, Graham E. Fagg,
- THE DESIGN AND IMPLEMENTATION OF THE PARALLEL OUT-OF-CORE SCALAPACK LU, QR AND CHOLESKY
- Fully Dynamic Scheduler for Numerical Computing on Multicore Processors
- Exploiting Mixed Precision Floating Point Hardware in Scientific
- Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures
- Implementing Linear Algebra Routines on Multi-Core Processors
- Providing Access to High Performance Computing Technologies
- JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 1, 22-31 (1984) Multiprocessing Linear Algebra Algorithms on the
- HARNESS: Heterogeneous Adaptable Reconfigurable NEtworked SystemS
- CS-89-85 Performance of Various Computers Using Standard
- The LINPACK Benchmark: Past, Present, and Future Jack J. Dongarra, Piotr Luszczek, and Antoine Petitet
- A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures
- Evaluation of High-Performance Computing Software Shirley Browne, Jack Dongarra, Tom Rowan
- LAPACK Working Notes LAPACK Working Note #1
- Location-Independent Naming for Virtual Distributed Software Repositories
- Management of the NHSE - a Virtual Distributed Digital Library
- Algorithm-Based Diskless Checkpointing for Fault Tolerant Matrix Operations
- Application Runtime Support System (RSS) stop_application
- Network-Enabled Solvers and the NetSolve H. Casanova
- Changing Technologies of HPC Jack J. Dongarra
- ORNL/TM-12472 Engineering Physics and Mathematics Division
- Overview of PVM and MPI Jack Dongarra
- Performance Complexity of LU Factorization with Efficient Pipelining and Overlap on a Multiprocessor
- Dense Linear Algebra Solvers for Multicore with GPU Accelerators Stanimire Tomov, Rajib Nath, Hatem Ltaief, and Jack Dongarra
- LAPACK++ V. 1.0 High Performance Linear Algebra
- Algorithm-Based Diskless Checkpointing for Fault Tolerant Matrix Operations
- High-Performance Computing in Industry Erich Strohmaier,
- Standardized Numerical Linear Algebra Software Jack J. Dongarra
- LAPACK Working Note 95 ScaLAPACK: A Portable Linear Algebra Library for Distributed
- THE DESIGN AND IMPLEMENTATION OF THE PARALLEL OUT-OF-CORE SCALAPACK LU, QR AND CHOLESKY
- Scalable Networked Information Processing Environment (SNIPE) Graham Fagg
- Solving Linear Systems on Vector and Shared Memory Computers Jack J. Dongarra Iain S. Duff Danny C. Sorensen Henk A. Van der Vorst
- A Scalable Approach to MPI Application Performance Analysis
- TOP500 Supercomputer Sites Jack J. Dongarra
- NetSolve: A Network Server for Solving Computational Science Problems
- On the Convergence of Computational and Data Grids Dorian C. Arnold Sathish S. Vadhiyar Jack Dongarra
- Chapter in Wiley Encyclopedia of Electrical and Electronics Engineering
- ORNL/TM-12470 Engineering Physics and Mathematics Division
- The Design and Implementation of the Parallel Out-of-core ScaLAPACK LU, QR and Cholesky
- National HPCC Software Exchange Shirley Browne, Jack Dongarra, Stan Green,
- Message-Passing Performance of Various Computers Jack J. Dongarra, Tom Dunigan
- LAPACK FOR FORTRAN90 Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling
- Tools for Heterogeneous Network Computing Adam Beguelin, Jack Dongarra, Al Geist, Robert Manchek,
- Evolving Software Repositories http://www.netlib.org/utk/projects/esr/
- Algorithmic Redistribution Methods for Block Cyclic Decompositions
- ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance
- The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers
- The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance
- NetSolve's Network Enabled Server: Examples and Applications
- Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures
- An Effective Empirical Search Method for Automatic Software Tuning
- SOFTWARE LIBRARIES FOR LINEAR ALGEBRA COMPUTATIONS ON HIGH PERFORMANCE COMPUTERS
- Hash functions for datatype signatures in MPI Julien Langou, George Bosilca, Graham Fagg, and Jack Dongarra
- Automated Empirical Optimizations of Software and the ATLAS project
- Enhancing Parallelism of Tile QR Factorization for Multicore Architectures
- The LAPACK for Clusters Project: an Example of Self Adapting Numerical Software
- 161. Anthony Skjellum, Lawrence Livermore National Laboratory, 7000 East Ave., L-316, P.O. Box 808, Livermore, CA 94551
- and candor rarely encountered in a single work, the authors describe an evo-
- Using Mixed Precision for Sparse Matrix Computations to Enhance the
- Software Reuse in High Performance Computing Shirley Browne
- Using Agent-based Software for Scientific Computing
- Software Standards and Tools for Concurrent Computing
- Message-Passing Performance of Various Computers
- MARCH/APRIL 2005 Copublished by the IEEE CS and the AIP 1521-9615/05/$20.00 2005 IEEE 51 PERSPECTIVES IN COMPUTATIONAL SCIENCE
- ALGORITHM 589 SICEDR: A FORTRAN Subroutine
- Recent Trends in the Marketplace of High Performance Computing
- Access 02 Summer 2005 Access 03 Summer 2005
- IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS 1 Solving Systems of Linear Equations on the CELL
- Interior state computation of nano structures Andrew Canning
- Scheduling Block-Cyclic Array Redistribution Frédéric Desprez, Jack Dongarra, Antoine Petitet, Cyril Randriamaro and Yves Robert
- Dense Linear Algebra Solvers for Multicore with GPU Accelerators Stanimire Tomov
- Reliability Analysis of Self-Healing Network using Discrete-Event Simulation Thara Angskun
- This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research
- Decision Trees and MPI Collective Algorithm Selection Problem
- Level-3 Cholesky Kernel Subroutine of a Fully Portable High Performance Minimal Storage Hybrid
- A COMPARISON OF PARALLEL SOLVERS FOR DIAGONALLY DOMINANT AND GENERAL NARROW-BANDED LINEAR
- High Performance Computing in the U.S. in An Analysis on the Basis of the TOP500 List
- [1] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. DuCroz, A. Greenbaum, S. Ham
- Numerical Linear Algebra Algorithms and Software Jack J. Dongarra
- The Impact of RISC and Parallel RISC Systems
- Advanced Architecture Computers* Jack J. Dongarra and Iain S. Duff
- Constructing Resilient Communication Infrastructure for Runtime Environments
- Changing Technologies of HPC Jack J. Dongarra
- Message-Passing Performance of Various Computers Jack J. Dongarra
- Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications
- SUPERCOMPUTING '96 ScaLAPACK: A Portable Linear Algebra Library
- Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing System
- Netlib, NHSE and other Sources Jack Dongarra
- [1] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK: A portable linear algebra library for high-performance
- Algorithmic Issues on Heterogeneous Computing Platforms Pierre Boulet, Jack Dongarra, Fabrice Rastello, Yves Robert and Frédéric Vivien
- Network-Enabled Solvers and the NetSolve H. Casanova
- Fault Tolerant Matrix Operations for Networks of Workstations Using Multiple Checkpointing
- MPI Collective Algorithm Selection and Quadtree Encoding
- Improving Time to Solution with Automated Performance Analysis Shirley Moore, Felix Wolf, and Jack Dongarra
- Performance Analysis of MPI Collective Operations Jelena Pjesivac-Grbovic, Thara Angskun, George Bosilca,
- Implementation in ScaLAPACK of Divide-and-Conquer Algorithms for Banded and Tridiagonal Linear Systems
- Performance Instrumentation and Measurement for Terascale Systems
- A Sparse Matrix Library in C++ for High Performance Architectures
- IML++ v. 1.2 Iterative Methods Library
- Methodology, Relations and Results
- 161. Anthony Skjellum, Lawrence Livermore National Laboratory, 7000 East Ave., L-316, P.O. Box 808, Livermore, CA 94551
- A Rough Guide to Scientific Computing On the PlayStation 3 Technical Report UT-CS-07-595
- Numerical Libraries And The Grid Antoine Petitet
- Self Adapting Linear Algebra Algorithms and , Jack Dongarra
- Disaster Survival Guide in Petascale Computing: An Algorithmic Approach
- ScaLAPACK: A Linear Algebra Library for Message-Passing Computers
- A Test Suite for PVM Henri Casanova (casanova@cs.utk.edu)
- LAPACK Working Note 43 A Look at Scalable Dense Linear Algebra Libraries
- [32] D. Sorensen and P. Tang. On the orthogonality of eigenvectors computed by divide-and-conquer techniques. Mathematics and Computer Science Division MCS-P152
- National HPCC Software Exchange Shirley Browne, Jack Dongarra, Stan Green,
- Determining the Idle Time of a Tiling: New Results Frédéric Desprez, Jack Dongarra, Fabrice Rastello and Yves Robert
- [11] R. van de Geijn, On global combine operations, LAPACK Working Note 29, Technical Report CS-91-129, University of Tennessee, 1991.
- High Performance Computing Technologies Jack Dongarra
- Key Concepts For Parallel Out-Of-Core LU Factorization
- The Generalized QR Decomposition 22 [1] J. L. Barlow, N. K. Nichols, and R. J. Plemmons, Iterative methods for equality con
- Problem Solving Environments for Parallel Scientific Computation
- TOP500 Supercomputer Sites 13th Edition
- Optimal Routing in Binomial Graph Networks Thara Angskun 1
- Building and using a Fault Tolerant MPI implementation Graham E. Fagg
- Taskers and General Resource Managers: PVM supporting DCE Process Management
- Static Tiling for Heterogeneous Computing Platforms Pierre Boulet, Jack Dongarra, Yves Robert and Frédéric Vivien
- Document for a Standard Message-Passing Interface Message Passing Interface Forum
- Overview of ScaLAPACK Jack Dongarra
- [4] J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson. An extended set of Fortran basic linear algebra subprograms. ACM Transactions on Mathematical Software, 14(1):1-17,
- Developing numerical libraries in Java RONALD F. BOISVERT, JACK J. DONGARRA, ROLDAN POZO,
- An Introduction to the MPI Standard Jack J. Dongarra
- Implementation in ScaLAPACK of Divide-and-Conquer Algorithms for Banded and Tridiagonal Linear Systems
- Parallel Tiled QR Factorization for Multicore Architectures
- The Dangers of Heterogeneous Network Computing: Heterogeneous Networks Considered Harmful
- CHAPTER FOURTEEN How Elegant Code Evolves with
- Adaptive Scheduling for Task Farming with Grid Middleware
- JLAPACK - Compiling LAPACK Fortran to Java, Phase 1 Technical Report cs-97-367
- An Object Oriented Design for High Performance Linear Algebra on Distributed Memory Architectures
- LAPACK Working Note 41 Installation Guide for LAPACK 1
- Chebyshev tau-QZ Algorithm Methods for Calculating Spectra
- Evaluating Block Algorithm Variants in LAPACK. This work was supported by the National Science Foundation
- Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators
- An Updated Set of Basic Linear Algebra Subprograms (BLAS)
- One petaflop per second is a rate of computation corresponding to 10^15
- Automatic Optimisation of Parallel Linear Algebra Routines in Systems with Variable Load
- Available online at www.sciencedirect.com Procedia Computer Science 00 (2009) 000-000
- From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
- Current software releases and reports regarding HeNCE can be obtained from netlib by sending electronic mail to netlib@ornl.gov containing the line "send index from hence". Instructions on how to receive the
- ScaLAPACK Tutorial Jack Dongarra and L. Susan Blackford
- Integrated PVM Framework Supports Heterogeneous Network Computing
- Jack Dongarra Pete Beckman
- Accelerating TIME-TO-SOLUTION for Computational
- Rectangular Full Packed Format for Cholesky's Algorithm: Factorization,
- Scheduling Two-sided Transformations using Tile Algorithms on Multicore Architectures
- QR Factorization for the CELL Processor LAPACK Working Note 201
- Trace-based Performance Analysis for the Petascale Simulation Code FLASH
- Elsevier Editorial System(tm) for Future Generation Computer Systems Manuscript Draft
- Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution and Inversion
- Cloud Computing and Software Services: Theory and Technique April 30, 2009
- Cloud Service Reliability: Modeling and Analysis Yuan-Shun Dai
- Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures
- The Problem with the Linpack Benchmark Matrix Generator June 28, 2008
- Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems
- Dynamic Task Scheduling for Linear Algebra Algorithms on Distributed-Memory Multicore Systems
- A Note on Auto-tuning GEMM for GPUs, Jack Dongarra
- Scheduling Linear Algebra Operations on Multicore Processors LAPACK Working Note 213
- Scheduling Two-sided Transformations using Algorithms-by-Tiles on Multicore Architectures
- Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware
- Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing
- Accelerating Scientific Computations with Mixed Precision Algorithms
- Algorithmic Based Fault Tolerance Applied to High Performance Computing May 23, 2008
- Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures
- Request Sequencing: Enabling Workflow for Efficient Problem Solving in , Jack Dongarra
- CTWatch Quarterly Creating Software Tools and Libraries for Leaders... http://www.ctwatch.org/quarterly/print.php?p=92
- Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the Synergistic Processing Element of the CELL Processor
- Matrix Product on Heterogeneous Master-Worker Platforms Jack Dongarra
- COMPUTING THE CONDITIONING OF THE COMPONENTS OF A LINEAR LEAST SQUARES SOLUTION
- KEN KENNEDY 51 SCIDAC REVIEW FALL 2007 WWW.SCIDACREVIEW.ORG
- A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
- Self Adaptive Application Level Fault Tolerance for Parallel and Distributed Computing
- Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems
- International Journal of Foundations of Computer Science © World Scientific Publishing Company
- High Performance Development for High End Computing with Python Language Wrapper (PLW)
- This article was originally published in a journal published by Elsevier, and the attached copy is provided by Elsevier for the
- Int. J. Computational Science and Engineering, Vol. 2, Nos. 3/4, 2006 205 Conjugate-gradient eigenvalue solvers in
- Parallel Processing Letters © World Scientific Publishing Company
- Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization LAPACK Working Note 184
- O'Reilly Media, Inc. 4/17/2007 How Elegant Code Evolves with
- The Impact of Multicore on Math Software Alfredo Buttari
- Parallel Linear Algebra Victor Eijkhout, Julien Langou, and Jack Dongarra
- University of California/Davis University of California/Berkeley
- Summary of Software for Linear Algebra
- Twenty-Plus Years of Netlib and NA-Net Jack Dongarra, Gene Golub, Eric Grosse, Cleve Moler, Keith Moore
- Biological Sequence Alignment On The Computational Grid Using The Grads Framework
- NanoPSE: Nanoscience Problem Solving Environment for atomistic electronic structure of
- Performance Analysis of MPI Collective Operations Jelena Pjesivac-Grbovic, Thara Angskun, George Bosilca,
- An Asynchronous Algorithm on NetSolve Global Computing System
- Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of
- NetSolve: Grid Enabling Scientific Computing Environments
- Uncorrected Proof DOI: 10.1007/s10766-005-3584-4
- Recovery Patterns for Iterative Methods in a Parallel Unstable Environment
- Automatic Experimental Analysis of Communication Patterns in Virtual Topologies
- Introduction to the HPCChallenge Benchmark Suite Jack J. Dongarra Piotr Luszczek
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2004; 00:126 Prepared using cpeauth.cls [Version: 2002/09/19
- Active Logistical State Management in GridSolve/L Micah Beck, Jack Dongarra, Jian Huang, Terry Moore and James S. Plank
- An Algebra for Cross-Experiment Performance Analysis Fengguang Song, Felix Wolf, Nikhil Bhatia, Jack Dongarra, and Shirley Moore
- AN OVERVIEW OF HETEROGENEOUS HIGH PERFORMANCE AND GRID COMPUTING
- Efficient Pattern Search in Large Traces through Successive Refinement
- Edgar Gabriel, Graham E. Fagg, and Jack J. Dongarra Innovative Computing Laboratory, Computer Science Department,
- A Fault-Tolerant Communication Library for Grid Environments
- Applying Aspect-Orient Programming Concepts to a Component-based Programming Model
- Energy Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using Genetic Crossover
- Fault Tolerant Communication Library and Applications for High Performance Graham E. Fagg, Edgar Gabriel, Zizhong Chen,
- GrADSolve - a Grid-based RPC system for Parallel Computing with Application-Level
- NetSolve: past, present, and future; a look at a Grid enabled server
- Optimization Problem Solving System using Grid RPC Tomoyuki HIROYASU Mitsunori MIKI Hisashi SHIMOSAKA
- SCALABLE, TRUSTWORTHY NETWORK COMPUTING USING UNTRUSTED INTERMEDIARIES
- SCHEDULING IN THE GRID APPLICATION DEVELOPMENT SOFTWARE PROJECT
- Self-adapting Numerical Software and Automatic Tuning of Jack Dongarra, Victor Eijkhout
- Hamparsum Bozdogan Statistical Data Mining, and
- [Figure residue: horizontal axis "Vector dimension", ticks 0 to 1000]
- A Metascheduler For The Grid Sathish S. Vadhiyar and Jack J. Dongarra
- A Performance Oriented Migration Framework For The Grid Sathish S. Vadhiyar and Jack J. Dongarra
- Network File System Sun's NFS (RPC/UDP)
- NetBuild: Transparent Cross-Platform Access to Computational Software Libraries Keith Moore
- Lecture Notes in Computer Science 1 Parallel IO support for Meta-Computing Applications
- Performance Modeling for Self Adapting Collective Communications for MPI
- The GrADS Project: Software Support for High-Level Grid Application Development
- Using PAPI for hardware performance monitoring on Linux Jack Dongarra, Kevin London, Shirley Moore, Phil Mucci, and Dan Terpstra
- A Scalable Cross-Platform Infrastructure for Application Performance Tuning Using Hardware Counters
- Automatically Tuned Collective Communications Sathish S. Vadhiyar
- Lecture Notes in Computer Science 1 FT-MPI: Fault Tolerant MPI, supporting dynamic
- Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries
- JLAPACK - Compiling LAPACK Fortran to Java David M. Doolin
- This paper describes an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units.
- Determining the Idle Time of a Tiling: New Results Frédéric Desprez, Jack Dongarra, Fabrice Rastello and Yves Robert
- Using Agent-based Software for Scientific Computing
- Case Studies on The Development of ScaLAPACK and the NAG
- Java Access to Numerical Libraries Henri Casanova
- JLAPACK - Compiling LAPACK Fortran to Java, Phase 1 Technical Report cs-97-367
- Key Concepts For Parallel Out-Of-Core LU Factorization
- A Test Matrix Collection for Non-Hermitian Eigenvalue Problems
- Matrix Market : A Web Resource for Test Matrix Collections
- Practical Experience in the Dangers of Heterogeneous L. S. Blackford, A. Cleary, J. Demmel, I. Dhillon, J. Dongarra,
- The use of Java in the NetSolve project
- Algorithmic Bombardment for the Iterative Solution of Linear Systems
- Future Linear Algebra Libraries Jack Dongarra
- PVMPI: An Integration of the PVM and MPI Systems Graham E. Fagg
- ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance
- A Highly Parallel Algorithm for the Reduction of a Nonsymmetric Matrix to Block Upper-Hessenberg Form
- Digital Software and Data Repositories for Support of Scientific Computing
- Toward a Proposal for a set of Parallel Basic Linear Algebra Subprograms
- Location-Independent Naming for Virtual Distributed Software Repositories
- Overview of VPE: A Visual Environment for Message-Passing Peter Newton
- A Sparse Matrix Library in C++ for High Performance Architectures
- SIAM J. SCI. COMPUT. Vol. 14, No. 3, pp. 542-569, May 1993
- Tools for Heterogeneous Network Computing Adam Beguelin
- Problem Solving Environments for Parallel Scientific Computation
- Evolving Software Repositories http://www.netlib.org/utk/projects/esr/
- A Framework For Migrating Applications Under Changing Load Conditions In The Grid
- Reliability and Performance Models for Grid Computing Yuan-Shun Dai
- Performance Analysis of MPI Collective Jelena Pjesivac-Grbovic
- Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures
- An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs
- Enabling Workflows in GridSolve: Request Sequencing and Service Trading
- ITW2004, San Antonio, Texas, October 24-29, 2004 Numerically Stable Real-Number Codes Based on Random Matrices
- A Comparison of Search Heuristics for Empirical Code Optimization
- ScaLAPACK Tutorial Jack Dongarra and L. Susan Blackford
- Practical Experience in the Dangers of Heterogeneous L. S. Blackford, A. Cleary, J. Demmel, I. Dhillon, J. Dongarra,
- MPI: A Message-Passing Interface Standard Message Passing Interface Forum
- Sparse Matrix Libraries in C++ for High Performance Architectures
- Overview of Templates Jack Dongarra
- The Performance of PVM on MPP Systems
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. (2010)
- Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures
- LAPACK Working Note #224 QR Factorization of Tall and Skinny Matrices in a
- More on Scheduling Block-Cyclic Array Redistribution Frédéric Desprez, Stéphane Domas, Jack Dongarra,
- MPI WALL Determine elapsed wallclock time in seconds. double precision function MPI WALL ( )
- Providing Infrastructure and Interface to High-Performance Applications in a Distributed Setting. (Extended Abstract)
- A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem J.J. Dongarra and D. C. Sorensen
- A Test Matrix Collection for Non-Hermitian Eigenvalue Problems
- Values of N Subroutine 100 200 300 400 500
- Chebyshev tau-QZ Algorithm Methods for Calculating Spectra
- Optimization Problem Solving System using GridRPC
- ATLAS on the BlueGene/L Preliminary Results Keith Seymour Haihang You Jack Dongarra
- Extending the MPI Specification for Process Fault Tolerance on High Performance Computing Systems
- Reliability and Performance Modeling and Analysis for Grid Computing
- Netlib and NA-Net: building a scientific computing community Jack Dongarra, Gene Golub, Eric Grosse, Cleve Moler, Keith Moore
- Parallel Computing 3 (1986) 25-34 25 North-Holland
- EXPERIMENTS WITH STRASSEN'S ALGORITHM: FROM SEQUENTIAL TO PARALLEL
- Static tiling for heterogeneous computing platforms
- Deploying Fault-tolerance and Task Migration with NetSolve James S. Plank
- Algorithm-Based Checkpoint-Free Fault Tolerance for Parallel Matrix Computations on Volatile Resources
- Developing numerical libraries in Java RONALD F. BOISVERT, JACK J. DONGARRA, ROLDAN POZO,
- Parallelizing the Divide and Conquer Algorithm
- 144. Andrew B. White, Los Alamos National Laboratory, P. O. Box 1663, MS265, Los Alamos, 145. David L. Williamson, National Center for Atmospheric Research, P. O. Box 3000, Boulder,
- Java Access to Numerical Libraries Henri Casanova, Jack Dongarra, David M. Doolin
- Possibilities for Active Messaging in PVM Philip J. Mucci
- Interactive and Dynamic Content in Software Repositories
- Scanning the Issue Special Issue on Program Generation, Optimization, and
- [10] H. J. Symm and J. H. Wilkinson. Realistic error bounds for a simple eigenvalue and its associated eigenvector. Num. Math., 35:113--, 1980.
- Fault Tolerant MPI for the HARNESS MetaComputing system
- Key Concepts For Parallel Out-Of-Core LU Factorization
- PVMPI: An Integration of the PVM and MPI Systems Graham E. Fagg, Jack J. Dongarra
- CTWatch Quarterly Enabling Advanced Scientific Computing Software http://www.ctwatch.org/quarterly/print.php?p=91
- ST-HEC: Reliable and Scalable Software for Linear Algebra Computations on High End James Demmel (U California, Berkeley) and Jack Dongarra (U Tennessee, Knoxville)
- Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication
- UT-CS-01-460 April 28, 2001
- Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing
- Integrated Tool Capabilities for Performance Instrumentation and Measurement
- Overview of recent supercomputers Aad J. van der Steen
- Revisiting Matrix Product on Master-Worker Platforms Jack Dongarra2
- CTWatch Quarterly The Impact of Multicore on Computational Scienc... http://www.ctwatch.org/quarterly/print.php?p=67
- Overview of VPE: A Visual Environment for Message-Passing Peter Newton
- This paper describes an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units.
- [Figure residue: horizontal axis "Order of vectors/matrices", ticks 0 to 600]
- UNCORRECTED PROOF 2 HARNESS and fault tolerant MPI
- Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale Shared Memory
- Automatic search for patterns of inefficient
- Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing
- visPerf: Monitoring Tool for Grid Computing DongWoo Lee1
- Another Architecture: PVM on Windows 95/NT Markus Fischer
- DOI: 10.1007/s10766-005-3577-3 International Journal of Parallel Programming, Vol. 33, No. 2, June 2005 (© 2005)
- Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology
- Analysis of Preparation
- Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters
- UsersApplications Fault Tolerance
- A PARALLEL IMPLEMENTATION OF THE NONSYMMETRIC QR ALGORITHM FOR DISTRIBUTED MEMORY ARCHITECTURES
- from SIAM News, Volume 32, Number 6 Atlanta Organizers Put Mathematics to Work
- Feedback-Directed Thread Scheduling with Memory Considerations
- SIAM J. SCI. STAT. COMPUT. Vol. 7, No. 1, January 1986
- Automatic Blocking of Nested Loops Robert Schreiber
- Design of Interactive Environment for Numerically Intensive Parallel Linear Algebra Calculations
- HARNESS: A Next Generation Distributed Virtual Machine
- Sparse Matrix Libraries in C++ for High Performance Architectures
- LAPACK Working Note #16 Results from the Initial Release of LAPACK
- Last name: Rosener First name: Bill
- subroutine prec3s( x, px, a2, a3, a4, nx, ny, nz, rkr ) implicit double precision (a-h,o-z)
- GridSolve: The Evolution of A Network Enabled Solver
- Scalable Fault Tolerant Protocol for Parallel Runtime Environments
- EXPERIENCES WITH WINDOWS 95/NT AS A CLUSTER COMPUTING PLATFORM FOR PARALLEL COMPUTING.
- Flexible collective communication tuning architecture applied to Open MPI
- Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture CELL
- HIGH PERFORMANCE COMPUTING TODAY Jack Dongarra
- LAPACK Working Note 139 A Numerical Linear Algebra Problem Solving Environment
- An Asynchronous Algorithm on NetSolve Global Computing System
- ORNL/TM-2004/13 Cray X1 Evaluation Status Report
- End-user Tools for Application Performance Analysis Using Hardware Counters
- Overview of Templates Jack Dongarra
- Message-Passing Performance of Various Computers
- Another Architecture: PVM on Windows 95/NT Markus Fischer, Jack Dongarra
- LAPACK Working Notes: LAPACK Working Note #1: J. Demmel and J. Dongarra and J. Du Croz and A. Greenbaum
- Interactive and Dynamic Content in Software Repositories
- Users' Guide Vincent A. Barker, L. Susan Blackford
- Matrix Market : A Web Resource for Test Matrix Collections
- Benchmarking Performance
- ORNL/TM-12470 Engineering Physics and Mathematics Division
- LAPACK Working Note 58 The Design of Linear Algebra Libraries for High Performance Computers
- ORNL/TM-12309 Engineering Physics and Mathematics Division
- Management of the NHSE: a Virtual Distributed Digital Library
- An Introduction to the MPI Standard Jack J. Dongarra
- Revisiting Matrix Product on Master-Worker Platforms Jack Dongarra2
- Numerical Libraries and Tools for Scalable Parallel Cluster Computing Shirley Browne, Jack Dongarra, and Anne Trefethen*
- CS-89-85 Performance of Various Computers Using Standard
- Packages of Subroutines
- Optimizing Performance and Reliability in Distributed Computing Systems Through Wide Spectrum Storage
- Middleware for the Use of Storage in Communication , Dorian Arnold
- Toward a Proposal for a set of Parallel Basic Linear Algebra Subprograms
- Table 4: Timings for Wilkinson Shift and Perfect Shift Algorithms n = 100 n = 200
- [1] E. Anderson and J. Dongarra, LAPACK Working Note 16: Results from the Initial Release of LAPACK, University of Tennessee, CS-89-89, November 1989.
- Stochastic Performance Prediction for Iterative Algorithms in Distributed Environments
- Scheduling Block-Cyclic Array Redistribution Frédéric Desprez, Jack Dongarra, Antoine Petitet, Cyril Randriamaro and Yves Robert
- Chapter in Wiley Encyclopedia of Electrical and Electronics Engineering
- [13] T.Y. Li, Zhonggang Zeng, and Luan Cong. Solving eigenvalue problems of real nonsymmetric matrices with real homotopies. Preprint, Michigan State University, E. Lansing, MI
- Taskers and General Resource Managers : PVM supporting DCE Process Management
- The Marketplace of High Performance Erich Strohmaier, Jack J. Dongarra, Hans W. Meuer
- Scalable Fault Tolerant MPI: Extending the recovery algorithm
- Toward a Framework for Preparing and Executing Adaptive Grid Programs
- ScaLAPACK: A Linear Algebra Library for Message-Passing L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra
- DOI: 10.1007/s10766-005-3584-4 International Journal of Parallel Programming, Vol. 33, Nos. 2/3, June 2005 (© 2005)
- Overview of HPC Jack Dongarra
- [9] T.H. Dunigan. Performance of the Intel iPSC/860 and NCUBE 6400 hypercubes. Technical Report ORNL/TM-11790, Oak Ridge National Laboratory, Oak Ridge, Tennessee, 1991.
- ARGONNE NATIONAL LABORATORY 9700 South Cass Avenue
- ARGONNE NATIONAL LABORATORY 9700 South Cass Avenue
- LAPACK Working Note ? LAPACK Block Factorization Algorithms
- Implementation of a Mixed-Precision in Solving Systems of Linear Equations
- Self-adapting Numerical Software for Next Generation Applications
- DARPA's HPCS Program: History, Models, Tools, Languages Jack Dongarra, University of Tennessee and Oak Ridge National Lab
- High-Performance Computing in Industry Erich Strohmaier
- HARNESS Fault Tolerant MPI design, usage and performance issues
- Algorithmic Redistribution Methods for Block Cyclic Decompositions
- Review of Performance Analysis Tools for MPI Parallel Programs Shirley Moore, David Cronk, Kevin London, and Jack Dongarra
- SELF-HEALING NETWORK FOR SCALABLE FAULT TOLERANT
- Evaluation of High-Performance Computing Software Shirley Browne, Jack Dongarra, Tom Rowan
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2008; 20:1573-1590
- National HPCC Software Exchange Shirley Browne, Jack Dongarra, Stan Green,
- Parallelizing the Divide and Conquer Algorithm
- Prospectus for the Next LAPACK and ScaLAPACK Libraries
- The use of Java in the NetSolve project
- Tiling on Systems with Communication/Computation Overlap
- integer mynum, hostnum, bytes, msgtype, ... double precision result, data(100), ...
- Scalable Networked Information Processing Environment (SNIPE)
- Automatic Blocking of Nested Loops Robert Schreiber
- Future Linear Algebra Libraries Jack Dongarra
- Distributed Probabilistic Model-Building Genetic Algorithm
- Creating Software Technology to Harness the Power of Leadership-class Computing Systems
- Self-Healing in Binomial Graph Networks Thara Angskun
- Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining
- Self-adapting software for numerical linear algebra and LAPACK for clusters
- Future Generation Computer Systems 15 (1999) 595-605 Scalable networked information processing environment (SNIPE)
- The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot
- Noname manuscript No. (will be inserted by the editor)
- SIAM J. MATRIX ANAL. APPL. © 2005 Society for Industrial and Applied Mathematics Vol. 27, No. 3, pp. 603-620
- Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures
- Applying Netsolve's Network-Enabled Server
- An iterative solver benchmark Jack Dongarra, Victor Eijkhout and
- CONCURRENCY: PRACTICE AND EXPERIENCE Concurrency: Pract. Exper. 2000; 12:1481-1493
- The First Word. Copublished by the IEEE CS and the AIP 1521-9615/08/$25.00 © 2008 IEEE Computing in Science & Engineering
- Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization
- Future Generation Computer Systems 22 (2006) 279-290 An asynchronous algorithm on the NetSolve global
- http://hpc.sagepub.com/ Computing Applications
- Recent trends in the marketplace of high performance computing
- Scientific Programming 18 (2010) 35-50 DOI 10.3233/SPR-2010-0297
- HARNESS and fault tolerant MPI Graham E. Fagg *, Antonin Bukovsky, Jack J. Dongarra
- TOP500 Supercomputer Sites 10th Edition
- SIAM J. Sci. Stat. Comput. Vol. 4, No. 4, December 1983
- Future Generation Computer Systems 22 (2006) 665-675 www.elsevier.com/locate/fgcs
- Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing
- The TOP500 and Computational Science A not-so-simple matter of software
- NanoPSE: Nanoscience Problem Solving Environment for atomistic electronic structure of semiconductor nanostructures
- A Proposal for an Extended Set of Fortran Basic Linear Algebra Subprograms
- PERFORMANCE STUDY OF LU FACTORIZATION WITH LOW COMMUNICATION OVERHEAD ON MULTIPROCESSORS
- Future Generation Computer Systems 18 (2002) 1127-1142 HARNESS fault tolerant MPI design, usage
- Optimizing Symmetric Dense Matrix-Vector Multiplication Computer Science and
- Future Generation Computer Systems 15 (1999) 745-755 Deploying fault tolerance and task migration with NetSolve
- CONCURRENCY: PRACTICE AND EXPERIENCE Concurrency: Pract. Exper., Vol. 11(3), 139-153 (1999)
- ELSEVIER Parallel Computing 23 (1997) 49-70 Key concepts for parallel out-of-core LU
- A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue
- Parallel Computing 1 (1984) 223-235 North-Holland
- An Updated Set of Basic Linear Algebra Subprograms (BLAS)
- The Eigenvalue Problem for Hermitian Matrices with Time Reversal Symmetry
- Future Generation Computer Systems 27 (2011) 357-369 Contents lists available at ScienceDirect
- CONCURRENCY: PRACTICE AND EXPERIENCE, VOL. 9(10), 915-926 (OCTOBER 1997) Message-passing performance of various
- Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures
- CONCURRENCY: PRACTICE AND EXPERIENCE Concurrency: Pract. Exper., Vol. 10(11-13), 1117-1129 (1998)
- Future Generation Computer Systems 21 (2005) 980-986 Biological sequence alignment on the computational
- Jack Dongarra University of Tennessee, Knoxville
- Automated empirical optimizations of software and the ATLAS project
- Future Generation Computer Systems 15 (1999) 571-582 HARNESS: a next generation distributed virtual machine
- Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance
- ELSEVIER Parallel Computing 21 (1995) 1189-1211 A parallel algorithm for the reduction of a nonsymmetric
- Practical Experience in the Numerical Dangers of Heterogeneous Computing
- One petaflop per second is a rate of computation corresponding to 10^15
- Journal of Parallel and Distributed Computing 58, 68-91 (1999) Stochastic Performance Prediction for Iterative
- An Improved Magma Gemm For Fermi Graphics Processing Units
- DOI: 10.1007/s10766-005-3577-3 International Journal of Parallel Programming, Vol. 33, Nos. 2/3, June 2005 (© 2005)
- SIAM REVIEW Vol. 26, No. 1, January, 1984
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2003; 15:803-820 (DOI: 10.1002/cpe.728)
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2010; 22:2467-2487
- The marketplace of high-performance computing Erich Strohmaier, Jack J. Dongarra
- A COMPARISON OF PARALLEL SOLVERS FOR DIAGONALLY DOMINANT AND GENERAL NARROW-BANDED LINEAR
- from SIAM News, Volume 34, Number 9 Biannual Top-500 Computer Lists Track Changing Environments
- Algorithm-Based Fault Tolerance for Fail-Stop Failures
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2007; 19:1371-1385
- TOP500 Supercomputer Sites Jack J. Dongarra
- THE SPECTRAL DECOMPOSITION OF NONSYMMETRIC MATRICES ON DISTRIBUTED MEMORY PARALLEL COMPUTERS
- Journal of Computational and Applied Mathematics 123 (2000) 489-514 www.elsevier.nl/locate/cam
- JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 43, 125-138 (1997) ARTICLE NO. PC971336
- Cluster Comput (2009) 12: 101-122 DOI 10.1007/s10586-009-0080-4
- Network-enabled Solvers: A Step Toward Grid-based December 13, 2001
- SOFTWARE-PRACTICE AND EXPERIENCE, VOL. 9, 219-226 (1979) Unrolling Loops in FORTRAN*
- Netlib and NA-Net: Building A Scientific Computing Community Jack Dongarra
- Middleware for the use of storage in communication
- D-Lib Magazine ISSN 1082-9873
- Squeezing the Most out of an Algorithm in CRAY FORTRAN
- When we try to assess how much progress we have made in computational modeling and simulation, recalling some history
- Scientific Programming 17 (2009) 31-42 DOI 10.3233/SPR-2009-0268
- The Netlib Mathematical Software Repository
- Parallel Processing Letters, Vol. 17, No. 1 (2007) 47-59
- ELSEVIER Parallel Computing 21(1995) 1387-1405 Parallel matrix transpose algorithms
- Optimizing matrix multiplication for a short-vector SIMD architecture CELL processor
- Parallel Computing 1 (1984) 133-142 North-Holland
- 192 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 9, NO. 2, FEBRUARY 1998 Scheduling Block-Cyclic Array Redistribution
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2010; 22:2196-2211
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2003; 15:207-222 (DOI: 10.1002/cpe.657)
- Software Distribution Using Xnetlib JACK DONGARRA
- Review: [untitled] Author(s): Jack Dongarra
- COMPUTATIONAL AND APPLIED MATHEMATICS
- Supporting Heterogeneous Network Computing: Jack J. Dongarra
- JOURNAL OF COMPUTATIONAL PHYSICS 54, 278-288 (1984) Solving the Secular Equation Including Spin Orbit Coupling
- 84 July 1996/Vol. 39, No. 7 COMMUNICATIONS OF THE ACM The Message Passing Interface
- Self-Adapting Linear Algebra Algorithms and Software
- Using agent-based software for scientific computing in the NetSolve system
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2010; 22:15-44
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2007; 19:1481-1496
- Algorithmic Redistribution Methods for Block-Cyclic Decompositions
- Future Generation Computer Systems 26 (2010) 479-485 Contents lists available at ScienceDirect
- Squeezing the Most out of Eigenvalue Solvers on High-Performance Computers
- Numerical Algorithms 10 (1995) 379-399 of a parallel dense linear algebra software
- Journal of Parallel and Distributed Computing 61, 1803-1826 (2001) Telescoping Languages: A Strategy for Automatic
- CONCURRENCY: PRACTICE AND EXPERIENCE, VOL. 9(11), 1279-1291 (NOVEMBER 1997) Java access to numerical libraries
- ELSEVIER Future Generation Computer Systems 12 (1997) 461-474 Changing technologies of HPC
- State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems
- Editor: Michael A. Gray, gray@american.edu 84 Copublished by the IEEE CS and the AIP 1521-9615/08/$25.00 © 2008 IEEE Computing in Science & Engineering
- Implementation of the HPC Challenge
- Towards dense linear algebra for hybrid GPU accelerated manycore systems Stanimire Tomov, Jack Dongarra
- MPI: A Message-Passing Interface Standard Message Passing Interface Forum
- A Comparison of Parallel Solvers for Diagonally Dominant and General Narrow-Banded Linear
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2002; 14:1457-1479 (DOI: 10.1002/cpe.678)
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2002; 14:1445-1456 (DOI: 10.1002/cpe.670)
- Exploiting Fine-Grain Parallelism in Recursive LU Factorization
- Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing
- Fault Tolerant Communication Library and Applications for High Performance Graham E. Fagg, Edgar Gabriel, Zizhong Chen,
- CONCURRENCY: PRACTICE AND EXPERIENCE, VOL. 8(7), 517-535 (SEPTEMBER 1996) PB-BLAS: A set of parallel block basic linear
- JLAPACK compiling LAPACK FORTRAN David M. Doolin, Jack Dongarra
- Jack Dongarra A Historical Overview and
- INTERACTIVE GRID-ACCESS USING GRIDSOLVE AND K. Seymour
- Logistical quality of service in NetSolve, H. Casanova
- SIAM J. NUMER. ANAL. Vol. 20, No. 1, February 1983
- NUMERICAL MATHEMATICS
- Recursive approach in sparse matrix LU factorization
- Parallel Processing Letters, Vol. 11, Nos. 2 & 3 (2001) 187-202 World Scientific Publishing Company
- Linear Algebra on High Performance Computers J.J. Dongarra and D.C. Sorensen
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2005; 17:235-257
- A Software Tool for Accurate Estimation of Parameters of
- Performance of Various Computers Using Standard Linear Equations Software in a Fortran Environment
- Computer Physics Communications 180 (2009) 2526-2533 Contents lists available at ScienceDirect
- J. Parallel Distrib. Comput. 69 (2009) 410-416 Contents lists available at ScienceDirect
- Visual Programming and Debugging for Parallel
- J. Parallel Distrib. Comput. 64 (2004) 774-783 GrADSolve--a grid-based RPC system for parallel computing with
- NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. (2008)
- A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures Mitch Horton, Stanimire Tomov and Jack Dongarra
- Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing System
- Static Tiling for Heterogeneous Computing Platforms Pierre Boulet, Jack Dongarra, Yves Robert and Frédéric Vivien
- Software Reuse in High Performance Computing Shirley Browne
- A Scalable Approach to MPI Application Performance Analysis
- Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW
- From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
- Implementing Matrix Factorizations on the Cell B. E.
- A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction
- Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization
- Integrated Tool Capabilities for Performance Instrumentation and Measurement
- Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators
- OMPIO: A Modular Software Architecture for Mohamad Chaarawi
- Acta Numerica (2012), pp. 001 © Cambridge University Press, 2012 doi:10.1017/S09624929 Printed in the United Kingdom
- Self-adapting Numerical Software and Automatic Tuning of Jack Dongarra, Victor Eijkhout
- An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs
- Solving the Schrödinger Equation, Chapter 11
- DAGuE: A generic distributed DAG engine for high performance computing
- Dense Linear Algebra for Hybrid GPU-Based Systems
- Overlapping Computation and Communication for Advection on
- LU Factorization for Accelerator-based Systems Emmanuel Agullo, Cedric Augonnet, Jack Dongarra, Mathieu Faverge,
- Redesigning the Message Logging Model for High Performance Aurelien Bouteiller, George Bosilca, Jack Dongarra
- High Performance Dense Linear System Solver with Soft Error Resilience
- ALGORITHM 589 SICEDR: A FORTRAN Subroutine
- and candor rarely encountered in a single work, the authors describe an evo-
- Hash functions for datatype signatures in MPI Julien Langou, George Bosilca, Graham Fagg, and Jack Dongarra
- Parallel Computing 3 (1986) 25-34 North-Holland
- Evaluation of the HPC Challenge Benchmarks in Virtualized Environments
- Correlated Set Coordination in Fault Tolerant Message Logging Protocols
- Trace-based performance analysis for the petascale simulation code
- High Performance Development for High End Computing with Python Language Wrapper (PLW)
- Scheduling Linear Algebra Operations on Multicore Processors LAPACK Working Note 213
- Implementing Matrix Multiplication on the Cell B. E.
- High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures
- Centre de recherche INRIA Bordeaux Sud Ouest Domaine Universitaire - 351, cours de la Libération 33405 Talence Cedex
- HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters
- Hierarchical QR factorization algorithms for multi-core cluster systems
- List of Figures 1.1 The software stack of PLASMA, Version 2.3
- Rectangular Full Packed Format for Cholesky's Algorithm: Factorization,
- Transparent Cross-Platform Access to
- Faster, Cheaper, Better a Hybridization Methodology to Develop Linear Algebra Software
- A Comparison of Search Heuristics for Empirical Code Optimization
- Reducing the Amount of Pivoting in Symmetric Indefinite Systems
- Reliability and Performance Modeling and Analysis for Grid Computing
- SOFTWARE LIBRARIES FOR LINEAR ALGEBRA COMPUTATIONS ON HIGH PERFORMANCE COMPUTERS
- On Scalability for MPI Runtime Systems George Bosilca
- Accelerating linear system solutions using randomization techniques
- Determining the Idle Time of a Tiling: New Results Frédéric Desprez, Jack Dongarra, Fabrice Rastello and Yves Robert
- Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG
- Static tiling for heterogeneous computing platforms
- Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems
- Programming the LU Factorization for a Multicore System with Accelerators
- Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and
- High-performance high-resolution semi-Lagrangian tracer transport on a sphere
- Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of
- CTWatch Quarterly The Impact of Multicore on Computational Scienc... http://www.ctwatch.org/quarterly/print.php?p=67
- MARCH/APRIL 2005 Copublished by the IEEE CS and the AIP 1521-9615/05/$20.00 © 2005 IEEE 51 Perspectives in Computational Science
- UT-CS-01-460 April 28, 2001
- 3 Empirical Performance Tuning of Dense Linear Algebra Software, Jack Dongarra and Shirley Moore
- Automated Empirical Optimizations of Software and the ATLAS project
- Scanning the Issue Special Issue on Program Generation, Optimization, and
- High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures
- Performance Portability of a GPU Enabled Factorization with the DAGuE George Bosilca, Aurelien Bouteiller, Thomas Herault, Pierre Lemarinier,
- Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree
- BLAS for GPUs Department of Electrical Engineering and Computer Science, University of
- Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA
- Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems
- Solving the Generalized Symmetric Eigenvalue Problem using Tile
- Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure
- PVMPI: An Integration of the PVM and MPI Systems Graham E. Fagg
- Automatic Experimental Analysis of Communication Patterns in Virtual Topologies
- Autotuning GEMMs for Fermi Jakub Kurzak
- QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators Emmanuel Agullo, Cedric Augonnet, Jack Dongarra, Mathieu Faverge,
- DOI: 10.1007/s10766-005-3577-3 International Journal of Parallel Programming, Vol. 33, No. 2, June 2005 (© 2005)
- Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures
- TOWARD HIGH PERFORMANCE DIVIDE AND CONQUER EIGENSOLVER FOR DENSE SYMMETRIC MATRICES
- CONCURRENCY: PRACTICE AND EXPERIENCE, VOL. 9(11), 1279-1291 (NOVEMBER 1997) Java access to numerical libraries
- State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems
- From Serial Loops to Parallel Execution
- Parallel Computing 1 (1984) 223-235 North-Holland
- Supporting Heterogeneous Network Computing: Jack J. Dongarra
- Algorithmic Redistribution Methods for Block-Cyclic Decompositions
- SIAM J. SCI. COMPUT. Vol. 14, No. 3, pp. 542-569, May 1993
- Towards dense linear algebra for hybrid GPU accelerated manycore systems Stanimire Tomov a,*, Jack Dongarra a,b,c
- Algorithm-based Fault Tolerance for Dense Matrix Factorizations
- Soft Error Resilient QR Factorization for Hybrid System with GPGPU
- CONCURRENCY: PRACTICE AND EXPERIENCE Concurrency: Pract. Exper., Vol. 11(3), 139-153 (1999)
- J. Parallel Distrib. Comput. 69 (2009) 410-416 Contents lists available at ScienceDirect
- SIAM J. SCI. STAT. COMPUT. Vol. 7, No. 1, January 1986
- Noname manuscript No. (will be inserted by the editor)
- Twenty-Plus Years of Netlib and NA-Net Jack Dongarra, Gene Golub, Eric Grosse, Cleve Moler, Keith Moore
- http://hpc.sagepub.com/ Computing Applications
- Netlib and NA-Net: Building A Scientific Computing Community Jack Dongarra
- The First Word Copublished by the IEEE CS and the AIP 1521-9615/08/$25.00 2008 IEEE Computing in Science & Engineering
- A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures Mitch Horton, Stanimire Tomov and Jack Dongarra
- Level-3 Cholesky Factorization Routines as Part of Many Cholesky Algorithms
- Integrated PVM Framework Supports Heterogeneous Network Computing
- Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the Synergistic Processing Element of the CELL Processor
- Access 02 Summer 2005 Access 03 Summer 2005
- Hamparsum Bozdogan Statistical Data Mining, and
- The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2007; 19:1481-1496
- DOI: 10.1007/s10766-005-3577-3 International Journal of Parallel Programming, Vol. 33, Nos. 2/3, June 2005 (© 2005)
- 90 Copublished by the IEEE CS and the AIP 1521-9615/11/$26.00 2011 IEEE Computing in Science & Engineering Novel Architectures
- Squeezing the Most out of an Algorithm in CRAY FORTRAN
- Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures
- Journal of Parallel and Distributed Computing 61, 1803-1826 (2001) Telescoping Languages: A Strategy for Automatic
- HARNESS and fault tolerant MPI Graham E. Fagg *, Antonin Bukovsky, Jack J. Dongarra
- http://hpc.sagepub.com/ Computing Applications
- Scalable Fault Tolerant MPI: Extending the recovery algorithm
- EXPERIENCES WITH WINDOWS 95 NT AS A CLUSTER COMPUTING PLATFORM FOR PARALLEL COMPUTING.
- A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
- ATLAS on the BlueGene/L Preliminary Results Keith Seymour Haihang You Jack Dongarra
- Recovery Patterns for Iterative Methods in a Parallel Unstable Environment
- Future Generation Computer Systems 15 (1999) 595-605 Scalable networked information processing environment (SNIPE)
- A parallel tiled solver for dense symmetric indefinite systems on multicore architectures
- TOP500 Supercomputer Sites 10th Edition
- D-Lib Magazine ISSN 1082-9873
- Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing
- Future Generation Computer Systems 15 (1999) 571-582 HARNESS: a next generation distributed virtual machine
- Accelerating TIME-TO-SOLUTION for Computational