DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using long vector extensions for MPI reductions

Journal Article · Parallel Computing

Abstract: Not available

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA)
OSTI ID: 1836606
Journal Information: Parallel Computing, Vol. 109, Issue C; ISSN 0167-8191
Publisher: Elsevier
Country of Publication: Netherlands
Language: English
