DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Reaching bandwidth saturation using transparent injection parallelization

Authors:
Chaimov, Nicholas [1]; Ibrahim, Khaled Z. [2]; Williams, Samuel [2]; Iancu, Costin [2]
  1. University of Oregon, OR, USA
  2. Lawrence Berkeley National Laboratory, CA, USA
Publication Date:
October 2016
Sponsoring Org.:
USDOE
OSTI Identifier:
1437694
Resource Type:
Published Article
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Name: International Journal of High Performance Computing Applications; Journal Volume: 31; Journal Issue: 5; Journal ID: ISSN 1094-3420
Publisher:
SAGE Publications
Country of Publication:
United States
Language:
English

Citation Formats

Chaimov, Nicholas, Ibrahim, Khaled Z., Williams, Samuel, and Iancu, Costin. Reaching bandwidth saturation using transparent injection parallelization. United States: N. p., 2016. Web. doi:10.1177/1094342016672720.
Chaimov, Nicholas, Ibrahim, Khaled Z., Williams, Samuel, & Iancu, Costin. Reaching bandwidth saturation using transparent injection parallelization. United States. doi:10.1177/1094342016672720.
Chaimov, Nicholas, Ibrahim, Khaled Z., Williams, Samuel, and Iancu, Costin. 2016. "Reaching bandwidth saturation using transparent injection parallelization". United States. doi:10.1177/1094342016672720.
@article{osti_1437694,
title = {Reaching bandwidth saturation using transparent injection parallelization},
author = {Chaimov, Nicholas and Ibrahim, Khaled Z. and Williams, Samuel and Iancu, Costin},
abstractNote = {},
doi = {10.1177/1094342016672720},
journal = {International Journal of High Performance Computing Applications},
number = 5,
volume = 31,
place = {United States},
year = {2016},
month = {10}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
DOI: 10.1177/1094342016672720

Works referenced in this record:

Efficient all-to-all broadcast in all-port mesh and torus networks
conference, January 1999

  • Yang, Yuanyuan
  • Proceedings Fifth International Symposium on High-Performance Computer Architecture
  • DOI: 10.1109/HPCA.1999.744382

Scaling all-to-all multicast on fat-tree networks
conference, January 2004

  • Kumar, S.; Kale, L. V.
  • Proceedings. Tenth International Conference on Parallel and Distributed Systems, 2004. ICPADS 2004.
  • DOI: 10.1109/ICPADS.2004.1316097

Enabling MPI interoperability through flexible communication endpoints
conference, January 2013

  • Dinan, James; Balaji, Pavan; Goodell, David
  • Proceedings of the 20th European MPI Users' Group Meeting on - EuroMPI '13
  • DOI: 10.1145/2488551.2488553

Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes
conference, February 2009

  • Rabenseifner, Rolf; Hager, Georg; Jost, Gabriele
  • 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing
  • DOI: 10.1109/PDP.2009.43

Integrating Asynchronous Task Parallelism with MPI
conference, May 2013

  • Chatterjee, Sanjay; Tasirlar, Sagnak; Budimlic, Zoran
  • 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (IPDPS)
  • DOI: 10.1109/IPDPS.2013.78

Congestion avoidance on manycore high performance computing systems
conference, January 2012

  • Luo, Miao; Panda, Dhabaleswar K.; Ibrahim, Khaled Z.
  • Proceedings of the 26th ACM international conference on Supercomputing - ICS '12
  • DOI: 10.1145/2304576.2304594

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning
conference, January 2011

  • Williams, Samuel; Oliker, Leonid; Carter, Jonathan
  • Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11
  • DOI: 10.1145/2063384.2063458

Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems
conference, January 2011

  • Madduri, Kamesh; Ibrahim, Khaled Z.; Williams, Samuel
  • Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11
  • DOI: 10.1145/2063384.2063415

MT-MPI: multithreaded MPI for many-core environments
conference, January 2014

  • Si, Min; Peña, Antonio J.; Balaji, Pavan
  • Proceedings of the 28th ACM international conference on Supercomputing - ICS '14
  • DOI: 10.1145/2597652.2597658

On the conditions for efficient interoperability with threads: an experience with PGAS languages using Cray communication domains
conference, January 2014

  • Ibrahim, Khaled Z.; Yelick, Katherine
  • Proceedings of the 28th ACM international conference on Supercomputing - ICS '14
  • DOI: 10.1145/2597652.2597657

Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems
journal, February 2014


X10: an object-oriented approach to non-uniform cluster computing
conference, January 2005

  • Charles, Philippe; Grothoff, Christian; Saraswat, Vijay
  • Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications - OOPSLA '05
  • DOI: 10.1145/1094811.1094852

Hybrid PGAS runtime support for multicore nodes
conference, January 2010

  • Blagojević, Filip; Hargrove, Paul; Iancu, Costin
  • Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model - PGAS '10
  • DOI: 10.1145/2020373.2020376

MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks
conference, January 2000


Test suite for evaluating performance of multithreaded MPI communication
journal, December 2009


MPI on Millions of Cores
journal, March 2011

  • Balaji, Pavan; Buntinas, Darius; Goodell, David
  • Parallel Processing Letters, Vol. 21, Issue 01
  • DOI: 10.1142/S0129626411000060

An Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect
conference, May 2014

  • Ibrahim, Khaled Z.; Hargrove, Paul H.; Iancu, Costin
  • 2014 IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS)
  • DOI: 10.1109/IPDPS.2014.116

Minimizing MPI Resource Contention in Multithreaded Multicore Environments
conference, September 2010

  • Goodell, David; Balaji, Pavan; Buntinas, Darius
  • 2010 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2010.11

The NAS parallel benchmarks - summary and preliminary results
conference, January 1991

  • Bailey, D. H.; Schreiber, R. S.; Simon, H. D.
  • Proceedings of the 1991 ACM/IEEE conference on Supercomputing - Supercomputing '91
  • DOI: 10.1145/125826.125925

Near-optimal all-to-all broadcast in multidimensional all-port meshes and tori
journal, January 2002

  • Yang, Yuanyuan
  • IEEE Transactions on Parallel and Distributed Systems, Vol. 13, Issue 2
  • DOI: 10.1109/71.983941

The Design and Implementation of FFTW3
journal, February 2005


Optimization of geometric multigrid for emerging multi- and manycore processors
conference, November 2012

  • Williams, Samuel; Kalamkar, Dhiraj D.; Singh, Amik
  • 2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2012.85

Optimization of Collective Communication Operations in MPICH
journal, February 2005

  • Thakur, Rajeev; Rabenseifner, Rolf; Gropp, William
  • The International Journal of High Performance Computing Applications, Vol. 19, Issue 1
  • DOI: 10.1177/1094342005051521