
- In the workshop on Protocols for Fast Long Distance Networks (PFLDNet), April, 2006, Nara, Japan. Also available as technical reports at University of Illinois Chicago (EVL RG 20051031 vishwanath), Ohio State University (OSU-CISRC-10/05-TR70) and Los
- Designing Efficient Systems Services and Primitives for Next-Generation Data-Centers. K. Vaidyanathan, S. Narravula, P. Balaji, D. K. Panda
- Distributed I/O with ParaMEDIC: Experiences with a Worldwide Supercomputer
- Sockets Direct Protocol over InfiniBand in Clusters: Is it Beneficial? P. Balaji, S. Narravula, K. Vaidyanathan, S. Krishnamoorthy, J. Wu, D. K. Panda
- DOI 10.1007/s00450-009-0095-3 SPECIAL ISSUE PAPER
- Making a Case for Proactive Flow Control in Optical Circuit-Switched Networks
- MPI at Exascale. Rajeev Thakur, Pavan Balaji, Darius Buntinas, David Goodell, William Gropp,
- Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand
- Implementing MPI on Windows: Comparison with Common Approaches on Unix
- Semantics-based Distributed I/O for mpiBLAST. P. Balaji, W. Feng, J. Archuleta, H. Lin, R. Kettimuthu, R. Thakur, X. Ma
- DOI 10.1007/s00450-009-0090-8 SPECIAL ISSUE PAPER
- Understanding Network Saturation Behavior on Large-Scale Blue Gene/P Systems
- Comput Sci Res Dev (2011) 26: 247–256 DOI 10.1007/s00450-011-0168-y
- CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2010; 22:2266–2281
- J. Parallel Distrib. Comput. 65 (2005) 1348–1365 www.elsevier.com/locate/jpdc
- In the workshop on NSF Next Generation Software (NGS) Program; held in conjunction with IPDPS, Rhodes Island, Greece, April 2006. Also available as an Ohio State University technical report.
- Fault-Tolerant Communication Runtime Support for Data-Centric Programming Models
- Power and Performance Characterization of Computational Kernels on the GPU
- Designing Energy Efficient Communication Runtime Systems for Data Centric Programming
- Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems
- Asymmetric Interactions in Symmetric Multi-core Systems: Analysis, Enhancements and Evaluation
- Are Nonblocking Networks Really Needed for High-End-Computing Workloads?
- A Simple, Pipelined Algorithm for Large, Irregular All-gather Problems
- Semantics-based Distributed I/O with the ParaMEDIC Framework Math. and Comp. Science
- In the workshop on Communication Architecture for Clusters (CAC); held in conjunction with IPDPS, Rhodes Island, Greece, April 2006. Also available as Ohio State University technical report OSU-CISRC-10/05-TR68.
- Supporting iWARP Compatibility and Features for Regular Network Adapters
- Towards Provision of Quality of Service Guarantees in Job Scheduling. Mohammad Islam, Pavan Balaji, P. Sadayappan, D. K. Panda
- Opportune Job Shredding: An Effective Approach for Scheduling Parameter Sweep Applications
- QoPS: A QoS based scheme for Parallel Job Scheduling. Mohammad Islam, Pavan Balaji, P. Sadayappan, D. K. Panda
- High Performance User Level Sockets over Gigabit Ethernet. Pavan Balaji
- Hybrid Parallel Programming with MPI and Unified Parallel C
- Analyzing and Minimizing the Impact of Opportunity Cost in QoS-aware Job Scheduling
- Massively Parallel Genomic Sequence Search on the Blue Gene/P Architecture
- On the Provision of Prioritization and Soft QoS in Dynamically Reconfigurable Shared Data-Centers over InfiniBand
- Non-Data-Communication Overheads in MPI: Analysis on Blue Gene/P
- Sockets Direct Protocol for Hybrid Network Stacks: A Case Study with iWARP over
- Natively Supporting True One-sided Communication in MPI on Multi-core Systems with InfiniBand
- Toward Efficient Support for Multithreaded MPI Communication
- Minimizing MPI Resource Contention in Multithreaded Multicore Environments
- Impact of High Performance Sockets on Data Intensive Applications. Pavan Balaji
- Communication Analysis of Parallel 3D FFT for Flat Cartesian Meshes on Large Blue Gene
- GePSeA: A General-Purpose Software Acceleration Framework for Lightweight Task Offloading
- In the past decade, a wide array of interconnect technologies have entered the sys-
- Performance Evaluation of RDMA over IP: A Case Study with the Ammasso Gigabit Ethernet NIC
- Evaluation of ConnectX Virtual Protocol Interconnect for Data Centers. Ryan E. Grant
- Impact of Network Sharing in Multi-core Architectures. G. Narayanaswamy
- Analyzing the Impact of Supporting Out-of-Order Communication on In-order Performance with iWARP
- Proceedings of the International Conference on Parallel Processing (ICPP 2007), Xi'an, China, September 2007. Also available as Argonne National Laboratory preprint ANL/MCS-P1422-0507.
- In the Proceedings of the IEEE International Conference on Cluster Computing (Cluster 2005), Boston, MA, Sep 2005. Also available as Los Alamos technical report LA-UR-05-4148 and Ohio State University technical report OSU-CISRC-5/05-TR35.
- Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand
- Workload-driven Analysis of File Systems in Shared Multi-tier Data-Centers over InfiniBand. K. Vaidyanathan, P. Balaji, H.-W. Jin, D. K. Panda
- Sockets vs RDMA Interface over 10-Gigabit Networks: An In-depth analysis of the Memory Traffic Bottleneck
- Supporting Strong Coherency for Active Caches in Multi-Tier Data-Centers over S. Narravula P. Balaji K. Vaidyanathan S. Krishnamoorthy J. Wu D. K. Panda
- Efficient Collective Operations using Remote Memory Operations on VIA-Based Rinku Gupta
- An Analysis of 10-Gigabit Ethernet Protocol Stacks in Multicore Environments
- PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems
- Improving Resource Availability by Relaxing Network Allocation Constraints on Blue Gene/P
- Proc. of the 13th IEEE Symp. on High-Performance Interconnects (Hot Interconnects 2005), Palo Alto, CA, August 2005. Also available as Los Alamos technical report LA-UR-05-2635.
- ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing
- Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand
- Building Algorithmically Nonstop Fault Tolerant MPI Programs
- Noncollective Communicator Creation in MPI. James Dinan