Architectural Requirements and Scalability of the
NAS Parallel Benchmarks
Frederick C. Wong, Richard P. Martin, Remzi H. Arpaci-Dusseau,
and David E. Culler
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
{fredwong, rmartin, remzi, culler}@CS.Berkeley.EDU
Abstract
We present a study of the architectural requirements and scalability of the NAS Parallel
Benchmarks. Through direct measurements and simulations, we identify the factors that
affect the scalability of benchmark codes on two relevant and distinct platforms: a cluster
of workstations and a ccNUMA SGI Origin 2000.
We find that the benefit of increased global cache size is pronounced in certain applications
and often offsets the communication cost. By constructing the working set profile of
the benchmarks, we are able to visualize the improvement of computational efficiency
under constant-problem-size scaling.
We also find that, while the Origin MPI has better point-to-point performance, the cluster
MPI layer is more scalable with communication load. However, communication perfor-