miniGMG is a compact geometric multigrid (MG) benchmark designed to proxy the performance characteristics of the solves found in adaptive mesh refinement multigrid (AMR MG) applications. It solves the equation a*alpha*u - b*div(beta*grad u) = f on a large cubical domain using cell-centered values and periodic boundaries. Here a and b are scalar constants, alpha and beta are spatially varying coefficients, f is the right-hand side, and u is the solution. The cubical domain is divided into cubical subdomains, which are distributed across the supercomputer. The right-hand side f is generated from a continuous function for u. Thus, one can solve this linear system of equations and verify the error properties of the discrete solution u by comparing it to a sampling of the continuous u.

miniGMG implements both V- and F-cycles, as well as two constant-coefficient discretizations of the equation above (one 27-point, 2nd order, and one 13-point, 4th order). In both cases, the (embedded) V-cycles are truncated when a subdomain reaches 4^3, 2^3, or 1^3 cells (a U-cycle). The multigrid solver then switches to one of seven coarse-grid (bottom) solvers, including relaxation, BiCGStab, CABiCGStab, TelescopingCABiCGStab, CG, and CACG. The code exploits two forms of communication avoiding. The first is exemplified by the referenced SC'12 paper, in which DRAM data movement is avoided when smoothing. The second, described in the IPDPS'14 paper, avoids collectives in the bottom solve (the CA bottom solvers).

There are a number of implementations of the operators for both CPUs and NVIDIA GPUs. The CPU implementations have been optimized to exploit a wavefront approach to communication avoiding. The GPU versions include a number of optimized and tunable implementations. The code has been demonstrated to scale to 46K nodes on the BGQ Mira (750K cores), 9K nodes on Edison (111K cores), and 14K GPUs on Titan (37M CUDA cores). It thus acts as a scalable testbed for a wide range of computer science and applied math research.
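The verification idea described above (generate f from a known continuous u, solve, then compare the discrete solution to samples of the continuous one) can be illustrated with a much smaller problem. The following Python sketch is not miniGMG code and makes several simplifying assumptions: it is 1D rather than 3D, it takes alpha = beta = 1, it uses a 3-point 2nd-order stencil instead of the 27-point or 13-point operators, and it replaces multigrid with a dense direct solve. The names `solve_dense` and `max_error` are hypothetical.

```python
import math

def solve_dense(A, rhs):
    """Gaussian elimination with partial pivoting (adequate for tiny demo systems)."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            if factor != 0.0:
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def max_error(n, a=1.0, b=1.0):
    """Solve a*u - b*u'' = f on a periodic unit interval with n cells.

    Assumption: u(x) = sin(2*pi*x) is the manufactured solution, so
    f(x) = (a + b*(2*pi)**2) * sin(2*pi*x). Returns the max-norm error
    of the discrete solution against u sampled at cell centers.
    """
    h = 1.0 / n
    centers = [(i + 0.5) * h for i in range(n)]
    u_exact = [math.sin(2.0 * math.pi * x) for x in centers]
    f = [(a + b * (2.0 * math.pi) ** 2) * u for u in u_exact]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += a + 2.0 * b / h**2
        A[i][(i - 1) % n] -= b / h**2   # periodic wrap on the left
        A[i][(i + 1) % n] -= b / h**2   # periodic wrap on the right
    u_h = solve_dense(A, f)
    return max(abs(uh - ue) for uh, ue in zip(u_h, u_exact))

if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, max_error(n))
```

For a 2nd-order discretization the error should shrink by roughly a factor of four each time the resolution doubles, which is the kind of error property the benchmark's verification step checks.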
Nominally, one invokes the benchmark as

    ./run b x y z alpha beta gamma

Each subdomain is (2^b)^3 cells. Each MPI process receives an x-by-y-by-z collection of subdomains. There are alpha*beta*gamma total MPI processes. For correctness (a cubical domain), x*alpha == y*beta == z*gamma.
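As a rough illustration of how these seven arguments determine the problem size, the Python sketch below checks the cubical-domain constraint and derives the overall dimensions. The function `describe_run` is hypothetical and is not part of miniGMG. It assumes that alpha, beta, and gamma are the process counts along each dimension, which is what the process count and the cubical constraint above imply. These command-line arguments are unrelated to the coefficients alpha and beta in the equation.

```python
def describe_run(b, x, y, z, alpha, beta, gamma):
    """Derive problem dimensions from the ./run arguments (illustrative only)."""
    # The domain is cubical only if every dimension has the same
    # number of subdomains per side.
    if not (x * alpha == y * beta == z * gamma):
        raise ValueError("domain is not cubical: need x*alpha == y*beta == z*gamma")
    cells_per_subdomain_side = 2 ** b            # each subdomain is (2^b)^3 cells
    subdomains_per_side = x * alpha              # same in every dimension
    domain_cells_per_side = subdomains_per_side * cells_per_subdomain_side
    return {
        "cells_per_subdomain_side": cells_per_subdomain_side,
        "subdomains_per_process": x * y * z,
        "mpi_processes": alpha * beta * gamma,
        "domain_cells_per_side": domain_cells_per_side,
        "total_cells": domain_cells_per_side ** 3,
    }

if __name__ == "__main__":
    # Corresponds to: ./run 6 2 2 2 4 4 4
    for key, value in describe_run(6, 2, 2, 2, 4, 4, 4).items():
        print(key, value)
```

With these example arguments, each of the 64 processes holds eight 64^3 subdomains, and the full domain is 512 cells on a side.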