U.S. Department of Energy
Office of Scientific and Technical Information

Measuring FLOPS Using Hardware Performance Counter Technologies on LC systems

Technical Report · DOI: https://doi.org/10.2172/945513 · OSTI ID: 945513
FLOPS (FLoating-point Operations Per Second) is a commonly used performance metric for scientific programs that rely heavily on floating-point (FP) calculations. The metric is based on the number of FP operations rather than instructions, thereby facilitating a fair comparison between different machines. A well-known use of this metric is the LINPACK benchmark, which is used to generate the Top500 list. It measures how fast a computer solves a dense N-by-N system of linear equations Ax=b, which requires a known number of FP operations, and reports the result in millions of FP operations per second (MFLOPS). While running a benchmark with a known FP workload can provide insight into the efficiency of a machine's FP pipelines relative to other machines, measuring the FLOPS of an arbitrary scientific application in a platform-independent manner is nontrivial. The goal of this paper is twofold. First, we explore the FP microarchitectures of the key processors that underpin the LC machines. Second, we present measurement techniques, based on hardware performance monitoring counters, that a user can apply to obtain the native FLOPS of his or her program; these are practical solutions readily available on LC platforms. By nature, however, these native FLOPS metrics are not directly comparable across different machines, mainly because FP operations are not consistent across microarchitectures. Thus, the microarchitecture survey that forms the first goal of this paper serves as the base reference by which a user can interpret the measured FLOPS more judiciously.
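As an illustration of how a known FP workload turns into an MFLOPS figure, the following is a minimal sketch in Python. It assumes the nominal operation count traditionally used by the classic LINPACK benchmark for a dense N-by-N solve, 2N^3/3 + 2N^2; the abstract does not state which count the report uses, and other LINPACK variants use a slightly different low-order term. The function names and the elapsed time in the usage example are hypothetical and are not taken from the report.

```python
def linpack_flop_count(n: int) -> float:
    """Nominal FP operation count for solving a dense n-by-n system Ax=b.

    Assumption: the classic LINPACK convention 2n^3/3 + 2n^2
    (factorization plus triangular solves). This is a bookkeeping
    figure, not a count of operations the hardware actually executed.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return (2.0 * n**3) / 3.0 + 2.0 * n**2


def mflops(flop_count: float, seconds: float) -> float:
    """Convert an FP operation count and a wall-clock time into MFLOPS."""
    if seconds <= 0.0:
        raise ValueError("seconds must be positive")
    return flop_count / seconds / 1.0e6


if __name__ == "__main__":
    n = 1000            # problem size (illustrative)
    elapsed = 0.5       # hypothetical measured solve time in seconds
    ops = linpack_flop_count(n)
    print(f"N={n}: {ops:.0f} nominal FP ops, {mflops(ops, elapsed):.1f} MFLOPS")
```

The same `mflops` conversion applies when the operation count comes from a hardware counter instead of a formula, but in that case the count reflects what the particular microarchitecture tallies as an FP operation, which is why such native figures are not directly comparable across machines.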
Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
945513
Report Number(s):
LLNL-TR-406864
Country of Publication:
United States
Language:
English

Similar Records

Developing a tuned version of scaLAPACK's linear equation solver
Technical Report · Sun Oct 29 00:00:00 EDT 2000 · OSTI ID:15013126

Machine organization of the IBM RISC System/6000 processor
Journal Article · Sun Dec 31 23:00:00 EST 1989 · IBM Journal of Research and Development (International Business Machines); (USA) · OSTI ID:6764506

The implications of working set analysis on supercomputing memory hierarchy design.
Conference · Mon Feb 28 23:00:00 EST 2005 · OSTI ID:946978