Programming future architectures : dusty decks, memory walls, and the speed of light.

Rodrigues, Arun F

Conference · August 1, 2005

OSTI ID:968394

Due to advances in CMOS fabrication technology, high performance computing capabilities have continually grown. More capable hardware has allowed a range of complex scientific applications to be developed. However, these applications present a bottleneck to future performance. Entrenched 'legacy' codes - 'Dusty Decks' - demand that new hardware remain compatible with existing software. Additionally, conventional architectures face increasing challenges. Many of these challenges revolve around the growing disparity between processor and memory speed - the 'Memory Wall' - and difficulties scaling to large numbers of parallel processors. To a large extent, these limitations are inherent to the traditional computer architecture. As data is consumed more quickly, moving that data to the point of computation becomes more difficult. Barring any upward revision in the speed of light, this will continue to be a fundamental limitation on the speed of computation. This work focuses on solving these problems in the context of Light Weight Processing (LWP). LWP is an innovative technique that combines Processing-In-Memory, short vector computation, multithreading, and extended memory semantics. It applies these techniques to try to answer the questions 'What will a next-generation supercomputer look like?' and 'How will we program it?' To that end, this work presents four contributions: (1) An implementation of MPI which uses features of LWP to substantially improve message processing throughput; (2) A technique leveraging extended memory semantics to improve message passing by overlapping computation and communication; (3) An OpenMP library modified to allow efficient partitioning of threads between a conventional CPU and LWPs, greatly improving cost/performance; and (4) An algorithm to extract very small 'threadlets' which can overcome the inherent disadvantages of a simple processor pipeline.

Research Organization: Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC04-94AL85000

Report Number(s): SAND2005-4797C; TRN: US200924%%414

Resource Relation: Conference: Proposed for presentation at the Sandia Student Symposium held August 2, 2005 in Albuquerque, NM.

Country of Publication: United States

Language: English

Similar Records

PRIMA-X - Performance Retargeting of Instrumentation, Measurement, and Analysis Technologies for Exascale Computing

Technical Report · June 27, 2019

Wolf, Felix; Lorenz, Daniel

MULTI-CORE AND OPTICAL PROCESSOR RELATED APPLICATIONS RESEARCH AT OAK RIDGE NATIONAL LABORATORY

Conference · January 1, 2008

Barhen, Jacob; Kerekes, Ryan A; ST Charles, Jesse Lee; +1 more

Development Status of the PEBBLES Code for Pebble Mechanics: Improved Physical Models and Speed-up

Technical Report · September 1, 2009

Cogliati, Joshua J; Ougouag, Abderrafi M

Related Subjects

97 MATHEMATICAL METHODS AND COMPUTING
ALGORITHMS
COMPUTER ARCHITECTURE
PERFORMANCE
PROGRAMMING
SUPERCOMPUTERS
DATA TRANSMISSION
MEMORY MANAGEMENT
