

Summary:
Thread-Sensitive Scheduling for SMT Processors
A simultaneous-multithreaded (SMT) processor executes multiple instructions from multiple threads every cycle. As
a result, threads on SMT processors, unlike those on traditional shared-memory machines, simultaneously share all
low-level hardware resources in a single CPU. Because of this fine-grained resource sharing, SMT threads have the
ability to interfere or conflict with each other, as well as to share these resources to mutual benefit.
This paper examines thread-sensitive scheduling for SMT processors. When more threads exist than hardware execution
contexts, the operating system is responsible for selecting which threads to execute at any instant, inherently
deciding which threads will compete for resources. Thread-sensitive scheduling uses thread-behavior feedback to
choose the best set of threads to execute together, in order to maximize processor throughput. We introduce several
thread-sensitive scheduling schemes and compare them to traditional oblivious schemes, such as round-robin. Our
measurements show how these scheduling algorithms impact performance and the utilization of low-level hardware
resources. We also demonstrate how thread-sensitive scheduling algorithms can be tuned to trade off performance
and fairness. For the workloads we measured, we show that an IPC-based thread-sensitive scheduling algorithm can
achieve speedups over oblivious schemes of 7% to 15%, with minimal hardware costs.
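
To make the contrast between oblivious and thread-sensitive selection concrete, here is a minimal Python sketch. It is not the paper's algorithm, whose details are not given in this summary. It assumes that the operating system can read a per-thread IPC value at the end of each scheduling quantum, and that each quantum it picks as many threads as there are hardware contexts. The names ThreadStats, round_robin, ipc_sensitive, end_of_quantum, and the fairness_weight parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    tid: int
    last_ipc: float = 0.0  # IPC seen the last time this thread ran (assumed counter)
    waiting: int = 0       # quanta elapsed since the thread was last scheduled


def round_robin(threads, contexts, quantum):
    """Oblivious baseline: rotate through the threads, ignoring their behavior."""
    n = len(threads)
    start = (quantum * contexts) % n
    return [threads[(start + i) % n].tid for i in range(min(contexts, n))]


def ipc_sensitive(threads, contexts, fairness_weight=0.0):
    """Pick the threads with the highest score.

    score = last_ipc + fairness_weight * waiting
    With fairness_weight == 0 the choice is purely throughput-driven; larger
    values favor threads that have waited longer (an assumed fairness knob).
    """
    ranked = sorted(
        threads,
        key=lambda t: (-(t.last_ipc + fairness_weight * t.waiting), t.tid),
    )
    return [t.tid for t in ranked[:contexts]]


def end_of_quantum(threads, scheduled, measured_ipc):
    """Feed observed behavior back into the per-thread statistics."""
    for t in threads:
        if t.tid in scheduled:
            t.last_ipc = measured_ipc[t.tid]
            t.waiting = 0
        else:
            t.waiting += 1
```

With four threads and two contexts, round_robin alternates between the pairs (0, 1) and (2, 3), while ipc_sensitive keeps choosing whichever two threads currently score highest. Raising fairness_weight lets a long-waiting, low-IPC thread displace a high-IPC one, which is one simple way to express the performance and fairness trade-off mentioned above.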
1 Introduction
Simultaneous Multithreading (SMT) [22] is a processor design that combines the wide-issue capabilities of modern
superscalars with the latency-hiding abilities of hardware multithreading. Using multiple on-chip thread contexts, an
SMT processor issues instructions from multiple threads each cycle. The technique has been shown to boost proces-


Source: Anderson, Richard - Department of Computer Science and Engineering, University of Washington at Seattle


Collections: Computer Technologies and Information Sciences