

Summary:
Thread-Sensitive Scheduling for SMT Processors
A simultaneous-multithreaded (SMT) processor executes multiple instructions from multiple threads every cycle. As
a result, threads on SMT processors, unlike those on traditional shared-memory machines, simultaneously share all
low-level hardware resources in a single CPU. Because of this fine-grained resource sharing, SMT threads can
interfere or conflict with each other, as well as share these resources to mutual benefit.
This paper examines thread-sensitive scheduling for SMT processors. When more threads exist than hardware execution
contexts, the operating system is responsible for selecting which threads to execute at any instant, thereby
deciding which threads will compete for resources. Thread-sensitive scheduling uses thread-behavior feedback to
choose the best set of threads to execute together, in order to maximize processor throughput. We introduce several
thread-sensitive scheduling schemes and compare them to traditional oblivious schemes, such as round-robin. Our
measurements show how these scheduling algorithms impact performance and the utilization of low-level hardware
resources. We also demonstrate how thread-sensitive scheduling algorithms can be tuned to trade off performance
against fairness. For the workloads we measured, we show that an IPC-based thread-sensitive scheduling algorithm can
achieve speedups over oblivious schemes of 7% to 15%, with minimal hardware costs.
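The contrast between an oblivious scheme and an IPC-based thread-sensitive scheme can be illustrated with a small sketch. This is not the paper's algorithm, which the summary does not specify. It assumes a hypothetical per-thread feedback value (`last_ipc`, the IPC observed in the thread's last quantum), a hypothetical `waiting` counter, and a hypothetical `fairness` weight that stands in for the performance/fairness tuning mentioned above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    last_ipc: float = 0.0  # assumed feedback: IPC measured in the last quantum
    waiting: int = 0       # assumed counter: quanta since the thread last ran


def round_robin(queue: deque, contexts: int) -> list:
    """Oblivious baseline: run the next `contexts` threads in queue order."""
    chosen = [queue.popleft() for _ in range(min(contexts, len(queue)))]
    queue.extend(chosen)  # rotate the chosen threads to the back of the queue
    return chosen


def ipc_sensitive(threads: list, contexts: int, fairness: float = 0.0) -> list:
    """Pick the `contexts` threads with the highest score.

    score = last_ipc + fairness * waiting
    fairness = 0 ranks by IPC alone; larger values favour threads that
    have been waiting longer (an illustrative knob, not from the paper).
    """
    ranked = sorted(
        threads,
        key=lambda t: t.last_ipc + fairness * t.waiting,
        reverse=True,
    )
    chosen = ranked[:contexts]
    chosen_names = {t.name for t in chosen}
    # Update the waiting counters: reset for scheduled threads, bump the rest.
    for t in threads:
        t.waiting = 0 if t.name in chosen_names else t.waiting + 1
    return chosen


if __name__ == "__main__":
    pool = [Thread("A", 2.0), Thread("B", 0.5), Thread("C", 1.5), Thread("D", 1.0)]
    print([t.name for t in ipc_sensitive(pool, contexts=2)])
    print([t.name for t in round_robin(deque(pool), contexts=2)])
```

With `fairness` set to zero the low-IPC threads in this toy pool never run, which is the kind of starvation a fairness term is meant to bound. A real scheduler would also have to account for how threads interact when co-scheduled, which a per-thread IPC value alone does not capture.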
1 Introduction
Simultaneous Multithreading (SMT) [22] is a processor design that combines the wide-issue capabilities of modern
superscalars with the latency-hiding abilities of hardware multithreading. Using multiple on-chip thread contexts, an
SMT processor issues instructions from multiple threads each cycle. The technique has been shown to boost proces-


Source: Anderson, Richard - Department of Computer Science and Engineering, University of Washington at Seattle


Collections: Computer Technologies and Information Sciences