Abstract
An increasingly large portion of scheduler latency is
derived from the monolithic content addressable memory
(CAM) arrays accessed during instruction wakeup. The
performance of the scheduler can be improved by decreasing
the number of tag comparisons necessary to schedule
instructions. Using detailed simulation-based analyses,
we find that most instructions enter the window with at
least one of their input operands already available. By
putting these instructions into specialized windows with
fewer tag comparators, load capacitance on the scheduler
critical path can be reduced, with only very small effects
on program throughput. For instructions with multiple
unavailable operands, we introduce a last-tag speculation
mechanism that eliminates all remaining tag comparators
except those for the last arriving input operand. By combining
these two tag-reduction schemes, we are able to
construct dynamic schedulers with approximately one
quarter of the tag comparators found in conventional
designs. Conservative circuit-level timing analyses indi-