
- Select-Free Instruction Scheduling Logic Mary D. Brown Jared Stark Yale N. Patt
- Teaching Old Caches New Tricks: RegionTracker and Predictor Virtualization Ioana Burcea, Jason Zebchuk and Andreas Moshovos
- TOWARDS A VIABLE OUT-OF-ORDER SOFT CORE: COPY-FREE, CHECKPOINTED REGISTER RENAMING
- Prior research demonstrates that temporal memory streaming and related address-correlating
- A Building Block for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy
- L-CBF: A Low-Power, Fast Counting Bloom Filter Architecture
- BranchTap: Improving Performance with Very Few Checkpoints Through Adaptive Speculation Control
- RegionTracker: A Case for Dual-Grain Tracking in the Memory System
- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are
- TECHNICAL REPORT: TR-01-01-02 COMPUTER GROUP, ELECTRICAL AND COMPUTER ENGINEERING, UNIV. OF TORONTO
- Memory Dependence Prediction in Multimedia Applications Andreas Moshovos MOSHOVOS@ECE.NORTHWESTERN.EDU
- Memory Dependence Prediction Andreas Ioannis Moshovos
- Altera Corporation 81 8. Instruction Set Reference
- Advanced Computer Architecture Instructor: Andreas Moshovos
- ECE 1773, ECE Toronto Lecture Notes: Chapter 1 1 2002 Moshovos, some material based on slides by
- Speculative Versioning Cache Sridhar Gopal
- TurboSMARTS: Accurate Microarchitecture Simulation Sampling in Minutes
- This paper presents an analysis of the performance of the shader processing units in a modern Graphics Proc-
- A Characterization of Processor Performance in the VAX-11/780 Joel S. Emer
- POWER5 system microarchitecture
- Trace Processors Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, Jim Smith
- Speculative Versioning Cache Sridhar Gopal
- Focusing Processor Policies via Critical-Path Prediction Brian Fields Shai Rubin Rastislav Bodik
- Selective Value Prediction Brad Calder Glenn Reinman Dean M. Tullsen
- Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power
- Scalable Store-Load Forwarding via Store Queue Index Prediction Tingting Sha, Milo M.K. Martin, Amir Roth
- We believe the key to the 10's longevity is its basically simple, clean structure with adequately large
- We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, including
- We describe the Slice Processor micro-architecture that implements a generalized operation-based prefetching mechanism.
- A Tagless Coherence Directory Jason Zebchuk
- Cost-Effective, High-Performance Giga-Scale Checkpoint/Restore
- IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 11, NO. 4, AUGUST 2003 701 Low-Leakage Asymmetric-Cell SRAM
- Dynamic History-Length Fitting: A third level of adaptivity for branch prediction
- IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 13, NO. 7, JULY 2005 877 SPEEDUP AND GATE COUNTS OF APPROACH A, B, AND C FOR
- Reducing Memory Latency via Read-after-Read Memory Dependence Prediction
- A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy
- ECE D52 Lecture Notes: Chapter 3 1 1998 by Hill, Wood, Sohi, Smith and Vijaykumar and
- RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
- SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling
- Abstract: We identify that typical programs exhibit highly regular read-after-read (RAR) memory dependence streams.
- JETTY: Reducing Snoop-Induced Power Consumption in Small-Scale, Bus-Based SMP Systems
- Low-Leakage Asymmetric-Cell SRAM Navid Azizi
- On the Latency, Energy and Area of Checkpointed, Superscalar Register Alias Tables
- Dependence Based Prefetching for Linked Data Structures Amir Roth, Andreas Moshovos and Gurindar S. Sohi
- Memory State Compressors for Giga-Scale Checkpoint/Restore We propose a checkpoint store compression method for
- To achieve high instruction throughput, instruction schedulers must be capable of producing high-quality
- A Study of Slipstream Processors A slipstream processor reduces the length of a running
- Edgar H. Sibley Panel Editor
- MIPS R4000 Microprocessor User's Manual A-1 CPU Instruction Set Details
- Appears in Proceedings of 37th International Symposium on Microarchitecture (MICRO-37), Dec. 4-8, 2004. A mini-graph is a dataflow graph that has an arbi-
- by Linley Gwennap Intel's forthcoming P6 processor (see cover story) is
- An examination of the relation between architecture and compiler design leads to several principles which can simplify compilers
- Integrated Silicon Solution, Inc. --www.issi.com --1-800-379-4774 1 IS61LV25616AL ISSI
- We revisit memory hierarchy design viewing memory as an inter-operation communication mechanism. We show how
- Turbo-ROB: A Low Cost Checkpoint/Restore Accelerator
- Increasing the Size of Atomic Instruction Blocks using Control Flow Assertions
- 16 October 1964, Volume 146, Number 3642 SCIENCE 16 OCTOBER 1964
- We investigate instruction distribution methods for quad-cluster, dynamically-scheduled superscalar processors. We
- 101 Innovation Drive San Jose, CA 95134
- We present a number of power-aware instruction front-end (fetch/decode) throttling methods for high-performance dynami-
- Microarchitectural Innovations: Boosting Microprocessor Performance Beyond
- Exploiting Coarse Grain Non-Shared Regions in Snoopy Coherent Multiprocessors
- 40 0272-1732/00/$10.00 2000 IEEE Processors designed for computer
- POWER-AWARE REGISTER RENAMING COMPUTER ENGINEERING GROUP TECHNICAL REPORT 01-08-02
- A Dynamic Multithreading Processor Haitham Akkary
- A Case for MLP-Aware Cache Replacement Moinuddin K. Qureshi Daniel N. Lynch Onur Mutlu Yale N. Patt
- A. Moshovos (Univ. of Toronto) Instruction Set Architecture and its Implications
- TVLSI-00034-2007.R1 1 L-CBF: A Low-Power, Fast
- Improving Virtual Function Call Target Prediction via Dependence-Based Pre-Computation
- RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
- 85 0272-1732/00/$10.00 2000 IEEE There is a clear trend in personal com-
- Cache-coherent shared-memory multiprocessors have wide-ranging applica-
- DIVA: A Reliable Substrate for Deep Submicron Microarchitecture Design
- Transient-Fault Recovery Using Simultaneous Multithreading T. N. Vijaykumar, Irith Pomeranz, and Karl Cheng
- MOTOROLA INC., 1992 M68000 FAMILY
- Phantom-BTB: A Virtualized Branch Target Buffer Design Ioana Burcea