
- Instruction Prefetching of Systems Codes With Layout Optimized for Reduced Cache Misses
- BulkCompiler: High-Performance Sequential Consistency through Cooperative Compiler and Hardware Support
- A Framework for Dynamic Energy Efficiency and Temperature Management
- Toward A Cost-Effective DSM Organization That Exploits Processor-Memory Integration
- Deploying Architectural Support for Software Defect Detection in Future Processors (submitted to the Workshop on the Evaluation of Software Defect Detection Tools)
- Abstract--OpenSPARC is an open source community based around hardware design and experimentation aids
- Architectural Support for Scalable Speculative Parallelization in Shared-Memory Multiprocessors
- Using a User-Level Memory Thread for Correlation Prefetching Yan Solihin, Jaejin Lee, and Josep Torrellas
- Speculative Synchronization: Applying Thread-Level Speculation to Parallel
- Hardware for Speculative Parallelization in High-End Multiprocessors
- The Impact of Speeding up Critical Sections with Data Prefetching and Forwarding
- Efficient Use of Processing Transistors for Larger On-Chip Storage: Multithreading
- The Design Complexity of Program Undo Support in a General Purpose
- Recent impressive advances in micro-processor performance have failed to deliver
- Prototyping Architectural Support for Program Rollback Using FPGAs Radu Teodorescu and Josep Torrellas
- A Near-Memory Processor for Vector, Streaming and Bit Manipulation Workloads
- Mitigating Parameter Variation with Dynamic Fine-Grain Body Biasing
- The Augmint Multiprocessor Simulation Toolkit for Intel x86 Architectures Anthony-Trung Nguyen, Maged Michael, Arun Sharma, and Josep Torrellas
- Low Perturbation Address Trace Collection with Simple Hardware Performance Monitors
- Evaluating the Performance of Cache-Affinity in Shared-Memory Multiprocessors
- A Clustered Approach to Multithreaded Processors Venkata Krishnan and Josep Torrellas
- Cache Optimization for Memory-Resident Decision Support Commercial Workloads
- Accurate and Efficient Filtering for the Intel Thread Checker Race Detector
- Wshp. on Memory Performance Issues, Intl. Symp. on Computer Architecture, June 2001. Speculative Locks for Concurrent Execution of Critical
- BulkCompiler: High-Performance Sequential Consistency through Cooperative Compiler
- Phoenix: Detecting and Recovering from Permanent Processor Design Bugs
- Upcoming Architectural Advances in DSM Machines and Their Impact on Programmability
- Optimizing Primary Data Caches for Parallel Scientific Applications: The Pool Buffer Approach
- Computer Architecture Education at the University of Illinois: Current Status and Some Thoughts
- Hardware for Speculative Parallelization of Partially-Parallel Loops in DSM Multiprocessors
- POSH: A TLS Compiler that Exploits Program Structure
- 1990 International Conference on Parallel Processing Shared Data Placement Optimizations to
- Positional Adaptation of Processors: Application to Energy Reduction
- With this special issue, IEEE Micro continues its yearly tradition of showcasing
- Architectural Support for Parallel Reduction in Scalable Shared Memory Multiprocessors
- http://iacoma.cs.uiuc.edu/ of Illinois
- Facelift: Hiding and Slowing Down Aging in Multicores Abhishek Tiwari and Josep Torrellas
- An Updated Evaluation of ReCycle Abhishek Tiwari and Josep Torrellas
- 0018-9162/99/$10.00 1999 IEEE 72 Computer
- Optimizing Instruction Cache Performance for Operating System Intensive Workloads
- Hardware and Software Support for Speculative Execution of Sequential Binaries on a Chip-Multiprocessor
- A Direct-Execution Framework for Fast and Accurate Simulation of Superscalar Processors
- Programming the FlexRAM Intelligent Memory Architecture
- An Efficient Algorithm for the Runtime Parallelization of DOACROSS Loops
- SigRace: Signature-Based Data Race Detection Abdullah Muzahid
- IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING (TDSC) 1 Using Register Lifetime Predictions to Protect
- Compiler Support for Data Forwarding in Scalable Shared-Memory Multiprocessors
- Software Trace Cache for Commercial Applications Alex Ramirez, Josep Ll. Larriba-Pey,
- Comparing Data Forwarding and Prefetching for Communication-Induced Misses in Shared-Memory MPs
- Excel-NUMA: Toward Programmability, Simplicity, and High Performance
- Cache Miss Handling For High MLP
- SoftSig: Software-Exposed Hardware Signatures for Code Analysis and Optimization
- Flexible Snooping: Adaptive Forwarding and Filtering of Snoops
- Improving the Performance of Bristled CC-NUMA Systems Using Virtual Channels and Adaptivity
- Speeding up the Memory Hierarchy in Flat COMA Multiprocessors
- ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing
- Optimization of Instruction Fetch for Decision Support Workloads Alex Ramirez, Josep Ll. Larriba-Pey,
- The Performance of the Cedar Multistage Switching Network Josep Torrellas and Zheng Zhang
- Executing Sequential Binaries on a Clustered Multithreaded Architecture with Speculation Support
- Unified Fine-Granularity Buffering of Index and Data: Approach and Implementation
- SmartApps: An Application Centric Approach to High Performance Computing
- Distance-Adaptive Update Protocols for Scalable Shared-Memory Multiprocessors
- Encyclopedia of Parallel Computing "00166" (uncorrected proof, 2011/3/21)
- Empowering Software Debugging Through Architectural Support
- Characterizing the Caching and Synchronization Performance of a Multiprocessor Operating System
- Using a User-Level Memory Thread for Correlation Prefetching Yan Solihin
- ReCycle: Pipeline Adaptation to Tolerate Process Variation Abhishek Tiwari, Smruti R. Sarangi and Josep Torrellas
- The Memory Performance of DSS Commercial Workloads in Shared-Memory Multiprocessors
- Data Forwarding in Scalable Shared-Memory Multiprocessors D. A. Koufaty, X. Chen, D. K. Poulsen and J. Torrellas
- Scal-Tool: Pinpointing and Quantifying Scalability Bottlenecks in DSM Multiprocessors
- The Performance of the Cedar Multistage Switching Network
- Programming the FlexRAM Parallel Intelligent Memory System
- Eliminating Squashes Through Learning Cross-Thread Violations in Speculative Parallelization for Multiprocessors
- FlexBulk: Intelligently Forming Atomic Blocks in Blocked-Execution
- Speculative Locks José F. Martínez and Josep Torrellas
- IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 12, DECEMBER 1998 1363 Optimizing the Instruction Cache Performance
- Empowering Software Debugging Through Architectural Support for Program Rollback
- Automatically Mapping Code on an Intelligent Memory Architecture, Yan Solihin
- Paceline: Improving Single-Thread Performance in Nanoscale CMPs through Core Overclocking
- Encyclopedia of Parallel Computing "00170" (uncorrected proof, 2011/3/8)
- A Framework for Dynamic Energy Efficiency and Temperature Management
- Design Trade-Offs in High-Throughput Coherence Controllers Anthony-Trung Nguyen
- Prototyping Architectural Support
- Reducing Remote Conflict Misses: NUMA with Remote Cache versus COMA
- Enhancing Memory Use in Simple Coma: Multiplexed Simple Coma
- Rapid Prototyping in Architecture Research using Hardware Hooks in COTS Systems
- How Processor-Memory Integration Affects the Design of DSMs Liuxi Yang, Anthony-Trung Nguyen and Josep Torrellas
- The Illinois Aggressive Coma Multiprocessor Project (IACOMA)
- Comparing the Performance of the DASH and Cedar Multiprocessors for Scientific Applications
- Unconstrained Snoop Request Delivery in Embedded-Ring Multiprocessors
- Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors
- Concurrency Control with Data Coloring Luis Ceze, Christoph von Praun, Calin Cascaval
- Tradeoffs in Buffering Memory State for Thread-Level Speculation
- SOFTSIG: SOFTWARE-EXPOSED HARDWARE SIGNATURES FOR CODE
- Hardware for Speculative Reduction Parallelization and Optimization in DSM Multiprocessors
- Software Trace Cache Alex Ramírez, Josep L. Larriba-Pey, Carlos Navarro, and Josep Torrellas
- FlexRAM Architecture Design Parameters Seung-Moon Yoo, Jose Renau, Michael Huang, and Josep Torrellas
- Rebound: Scalable Checkpointing for Coherent Shared Memory
- iWatcher
- CAP: Criticality Analysis for Power-Efficient Speculative Multithreading
- Uncorq: Unconstrained Snoop Request Delivery in Embedded-Ring Multiprocessors
- Cost-Effective Architectural Support for Rollback Recovery
- Accurate and Efficient Filtering for the Intel Thread Checker
- SigRace: Signature-Based Data Race Detection
- IEEE TRANSACTIONS ON COMPUTERS, VOL. 43, NO. 6, JUNE 1994 651 False Sharing and Spatial Locality
- Selective Re-Execution of Long-Retired Misspeculated Instructions
- Scalable Shared-Memory Architectures: Introduction to the Minitrack
- Colorama: Architectural Support for Data-Centric Synchronization
- Report No. UIUCDCS-R-2005-2633 UILU-ENG-2005-1823 A Brief Description of the NMP ISA and Benchmarks
- Toward an Advanced Intelligent Memory System
- Variation Aware Application Scheduling and Power Management
- Improving the Data Cache Performance of Multiprocessor Operating Systems
- The Need for Fast Communication in Hardware-Based Speculative Chip Multiprocessors
- Speeding up Irregular Applications in Shared-Memory Multiprocessors: Memory Binding and Group Prefetching
- Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
- Brian Greskamp, Ulya Karpuzcu, and Josep Torrellas of Illinois
- InstantCheck: Checking the determinism of
- Light64: Ligh support for data ra
- An Efficient Implementation of Tree-Based Multicast Routing for Distributed Shared-Memory Multiprocessors
- Detailed Characterization of a Quad Pentium Pro Server Running TPC-D
- EVAL: Utilizing Processors with Variation-Induced Timing Errors Smruti Sarangi, Brian Greskamp, Abhishek Tiwari, and Josep Torrellas
- FlexBulk: Intelligently Forming Atomic Blocks in Blocked-Execution Multiprocessors to Minimize Squashes
- Rebound: Scalable Checkpointing for Coherent Shared Memory
- Workshop on Advancing Computer Architecture Research (ACAR-II)
- ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
- AtomTracker: A Comprehensive Approach to Atomic Region
- LeadOut: Composing Low-Overhead Frequency-Enhancing Techniques for Single-Thread Performance in Configurable
- The Bulk Multicore Architecture for Programmability
- COVER FEATURE Published by the IEEE Computer Society 0018-9162/09/$26.00 2009 IEEE
- Architectures for Extreme-Scale Computing Josep Torrellas
- The BubbleWrap Many-Core: Popping Cores for Sequential Acceleration
- Lessons Learned During the Development of the CapoOne Deterministic Multiprocessor Replay System
- Lessons Learned During the Development of the CapoOne Deterministic
- Capo: A Software-Hardware Interface for Practical Deterministic Multiprocessor Replay
- A Software-Hardware Interface for Practical Deterministic Multiprocessor Replay
- BubbleWrap: Popping CMP Cores for Sequential Acceleration Brian Greskamp, Ulya R. Karpuzcu, and Josep Torrellas
- Designing Processors for Timing Speculation from the Ground Up
- Facelift: Hiding and Slowing Down Aging in Multicores
- Utilizing Processors with
- Recording and Deterministically Replaying Shared Memory Multiprocessor Execution Efficiently
- Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors
- An Updated Evaluation of ReCycle Abhishek Tiwari and Josep Torrellas
- OpenSPARC An Open Platform for Hardware Reliability Experimentation
- IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 21, NO. 1, FEBRUARY 2008 3 VARIUS: A Model of Process Variation and Resulting
- VARIUS: A Model of Process Variation and Resulting Timing Errors for
- Mitigating Parameter Variation with Dynamic Fine-Grain Body Biasing Radu Teodorescu, Jun Nakano, Abhishek Tiwari and Josep Torrellas
- CAP: Criticality Analysis for Power-Efficient Speculative Multithreading NC State University
- Improving Single-Thread Performance through Core
- BulkSC: Bulk Enforcement of Sequential Consistency
- ReCycle: Pipeline Adaptation to Tolerate Process Variation
- VARIUS: A Model of Parameter Variation and Resulting Timing Errors for Microarchitects
- Using Register Lifetime Predictions to Protect Register Files Against Soft Errors Pablo Montesinos, Wei Liu and Josep Torrellas
- Using Register Lifetime Predictions to Protect Register Files Against Soft Errors
- Shield: Cost-Effective Soft-Error Protection for Register Files Pablo Montesinos, Wei Liu, and Josep Torrellas
- Cost-Effective Soft Error Protection For Register Files Pablo Montesinos, Wei Liu and Josep Torrellas
- Vt Variation Effects on Lifetime Reliability
- A Model for Timing Errors in Processors with Parameter Variation Smruti R. Sarangi, Brian Greskamp, and Josep Torrellas
- PATCHING PROCESSOR DESIGN ERRORS WITH
- Phoenix: Detecting and Recovering from Permanent Processor Design Bugs with Programmable Hardware
- Designing Hardware that Supports Cycle-Accurate Deterministic Replay Brian Greskamp, Smruti R. Sarangi, and Josep Torrellas
- CADRE: Cycle-Accurate Deterministic Replay for Hardware Debugging Smruti R. Sarangi, Brian Greskamp, and Josep Torrellas
- CADRE: Cycle-Accurate Deterministic Replay for Hardware Debugging
- CAVA: Using Checkpoint-Assisted Value Prediction to Hide L2 Misses
- Backward recovery through check-pointing and rollback is a popular approach
- 4th Workshop on Memory Performance Issues - Feb/2006 Luis Ceze, James Tuck, Josep Torrellas
- ReViveI/O: Efficient Handling of I/O in Highly-Available Rollback-Recovery Servers
- Efficient Handling of I/O in Highly-Available Rollback-Recovery
- University of Illinois at Urbana-Champaign Department of Computer Science
- POSH: A Profiler-Enhanced TLS Compiler that Leverages Program Structure Wei Liu, James Tuck, Luis Ceze, Karin Strauss, Jose Renau
- A Profiler-Enhanced TLS Compiler that Leverages Program Structure
- A Near-Memory Processor (NMP) for Vector, Streaming, and Bit Manipulation Workloads
- Thread-Level Spec Can Be Ener
- Tasking with Out-of-Order Spawn in TLS Chip Multiprocessors: Microarchitecture and Compilation
- Tasking with Out-of-Order Spawn in TLS Chip Multiprocessors
- The Design Complexity of Program Undo Support in a General-Purpose Processor
- Prototyping Architectural Support for Program
- Prototyping Architectural Support for Program Rollback: An Application to Software Debugging
- Efficient and Flexible Architectural Support for Dynamic Monitoring
- High-Throughput Coherence Controllers
- Using Software Logging to Support Multi-Version Buffering in Thread-Level Speculation
- Using Software Logging To Support Multi-Version Buffering in
- ReEnact: Using TLS Mechanisms to Debug Data Races
- Correlation Prefetching with a User-Level Memory Thread
- Using a User-Level Memory Thread for Correlation Prefetching
- To appear in the Proceedings of the 29th Annual International Symposium on Computer Architecture (ISCA-29)
- Eliminating Squashes Through Learning Cross-Thread Violations in Speculative
- Automatic Code Mapping on an Intelligent Memory Architecture
- Prefetching in an Intelligent Memory Architecture Using a Helper Thread Yan Solihin
- Architectural Support for Parallel Reductions in Scalable Shared-Memory Multiprocessors
- Removing Architectural Bottlenecks to the Scalability of Speculative Parallelization
- Removing Architectural Bottlenecks to the Scalability of Speculative
- Software Logging under Speculative Parallelization
- Automatically Mapping Code on an Intelligent Memory Architecture
- Architectural Support for Scalable Speculative Parallelization in Shared-Memory Multiprocessors
- Architectural Support for Scalable Speculative Parallelization in Shared-
- Comprehensive Hardware and Software Support for Operating Systems
- A Chip-Multiprocessor Architecture with Speculative Multithreading
- Energy Efficient Hybrid Wakeup Logic Michael Huang, Jose Renau, and Josep Torrellas
- AtomTracker: A Comprehensive Approach to Atomic Region Inference and Violation Detection
- Scal-Tool: Pinpointing Scalability Bottlenecks
- Energy/Performance Design of Memory Hierarchies for Processor-in-Memory Chips
- Threshold Voltage Variation Effects on Aging-Related Hard Failure Rates
- Comparing the Power and Performance of Intel's SCC to State-of-the-Art CPUs and GPUs