
- University of Maryland Institute for Advanced Computer Studies Technical Report UMIACS-TR-2006-42
- Multicore Performance Optimization Using Partner Cores, Jason E Miller
- Early Experience with Profiling and Optimizing Distributed Shared Cache Performance on Tilera's Tile Processor
- To appear in ACM Transactions on Computer Systems. A General Framework for Prefetch Scheduling in
- To appear in Proceedings of the 2006 International Symposium on Computer Architecture (ISCA-XXXIII), Boston, MA. Learning-Based SMT Processor Resource Distribution via Hill-Climbing
- Appears in Proceedings of the 3rd Workshop on Dependable Architectures, Lake Como, Italy. Nov. 2008. Exploiting Value Prediction for Fault Tolerance
- University of Maryland Institute for Advanced Computer Studies Technical Report UMIACS-TR-2003-07
- Hill-Climbing SMT Processor Resource Scheduler Seungryul Choi Donald Yeung
- University of Maryland Institute for Advanced Computer Studies Technical Report UMIACS-TR-2001-49
- Appears in Proc. of the 18th Int'l Conf. on Parallel Architectures and Compilation Techniques. Raleigh, NC. Sept. 2009. Using Aggressor Thread Information to
- BioBench: A Benchmark Suite of Bioinformatics Applications Kursad Albayraktaroglu, Aamer Jaleel, Xue Wu, Manoj Franklin, Bruce Jacob,
- Journal of Instruction-Level Parallelism 5 (2003) 1-35 Submitted 10/02; published 04/03 Optimizing SMT Processors for High Single-Thread
- The MIT Alewife Machine: Architecture and Performance Anant Agarwal, Ricardo Bianchini, David Chaiken, Kirk L. Johnson,
- Appears in Proceedings of the 10th Annual International Conference on Parallel Architectures and Compilation Techniques, Barcelona, Spain, September 2001.
- Multigrain Shared Memory Donald Yeung
- Appears in Proceedings of the 13th Annual International Conference on Supercomputing, June 1999. The Scalability of Multigrain Systems
- Appears in Workshop on Architectural Support for Gigascale Integration, Boston, MA. June 2006. Exploiting Soft Computing for Increased Fault Tolerance
- Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors
- Appears in Proceedings of the 11th Annual International Conference on Parallel Architectures and Compilation Techniques, Charlottesville, VA, September 2002.
- Appears in Proceedings of the 23rd Annual International Symposium on Computer Architecture, May 1996. MGS: A Multigrain Shared Memory System
- Appears in Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993. Experience with Fine-Grain Synchronization in
- University of Maryland Technical Report UMIACS-TR-2008-13 Probabilistic Replacement: Enabling Flexible Use
- University of Maryland Technical Report UMIACS-TR-2007-33 Enhancing LTP-Driven Cache Management
- University of Maryland Technical Report UMIACS-TR-2006-36 Application-Level Correctness and its Impact on Fault Tolerance
- How to Choose the Grain Size of a Parallel Computer Donald Yeung, William J. Dally, and Anant Agarwal
- Scalability of Multicast Communication over Wide-Area Networks Donald Yeung
- Low-Cost Support for Fine-Grain Synchronization in Multiprocessors
- Appears in the 4th Workshop on Complexity-Effective Design, San Diego, CA, June 2003. Exploiting Application-Level Information to
- Multi-Chain Prefetching: Exploiting Memory Parallelism in Pointer-Chasing Codes
- University of Maryland Systems & Computer Architecture Group technical report UMD-SCA-TR-2000-01. Multi-Chain Prefetching: Exploiting Natural
- SimpleFit: A Framework for Analyzing Design Trade-Offs in Raw Architectures
- Journal of Instruction-Level Parallelism 11 (2009) 1-24 Submitted 10/08; published 4/09 Enhancing LTP-Driven Cache Management
- Evaluating the Impact of Memory System Performance on Software Prefetching and Locality Optimizations
- Exploring Optimal Cost-Performance Designs for Raw Microprocessors
- Journal of Instruction-Level Parallelism 1 (2004) XX-YY Submitted 5/03; published 6/04 The Efficacy of Software Prefetching and Locality Optimizations
- Journal of Instruction-Level Parallelism 10 (2008) 1-28 Submitted 6/08; published 9/08 Exploiting Application-Level Correctness for
- Appears in Proceedings of the Tenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X), October 2002, San Jose, CA.
- Hill-Climbing SMT Processor Resource Distribution
- The MIT Alewife Machine ANANT AGARWAL, MEMBER, IEEE, RICARDO BIANCHINI, MEMBER, IEEE,
- TRANSFERRING PERFORMANCE GAIN FROM SOFTWARE PREFETCHING TO ENERGY REDUCTION
- Appears in Proc. of the 13th Int'l Symp. on High-Performance Computer Architecture, Phoenix, AZ. Feb. 2007. Application-Level Correctness and its Impact on Fault Tolerance
- University of Maryland Institute for Advanced Computer Studies Technical Report UMIACS-TR-2002-64
- Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors Anant Agarwal, John Kubiatowicz, David Kranz,
- A Study of Source-Level Compiler Algorithms for Automatic Construction of Pre-Execution Code
- University of Maryland Technical Report UMIACS-TR-2009-16 Scaling Single-Program Performance on Large-Scale
- University of Maryland Technical Report UMIACS-TR-2010-10 Memory Performance Analysis for Parallel Programs
- Experience with Improving Distributed Shared Cache Performance on Tilera's Tile Processor
- Appears in Proceedings of the 2006 International Symposium on Computer Architecture (ISCA-XXXIII), Boston, MA. Learning-Based SMT Processor Resource Distribution via Hill-Climbing
- To appear in Proc. of the 20th Int'l Conf. on Parallel Architectures and Compilation Techniques. Galveston Island, TX. October 2011.