Kokkos: Enabling manycore performance portability through polymorphic memory access patterns
|
journal
|
December 2014 |
Estimating cache misses and locality using stack distances
|
conference
|
January 2003 |
Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures
|
report
|
September 2009 |
Accurate and efficient regression modeling for microarchitectural performance and power prediction
|
journal
|
October 2006 |
Predict the performance of GE with an ACO based machine learning algorithm
- Chennupati, Gopinath; Azad, R. Muhammad Atif; Ryan, Conor
-
GECCO '14: Genetic and Evolutionary Computation Conference, Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation
https://doi.org/10.1145/2598394.2609860
|
conference
|
July 2014 |
Legion: Expressing locality and independence with logical regions
- Bauer, Michael; Treichler, Sean; Slaughter, Elliott
-
2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis
https://doi.org/10.1109/SC.2012.71
|
conference
|
November 2012 |
COMPASS: A Framework for Automated Performance Modeling and Prediction
|
conference
|
January 2015 |
Hardware-independent application characterization
|
conference
|
September 2013 |
An Integrated Interconnection Network Model for Large-Scale Performance Prediction
- Ahmed, Kishwar; Obaida, Mohammad; Liu, Jason
-
SIGSIM-PADS '16: SIGSIM Principles of Advanced Discrete Simulation, Proceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
https://doi.org/10.1145/2901378.2901396
|
conference
|
May 2016 |
Predicting whole-program locality through reuse distance analysis
|
journal
|
May 2003 |
PPT-SASMM: Scalable Analytical Shared Memory Model: Predicting the Performance of Multicore Caches from a Single-Threaded Execution Trace
|
conference
|
March 2021 |
The gem5 simulator
|
journal
|
August 2011 |
The Simian concept: Parallel Discrete Event Simulation with interpreted languages and just-in-time compilation
|
conference
|
December 2015 |
New Performance Modeling Methods for Parallel Data Processing Applications
|
journal
|
July 2019 |
LogP: towards a realistic model of parallel computation
|
journal
|
July 1993 |
Miss rate prediction across all program inputs
- Zhong, Y.; Dropsho, S. G.; Ding, C.
-
12th International Conference on Parallel Architectures and Compilation Techniques. PACT 2003, Oceans 2002 Conference and Exhibition. Conference Proceedings (Cat. No.02CH37362)
https://doi.org/10.1109/PACT.2003.1238004
|
conference
|
January 2003 |
LogGP: incorporating long messages into the LogP model---one step closer towards a realistic model for parallel computation
- Alexandrov, Albert; Ionescu, Mihai F.; Schauser, Klaus E.
-
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures - SPAA '95
https://doi.org/10.1145/215399.215427
|
conference
|
January 1995 |
Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation
- Carothers, Christopher D.; Meredith, Jeremy S.; Blanco, Mark P.
-
SIGSIM-PADS '17: SIGSIM Principles of Advanced Discrete Simulation, Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
https://doi.org/10.1145/3064911.3064923
|
conference
|
May 2017 |
Parallel Application Performance Prediction Using Analysis Based Models and HPC Simulations
- Obaida, Mohammad Abu; Liu, Jason; Chennupati, Gopinath
-
SIGSIM-PADS '18: SIGSIM Principles of Advanced Discrete Simulation, Proceedings of the 2018 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
https://doi.org/10.1145/3200921.3200937
|
conference
|
May 2018 |
Cetus: A Source-to-Source Compiler Infrastructure for Multicores
|
journal
|
December 2009 |
PARDA: A Fast Parallel Reuse Distance Analysis Algorithm
- Niu, Qingpeng; Dinan, James; Lu, Qingda
-
2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2012 IEEE 26th International Parallel and Distributed Processing Symposium
https://doi.org/10.1109/IPDPS.2012.117
|
conference
|
May 2012 |
Order of Nonlinearity as a Complexity Measure for Models Generated by Symbolic Regression via Pareto Genetic Programming
|
journal
|
April 2009 |
Evolving multidimensional transformations for symbolic regression with M3GP
|
journal
|
September 2018 |
Rose: Compiler Support for Object-Oriented Frameworks
|
journal
|
June 2000 |
Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles
- Arafa, Yehia; Badawy, Abdel-Hameed; Chennupati, Gopinath
-
ICS '20: 2020 International Conference on Supercomputing, Proceedings of the 34th ACM International Conference on Supercomputing
https://doi.org/10.1145/3392717.3392761
|
conference
|
June 2020 |
The structural simulation toolkit
|
journal
|
March 2011 |
Reuse-distance-based miss-rate prediction on a per instruction basis
|
conference
|
January 2004 |
Predicting whole-program locality through reuse distance analysis
|
conference
|
January 2003 |
Evaluation techniques for storage hierarchies
|
journal
|
January 1970 |
Inferred Models for Dynamic and Sparse Hardware-Software Spaces
|
conference
|
December 2012 |
Accelerating multicore reuse distance analysis with sampling and parallelization
- Schuff, Derek L.; Kulkarni, Milind; Pai, Vijay S.
-
Proceedings of the 19th international conference on Parallel architectures and compilation techniques - PACT '10
https://doi.org/10.1145/1854273.1854286
|
conference
|
January 2010 |
Imcsim: Parameterized Performance Prediction for Implicit Monte Carlo Codes
|
conference
|
December 2018 |
Discrete event performance prediction of speculatively parallel temperature-accelerated dynamics
|
journal
|
October 2016 |
Program locality analysis using reuse distance
|
journal
|
August 2009 |
Analytical Processor Performance and Power Modeling using Micro-Architecture Independent Characteristics
|
journal
|
January 2016 |
MARSS: a full system simulator for multicore x86 CPUs
|
conference
|
January 2011 |
Palm: easing the burden of analytical performance modeling
|
conference
|
January 2014 |