Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
Bastoul, Cedric, et al. "System, methods and apparatus for program optimization for multi-threaded processor architectures." US 8,930,926, United States Patent and Trademark Office, Jan. 2015.
Bastoul, Cedric, Lethin, Richard A., Leung, Allen K., Meister, Benoit J., Szilagyi, Peter, Vasilache, Nicolas T., & Wohlford, David E. (2015). System, methods and apparatus for program optimization for multi-threaded processor architectures (U.S. Patent No.
Bastoul, Cedric, Lethin, Richard A., Leung, Allen K., et al., "System, methods and apparatus for program optimization for multi-threaded processor architectures," US 8,930,926, issued January 5, 2015.
@misc{osti_1167014,
author = {Bastoul, Cedric and Lethin, Richard A. and Leung, Allen K. and Meister, Benoit J. and Szilagyi, Peter and Vasilache, Nicolas T. and Wohlford, David E.},
title = {System, methods and apparatus for program optimization for multi-threaded processor architectures},
annote = {Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.},
url = {https://www.osti.gov/biblio/1167014},
place = {United States},
year = {2015},
month = {01},
note = {US Patent