System, method, and computer program product for bulk synchronous binary program translation and optimization
Abstract
A system, method, and computer program product are provided for bulk synchronous binary program translation and optimization. The method includes the steps of executing a block of translated binary instructions by multiple threads and gathering profiling data during execution of the block. The threads are then synchronized at a barrier instruction associated with the block, and the block is replaced with optimized binary instructions produced from the profiling data.
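The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it models "binary instructions" as Python functions, and the names `translated_block`, `optimize`, `code_cache`, and `run_demo` are hypothetical, introduced only for illustration. Each thread runs an instrumented block and records profiling data, all threads meet at a barrier, and the barrier action swaps in a block built from the merged profile.

```python
import threading
from collections import Counter

NUM_THREADS = 4   # assumed thread count for the demo
NUM_INPUTS = 40   # assumed amount of work, split across threads


def translated_block(x, profile):
    """Stand-in for an unoptimized translated block, instrumented for profiling."""
    if x % 2 == 0:
        profile["even"] += 1
        return x // 2
    profile["odd"] += 1
    return 3 * x + 1


def optimize(profile):
    """Hypothetical optimizer: drop instrumentation and test the hotter branch first."""
    even_hot = profile["even"] >= profile["odd"]

    def optimized_block(x, profile=None):
        if even_hot:
            return x // 2 if x % 2 == 0 else 3 * x + 1
        return 3 * x + 1 if x % 2 else x // 2

    return optimized_block


def run_demo():
    code_cache = {"block": translated_block}
    profiles = [Counter() for _ in range(NUM_THREADS)]
    before = [None] * NUM_THREADS
    after = [None] * NUM_THREADS

    def replace_block():
        # Runs in exactly one thread once every thread has reached the barrier.
        merged = sum(profiles, Counter())
        code_cache["block"] = optimize(merged)

    barrier = threading.Barrier(NUM_THREADS, action=replace_block)

    def worker(tid):
        inputs = list(range(tid, NUM_INPUTS, NUM_THREADS))
        # Phase 1: execute the translated block and gather per-thread profiling data.
        before[tid] = [code_cache["block"](x, profiles[tid]) for x in inputs]
        # Synchronize; the block is replaced while all threads are stopped here.
        barrier.wait()
        # Phase 2: every thread now executes the optimized replacement.
        after[tid] = [code_cache["block"](x) for x in inputs]

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return before, after, code_cache, profiles
```

Performing the replacement inside the barrier action means no thread can be executing the old block while it is swapped out, which is the property the barrier provides in the abstract's description.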
- Inventors:
- Diamos, Gregory Frederick
- Issue Date:
- Research Org.:
- NVIDIA Corp., Santa Clara, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1532140
- Patent Number(s):
- 9207919
- Application Number:
- 14/158,749
- Assignee:
- NVIDIA Corporation (Santa Clara, CA)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B599861; HR0011-13-3-0001
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2014-01-17
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Diamos, Gregory Frederick. System, method, and computer program product for bulk synchronous binary program translation and optimization. United States: N. p., 2015. Web.
Diamos, Gregory Frederick. System, method, and computer program product for bulk synchronous binary program translation and optimization. United States.
Diamos, Gregory Frederick. Tue .
"System, method, and computer program product for bulk synchronous binary program translation and optimization". United States. https://www.osti.gov/servlets/purl/1532140.
@article{osti_1532140,
title = {System, method, and computer program product for bulk synchronous binary program translation and optimization},
author = {Diamos, Gregory Frederick},
abstractNote = {A system, method, and computer program product are provided for bulk synchronous binary program translation and optimization. The method includes the steps of executing a block of translated binary instructions by multiple threads and gathering profiling data during execution of the block. The threads are then synchronized at a barrier instruction associated with the block, and the block is replaced with optimized binary instructions produced from the profiling data.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2015},
month = {12}
}