Technique for grouping instructions into independent strands

Mehrara, Mojtaba; Garland, Michael; Diamos, Gregory

Patent · May 9, 2017

OSTI ID: 1531988

A device compiler and linker is configured to group instructions into different strands for execution by different threads based on the dependence of those instructions on other, long-latency instructions. A thread may execute a strand that includes long-latency instructions, and then hardware resources previously allocated for the execution of that thread may be de-allocated from the thread and re-allocated to another thread. The other thread may then execute another strand while the long-latency instructions are in flight. With this approach, the other thread is not required to wait for the long-latency instructions to complete before acquiring hardware resources and initiating execution of the other strand, thereby eliminating at least a portion of the time that the other thread would otherwise spend waiting.
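
The grouping step described above can be illustrated with a minimal sketch. It is not the patented method: the `Instr` record, the `group_into_strands` function, and the cutting rule (a strand ends just before the first instruction that reads a result still being produced by a long-latency instruction in the same strand) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, zero or more sources."""
    op: str
    dest: str
    srcs: Tuple[str, ...] = ()
    long_latency: bool = False  # e.g. a load from off-chip memory


def group_into_strands(program: List[Instr]) -> List[List[Instr]]:
    """Cut a straight-line program into strands.

    Assumed rule (not taken from the patent text): close the current strand
    just before the first instruction that reads a register written by a
    long-latency instruction issued earlier in that same strand.
    """
    strands: List[List[Instr]] = []
    current: List[Instr] = []
    pending: Set[str] = set()  # registers with long-latency results in flight

    for ins in program:
        if any(src in pending for src in ins.srcs):
            # This instruction must wait for an in-flight result, so the
            # strand ends here and a new one begins.
            strands.append(current)
            current = []
            pending = set()
        current.append(ins)
        if ins.long_latency:
            pending.add(ins.dest)

    if current:
        strands.append(current)
    return strands


# Illustrative program: two loads, a dependent add, a dependent load, a multiply.
program = [
    Instr("load", "r1", ("addr_a",), long_latency=True),
    Instr("load", "r2", ("addr_b",), long_latency=True),
    Instr("add", "r3", ("r1", "r2")),
    Instr("load", "r4", ("r3",), long_latency=True),
    Instr("mul", "r5", ("r4", "r3")),
]

strands = group_into_strands(program)
for i, strand in enumerate(strands):
    print(i, [f"{ins.op} {ins.dest}" for ins in strand])
```

Under this rule, each strand ends with its long-latency instructions already issued, so a scheduler could in principle release the issuing thread's resources at the strand boundary and give them to another thread while those results are in flight.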

Research Organization: NVIDIA Corp., Santa Clara, CA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: B599861; HR0011-13-3-0001

Assignee: NVIDIA Corporation (Santa Clara, CA)

Patent Number(s): 9,645,802

Application Number: 13/961,097

Resource Relation: Patent File Date: 2013-08-07

Country of Publication: United States

Language: English
