OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Dispatching function calls across accelerator devices

Abstract

In one embodiment, a computer-implemented method for dispatching a function call includes receiving, at a supervisor processing element (PE) and from an origin PE, an identifier of a target device, a stack frame of the origin PE, and an address of a function called from the origin PE. The supervisor PE allocates a target PE of the target device. The supervisor PE copies the stack frame of the origin PE to a new stack frame on a call stack of the target PE. The supervisor PE instructs the target PE to execute the function. The supervisor PE receives a notification that execution of the function is complete. The supervisor PE copies the stack frame of the target PE to the stack frame of the origin PE. The supervisor PE releases the target PE of the target device. The supervisor PE instructs the origin PE to resume execution of the program.
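The sequence of steps in the abstract can be sketched as a small simulation. This is only an illustrative model, not the patented implementation: the names `PE`, `Device`, `Supervisor`, and `dispatch` are hypothetical, a stack frame is modeled as a plain dictionary, a Python callable stands in for the function address, and the completion notification is modeled by the callable simply returning.

```python
from dataclasses import dataclass, field


@dataclass
class PE:
    """Hypothetical processing element with its own call stack."""
    name: str
    call_stack: list = field(default_factory=list)  # each frame is a dict
    busy: bool = False
    running: bool = True


@dataclass
class Device:
    """Hypothetical accelerator device holding a pool of PEs."""
    pes: list


class Supervisor:
    """Illustrative supervisor PE; records each step it performs."""

    def __init__(self, devices):
        self.devices = devices  # device identifier -> Device
        self.steps = []

    def dispatch(self, origin, device_id, func):
        # The origin PE is suspended while the call runs elsewhere.
        origin.running = False
        origin_frame = origin.call_stack[-1]

        # Allocate a free target PE on the requested device.
        target = next((pe for pe in self.devices[device_id].pes if not pe.busy), None)
        if target is None:
            raise RuntimeError(f"no free PE on device {device_id!r}")
        target.busy = True
        self.steps.append("allocate")

        # Copy the origin's stack frame to a new frame on the target's call stack.
        target.call_stack.append(dict(origin_frame))
        self.steps.append("copy_to_target")

        # Instruct the target PE to execute the function on its own frame.
        func(target.call_stack[-1])
        self.steps.append("execute")

        # Assumption: the function returning stands in for the notification.
        self.steps.append("notified_complete")

        # Copy the target's frame back into the origin's frame.
        origin_frame.update(target.call_stack.pop())
        self.steps.append("copy_to_origin")

        # Release the target PE and let the origin resume.
        target.busy = False
        self.steps.append("release")
        origin.running = True
        self.steps.append("resume_origin")
```

For example, an origin PE whose top frame is `{"x": 3, "result": None}` can dispatch a function that writes `x * x` into `result`; after `dispatch` returns, the origin's own frame holds the computed value and the target PE is free again.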

Inventors:
Jacob, Arpith C.; Sallenave, Olivier H.
Publication Date:
2017-01-17
Research Org.:
INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1339621
Patent Number(s):
9,547,526
Application Number:
14/744,048
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION (Armonk, NY)
DOE Contract Number:
B599858
Resource Type:
Patent
Resource Relation:
Patent File Date: 2015 Jun 19
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Jacob, Arpith C., and Sallenave, Olivier H. Dispatching function calls across accelerator devices. United States: N. p., 2017. Web.
Jacob, Arpith C., & Sallenave, Olivier H. (2017). Dispatching function calls across accelerator devices. United States.
Jacob, Arpith C., and Sallenave, Olivier H. 2017. "Dispatching function calls across accelerator devices". United States. https://www.osti.gov/servlets/purl/1339621.
@article{osti_1339621,
title = {Dispatching function calls across accelerator devices},
author = {Jacob, Arpith C. and Sallenave, Olivier H.},
abstractNote = {In one embodiment, a computer-implemented method for dispatching a function call includes receiving, at a supervisor processing element (PE) and from an origin PE, an identifier of a target device, a stack frame of the origin PE, and an address of a function called from the origin PE. The supervisor PE allocates a target PE of the target device. The supervisor PE copies the stack frame of the origin PE to a new stack frame on a call stack of the target PE. The supervisor PE instructs the target PE to execute the function. The supervisor PE receives a notification that execution of the function is complete. The supervisor PE copies the stack frame of the target PE to the stack frame of the origin PE. The supervisor PE releases the target PE of the target device. The supervisor PE instructs the origin PE to resume execution of the program.},
place = {United States},
year = {2017},
month = {1}
}

  • Executing application function calls in response to an interrupt including creating a thread; receiving an interrupt having an interrupt type; determining whether a value of a semaphore represents that interrupts are disabled; if the value of the semaphore represents that interrupts are not disabled: calling, by the thread, one or more preconfigured functions in dependence upon the interrupt type of the interrupt; yielding the thread; and if the value of the semaphore represents that interrupts are disabled: setting the value of the semaphore to represent to a kernel that interrupts are hard-disabled; and hard-disabling interrupts at the kernel.
  • An aspect includes a table of contents (TOC) that was generated by a compiler being received at an accelerator device. The TOC includes an address of global data in a host memory space. The global data is copied from the address in the host memory space to an address in the device memory space. The address in the host memory space is obtained from the received TOC. The received TOC is updated to indicate that global data is stored at the address in the device memory space. A kernel that accesses the global data from the address in the device memory space is executed. The address in the device memory space is obtained based on contents of the updated TOC. When the executing is completed, the global data from the address in the device memory space is copied to the address in the host memory space.
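
The TOC-based handling of global data described in the related record above can be illustrated with a toy model. This is a sketch under loud assumptions, not the method of that patent: memories are plain dictionaries keyed by address, the TOC maps a symbol name to an address, device addresses are handed out by a trivial counter, and `run_kernel_with_toc` is a hypothetical name.

```python
def run_kernel_with_toc(toc, host_mem, device_mem, kernel):
    """Copy globals to the device, run the kernel, then copy them back."""
    # Remember the host addresses recorded in the received TOC.
    host_addrs = dict(toc)

    # Copy each global to a device address and update the TOC to point there.
    for device_addr, (symbol, host_addr) in enumerate(host_addrs.items()):
        device_mem[device_addr] = host_mem[host_addr]
        toc[symbol] = device_addr

    # The kernel looks up device addresses through the updated TOC.
    kernel(toc, device_mem)

    # When execution completes, copy each global back to its host address.
    for symbol, host_addr in host_addrs.items():
        host_mem[host_addr] = device_mem[toc[symbol]]
```

A kernel in this model is any callable that reads and writes `device_mem[toc[name]]`; its updates become visible in `host_mem` only after the final copy-back.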