Title: Experience with Remote Job Execution

Conference
OSTI ID: 947593

The Neutron Science Portal at Oak Ridge National Laboratory submits jobs to the TeraGrid for remote execution. The TeraGrid is a network of high-performance computers supported by the US National Science Foundation. Its eleven partner facilities provide over a petaflop of peak computing performance and sixty petabytes of long-term storage. Globus is installed on a local machine and used for job submission. The graphical user interface is generated by Java code that reads an XML file. After submission, the status of the job is displayed in a Job Information Service window, which queries Globus for the status. The output folder produced in the scratch directory of the TeraGrid machine is returned to the portal with the globus-url-copy command, which uses the GridFTP servers on the TeraGrid machines. This folder is then copied from the stage-in directory of the community account to the user's results directory, where the output can be plotted using the portal's visualization services. The primary difficulty with remote job execution is diagnosing failures. We run daily tests that submit multiple remote jobs from the portal. When these jobs fail on a computer, it is difficult to diagnose the problem from the Globus output. Successes and problems will be presented.
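
The submit, poll, and retrieve steps described in the abstract can be illustrated with a minimal sketch. It is not the portal's implementation (the abstract says the portal uses Java). It assumes the pre-WS GRAM command-line tools of the Globus Toolkit (globus-job-submit and globus-job-status) alongside the globus-url-copy command named above. The host names, paths, and function names are hypothetical, and the status strings are the ones globus-job-status is assumed to print. The sketch only builds the command lines and does not run them.

```python
from typing import List

# Hypothetical endpoints, not taken from the abstract.
GRAM_CONTACT = "tg-login.example.org/jobmanager-pbs"
GRIDFTP_HOST = "gridftp.example.org"


def submit_command(executable: str, args: List[str]) -> List[str]:
    """Build argv for a batch submission (assumed to print a job contact URL)."""
    return ["globus-job-submit", GRAM_CONTACT, executable, *args]


def status_command(job_contact: str) -> List[str]:
    """Build argv for querying the state of a previously submitted job."""
    return ["globus-job-status", job_contact]


def is_finished(status_output: str) -> bool:
    """Treat DONE and FAILED as terminal states (assumed status strings)."""
    return status_output.strip() in {"DONE", "FAILED"}


def fetch_command(remote_dir: str, local_dir: str) -> List[str]:
    """Build argv for copying the remote output folder over GridFTP."""
    # Both URLs end with "/" so that they are treated as directories.
    src = f"gsiftp://{GRIDFTP_HOST}{remote_dir.rstrip('/')}/"
    dst = f"file://{local_dir.rstrip('/')}/"
    # -r copies recursively; -cd creates missing destination directories.
    return ["globus-url-copy", "-r", "-cd", src, dst]
```

In a real run, each list would be passed to `subprocess.run(..., capture_output=True, text=True)`, and keeping both stdout and stderr from every call is one way to retain more context when a remote job fails.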

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Spallation Neutron Source (SNS)
Sponsoring Organization:
Work for Others (WFO)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
947593
Resource Relation:
Conference: NOBUGS 2008, Sydney, Australia, 3-5 November 2008
Country of Publication:
United States
Language:
English