U.S. Department of Energy
Office of Scientific and Technical Information

The HEPCloud Facility: elastic computing for High Energy Physics – The NOvA Use Case

Conference · OSTI ID:1346929
The demand for computing in the HEP community follows cycles of peaks and valleys, driven mainly by conference dates, accelerator shutdowns, holiday schedules, and other factors. Because of this, the classical method of provisioning these resources at the facilities that provide them has drawbacks such as potential overprovisioning. As the appetite for computing increases, however, so does the need to maximize cost efficiency by developing a model for dynamically provisioning resources only when needed. To address this issue, the HEPCloud project was launched by the Fermilab Scientific Computing Division in June 2015. Its goal is to develop a facility that provides a common interface to a variety of resources, including local clusters, grids, high performance computers, and community and commercial clouds. The experiments targeted initially include CMS and NOvA, as well as other Fermilab stakeholders. In its first phase, the project has demonstrated the use of the "elastic" provisioning model offered by commercial clouds, such as Amazon Web Services. In this model, resources are rented and provisioned automatically over the Internet upon request. In January 2016, the project demonstrated the ability to increase the total amount of global CMS resources by 58,000 cores from 150,000 cores (a 25 percent increase) in preparation for the Rencontres de Moriond. In March 2016, the NOvA experiment also demonstrated resource burst capabilities with an additional 7,300 cores, reaching a scale almost four times as large as the locally allocated resources and using the local AWS S3 storage to optimize data handling operations and costs. In preparation for the Neutrino 2016 conference, NOvA used the same familiar services it uses for local computations, such as data handling and job submission. In both cases, the cost was contained by the use of the Amazon Spot Instance Market and the Decision Engine, a HEPCloud component that aims to minimize cost and job interruption.
This paper describes the Fermilab HEPCloud Facility and the challenges overcome for the CMS and NOvA communities.
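As an illustration of the elastic model summarized in the abstract, the following is a minimal Python sketch of two ideas: renting only the cores that local resources cannot cover, and preferring the rented pool with the lowest cost once expected job interruptions are taken into account. It is not the HEPCloud Decision Engine. The names `SpotOffer`, `cores_to_rent`, `effective_cost`, and `choose_offer`, the prices, and the assumption that an interrupted job is rerun in full are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SpotOffer:
    """Hypothetical description of one rentable resource pool."""
    name: str
    price_per_core_hour: float  # illustrative price, not real market data
    interruption_rate: float    # assumed chance a job is preempted, 0 <= rate < 1


def cores_to_rent(demand_cores: int, local_cores: int) -> int:
    """Rent only the cores that local resources cannot cover."""
    return max(0, demand_cores - local_cores)


def effective_cost(offer: SpotOffer) -> float:
    """Price scaled by expected reruns, assuming a preempted job restarts in full."""
    return offer.price_per_core_hour / (1.0 - offer.interruption_rate)


def choose_offer(offers: list[SpotOffer]) -> SpotOffer:
    """Pick the offer with the lowest interruption-adjusted cost."""
    return min(offers, key=effective_cost)
```

Under these assumptions, a pool with a low price but frequent preemptions can lose to a slightly more expensive pool that interrupts jobs less often, which is the kind of trade-off between cost and job interruption that the abstract attributes to the Decision Engine.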
Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
DOE Contract Number:
AC02-07CH11359
OSTI ID:
1346929
Report Number(s):
FERMILAB-CONF-16-643-CD; 1517408
Country of Publication:
United States
Language:
English

Similar Records

Virtual machine provisioning, code management, and data movement design for the Fermilab HEPCloud Facility
Journal Article · Sat Sep 30 20:00:00 EDT 2017 · Journal of Physics. Conference Series · OSTI ID:1423235

HEPCloud Operations at Fermilab—The First Five Years
Conference · Tue Dec 31 23:00:00 EST 2024 · EPJ Web Conf. · OSTI ID:3009886

HEPCloud, an Elastic Hybrid HEP Facility using an Intelligent Decision Support System
Conference · Thu Apr 18 00:00:00 EDT 2019 · OSTI ID:1513286
