The Globus Compute dataset: An open function-as-a-service dataset from the edge to the cloud

Journal Article · Future Generations Computer Systems
 [1];  [2];  [3];  [2];  [2];  [4];  [5];  [1]
  1. Univ. of Chicago, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
  2. Univ. of Chicago, IL (United States)
  3. Argonne National Laboratory (ANL), Argonne, IL (United States)
  4. Northeastern Univ., Boston, MA (United States)
  5. Argonne National Laboratory (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States)
Here we present a unique function-as-a-service (FaaS) dataset capturing the use of the Globus Compute (previously funcX) platform. Globus Compute implements a federated model in which users deploy endpoints on arbitrary remote computers, from the edge to high-performance computing (HPC) clusters, and then invoke Python functions on those endpoints via a reliable cloud-hosted service. The dataset covers 31 weeks and includes 2,121,472 task submissions from 252 users, executed on 580 remote computing endpoints. It also includes 277,386 registered functions. We describe the dataset and report various observations. Some are similar to those from other FaaS datasets, for example, that 74% of tasks run for less than 1 s. Others are unique to Globus Compute, for example, that endpoints are used in different ways and that the majority of functions are related to scientific computing and machine learning. To the best of our knowledge, this is the first federated FaaS dataset that includes user workloads, distributed computing endpoints, and analysis of registered function bodies. We expect the dataset to be useful for research on FaaS architectures, workload modeling, container warming, and other distributed computing architectures.
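To give a sense of scale, the headline figures in the abstract can be turned into rough per-user, per-endpoint, and per-week averages. The following is a minimal sketch that uses only the numbers quoted above; the function name `summarize` and the dictionary keys are illustrative and are not part of the dataset or of Globus Compute. The sub-second count assumes the 74% figure applies to all task submissions.

```python
# Headline figures as quoted in the abstract.
TASKS = 2_121_472        # task submissions
USERS = 252              # distinct users
ENDPOINTS = 580          # remote computing endpoints
FUNCTIONS = 277_386      # registered functions
WEEKS = 31               # collection period
SUB_SECOND_SHARE = 0.74  # share of tasks running for less than 1 s


def summarize() -> dict[str, float]:
    """Return simple averages derived from the abstract's totals.

    These are means over the whole dataset; they say nothing about
    how unevenly tasks are spread across users or endpoints.
    """
    return {
        "tasks_per_user": TASKS / USERS,
        "tasks_per_endpoint": TASKS / ENDPOINTS,
        "tasks_per_week": TASKS / WEEKS,
        "tasks_per_function": TASKS / FUNCTIONS,
        "functions_per_user": FUNCTIONS / USERS,
        # Assumption: the 74% share applies to all submissions.
        "sub_second_tasks": TASKS * SUB_SECOND_SHARE,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

These averages are only a first-order summary. The abstract notes that endpoints are used in different ways, so the actual per-endpoint and per-user distributions are likely far from uniform.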
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
National Science Foundation (NSF); USDOE
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
2571432
Journal Information:
Journal Name: Future Generations Computer Systems; Vol. 153; ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
