U.S. Department of Energy
Office of Scientific and Technical Information

Intern-Artificial Intelligence Benchmarking

Journal Article · No journal information
OSTI ID:3014039
Benchmarks provide a standardized method for evaluating AI models, enabling reproducibility and comparison between models and facilitating scientific progress. As AI models continue to develop rapidly, incorporating new datasets, capabilities, and architectures becomes more complicated, and current static benchmarks become increasingly irrelevant. The MLCommons team argues that keeping AI benchmarks relevant requires making the benchmarks themselves more dynamic, together with technical innovations that make it easier for scientists and researchers at all levels to use and contribute to them. The current progress in technical innovation is software that outputs a detailed view of a collection of AI benchmarks in various formats that are easily readable and accessible.
Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
US Department of Energy
DOE Contract Number:
89243024CSC000002
OSTI ID:
3014039
Report Number(s):
FERMILAB-PUB-25-0495-STUDENT; oai:inspirehep.net:3109111
Journal Information:
No journal information
Country of Publication:
United States
Language:
English

Similar Records

AI Benchmark Democratization and Carpentry
Journal Article · Thu Dec 11 23:00:00 EST 2025 · No journal information · OSTI ID:3008660

An MLCommons Scientific Benchmarks Ontology
Journal Article · Wed Nov 05 23:00:00 EST 2025 · No journal information · OSTI ID:3004873

Adversarial Artificial Intelligence: State of the Malpractice
Journal Article · Sun Dec 01 23:00:00 EST 2019 · Journal of Information Warfare · OSTI ID:1595267
