U.S. Department of Energy
Office of Scientific and Technical Information

Classifying and rating AI benchmarks

Journal Article · No journal information
OSTI ID:3019254
We developed a set of standards for evaluating AI benchmarks objectively and efficiently. Although AI benchmarks have become prevalent, especially in recent years, there is no single accepted way to measure their effectiveness. The MLCommons team has provided a set of criteria for evaluating benchmarks, but the criteria lack clearly defined evaluation rules. We created a rubric with preset factors for evaluating a benchmark’s quality efficiently and objectively, along with a software framework that processes lists of benchmarks for visualization. Together, the framework and rating system allow researchers to quickly check whether their benchmarks are effective.
Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
US Department of Energy
DOE Contract Number:
89243024CSC000002
OSTI ID:
3019254
Report Number(s):
FERMILAB-PUB-25-0478-STUDENT; oai:inspirehep.net:2963459
Journal Information:
No journal information
Country of Publication:
United States
Language:
English

Similar Records

Ranking and Classifying AI Benchmarks
Conference · Tue Aug 05 20:00:00 EDT 2025 · No journal information · OSTI ID:3019412

An MLCommons Scientific Benchmarks Ontology
Journal Article · Wed Nov 05 23:00:00 EST 2025 · No journal information · OSTI ID:3004873

AI Benchmark Democratization and Carpentry
Journal Article · Thu Dec 11 23:00:00 EST 2025 · No journal information · OSTI ID:3008660
