U.S. Department of Energy
Office of Scientific and Technical Information

Ranking and Classifying AI Benchmarks

Conference
DOI: https://doi.org/10.2172/3019412 · OSTI ID: 3019412
We created a set of standards for evaluating AI benchmarks objectively and efficiently. Although AI benchmarks have become prevalent, especially in recent years, there is no single agreed way to measure their effectiveness. The MLCommons team provided a set of criteria for evaluating benchmarks, but those criteria lack clearly defined evaluation rules. We created a rubric with preset factors to evaluate a benchmark's quality efficiently and objectively. We also created a software framework that processes lists of benchmarks for visualization. Together, the framework and rating system allow researchers to quickly check whether their benchmarks are effective.
Research Organization:
Cornell U.; Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
US Department of Energy
DOE Contract Number:
89243024CSC000002
OSTI ID:
3019412
Report Number(s):
FERMILAB-POSTER-25-0117-STUDENT; oai:inspirehep.net:2958943
Resource Type:
Conference poster
Country of Publication:
United States
Language:
English

Similar Records

Classifying and rating AI benchmarks
Journal Article · Sun Aug 24 20:00:00 EDT 2025 · No journal information · OSTI ID:3019254

An MLCommons Scientific Benchmarks Ontology
Journal Article · Wed Nov 05 23:00:00 EST 2025 · No journal information · OSTI ID:3004873

AI Benchmark Democratization and Carpentry
Journal Article · Thu Dec 11 23:00:00 EST 2025 · No journal information · OSTI ID:3008660

Related Subjects