DOE PAGES
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions

Abstract

Data competitions have become a popular and cost-effective approach for crowdsourcing versatile solutions from diverse expertise. Current practice relies on simple leaderboard scoring based on a given set of competition data for ranking competitors and distributing the prize. However, a disadvantage of this practice in many competitions is that a slight difference in scores, due to the natural variability of the observed data, can result in a much larger difference in prize amounts. In this article, we propose a new strategy to quantify the uncertainty in the rankings and scores that would arise from using different data sets sharing common characteristics with the provided competition data. By using a bootstrap approach to generate many comparable data sets, the new method has four advantages over current practice. First, during the competition, it provides a mechanism for competitors to get feedback about the uncertainty in their relative ranking. Second, after the competition, it allows the host to gain a deeper understanding of the algorithms' performance and robustness across representative data sets. Third, it offers a transparent mechanism for prize distribution that more fairly rewards competitors with superior and robust performance. Finally, it makes it possible to explore what the results might have looked like if the competition goals had evolved from their original choices. The implementation of the strategy is illustrated with a real data competition on urban radiation search hosted by Topcoder.
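The core idea described in the abstract, scoring each competitor on many bootstrap-generated data sets and summarizing the resulting distribution of ranks, can be sketched in a few lines of Python. The following is a minimal illustration, not the authors' implementation: the mean-of-per-case-scores leaderboard rule, the function name, and the competitor data are all hypothetical.

import numpy as np

rng = np.random.default_rng(seed=0)

def bootstrap_rank_probs(scores_per_case, n_boot=1000):
    # scores_per_case: (n_competitors, n_cases); the leaderboard score is
    # assumed to be the mean over test cases, with higher being better.
    n_comp, n_cases = scores_per_case.shape
    rank_counts = np.zeros((n_comp, n_comp))
    for _ in range(n_boot):
        # Resample test cases with replacement to build a comparable
        # data set sharing characteristics with the observed one.
        idx = rng.integers(0, n_cases, size=n_cases)
        boot_scores = scores_per_case[:, idx].mean(axis=1)
        # Record each competitor's rank on this bootstrap data set.
        for rank, comp in enumerate(np.argsort(-boot_scores)):
            rank_counts[comp, rank] += 1
    # Row i gives the fraction of replicates in which competitor i
    # finished at each rank (rank 0 = best).
    return rank_counts / n_boot

# Hypothetical example: 4 competitors, 50 test cases, nearly tied leaders.
scores = rng.normal(loc=[[0.80], [0.78], [0.77], [0.60]], scale=0.1, size=(4, 50))
print(np.round(bootstrap_rank_probs(scores), 2))

Rows of the resulting matrix make the abstract's point concrete: competitors separated by small score differences typically trade the top ranks across bootstrap replicates, so a winner-take-all prize based on the single observed data set would hinge largely on noise.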

Authors:
 Lu, Lu [1]; Anderson-Cook, Christine Michaela [2]
  1. Univ. of South Florida, Tampa, FL (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
October 14, 2020
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA). Office of Defense Nuclear Nonproliferation R&D
OSTI Identifier:
1774438
Report Number(s):
LA-UR-20-22405
Journal ID: ISSN 0898-2112
Grant/Contract Number:  
89233218CNA000001
Resource Type:
Accepted Manuscript
Journal Name:
Quality Engineering
Additional Journal Information:
Journal Volume: 33; Journal Issue: 2; Journal ID: ISSN 0898-2112
Publisher:
American Society for Quality Control
Country of Publication:
United States
Language:
English
Subject:
42 ENGINEERING; fractional random-weight bootstrap; ordinary bootstrap; prize allocation; private leaderboard; public leaderboard; relative ranking; resampling
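The subject terms name both the ordinary bootstrap and the fractional random-weight bootstrap. As a contrast with the resampling sketch given after the abstract, here is a minimal, hypothetical illustration of fractional random weights, under the same assumed mean-score leaderboard rule: every test case receives a continuous nonzero weight instead of being included or dropped whole.

import numpy as np

rng = np.random.default_rng(seed=1)

def frw_bootstrap_scores(scores_per_case, n_boot=1000):
    # Fractional random-weight replicates of mean leaderboard scores.
    # Exponential(1) draws normalized to sum to n_cases give fractional
    # weights, so no test case is ever assigned weight zero.
    n_comp, n_cases = scores_per_case.shape
    replicates = np.empty((n_boot, n_comp))
    for b in range(n_boot):
        w = rng.exponential(1.0, size=n_cases)
        w *= n_cases / w.sum()
        replicates[b] = scores_per_case @ w / n_cases  # weighted mean score
    return replicates  # one row of competitor scores per replicate

Keeping every case in every replicate matters when a few rare but essential test cases, such as unusual radiation sources in the urban search competition, should influence every comparable data set.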

Citation Formats

Lu, Lu, and Anderson-Cook, Christine Michaela. Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions. United States: N. p., 2020. Web. doi:10.1080/08982112.2020.1808222.
Lu, Lu, & Anderson-Cook, Christine Michaela. Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions. United States. https://doi.org/10.1080/08982112.2020.1808222
Lu, Lu, and Anderson-Cook, Christine Michaela. 2020. "Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions". United States. https://doi.org/10.1080/08982112.2020.1808222. https://www.osti.gov/servlets/purl/1774438.
@article{osti_1774438,
title = {Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions},
author = {Lu, Lu and Anderson-Cook, Christine Michaela},
abstractNote = {Data competitions have become a popular and cost-effective approach for crowdsourcing versatile solutions from diverse expertise. Current practice relies on the simple leaderboard scoring based on a given set of competition data for ranking competitors and distributing the prize. However, a disadvantage of this practice in many competitions is that a slight difference in the scores due to the natural variability of the observed data could result in a much larger difference in the prize amounts. In this article, we propose a new strategy to quantify the uncertainty in the rankings and scores from using different data sets that share common characteristics with the provided competition data. By using a bootstrap approach to generate many comparable data sets, the new method has four advantages over current practice. Furthermore, during the competition, it provides a mechanism for competitors to get feedback about the uncertainty in their relative ranking. After the competition, it allows the host to gain a deeper understanding of the algorithm performance and their robustness across representative data sets. It also offers a transparent mechanism for prize distribution to reward the competitors more fairly with superior and robust performance. Finally, it has the additional advantage of being able to explore what results might have looked like if competition goals evolved from their original choices. The implementation of the strategy is illustrated with a real data competition hosted by Topcoder on urban radiation search.},
doi = {10.1080/08982112.2020.1808222},
journal = {Quality Engineering},
number = 2,
volume = 33,
place = {United States},
year = {2020},
month = {oct}
}

Works referenced in this record:

Improved learning from data competitions through strategic design of training and test data sets
journal, May 2019


How to Host An Effective Data Competition: Statistical Advice for Competition Design and Analysis
journal, February 2019

  • Anderson‐Cook, Christine M.; Myers, Kary L.; Lu, Lu
  • Statistical Analysis and Data Mining: The ASA Data Science Journal, Vol. 12, Issue 4
  • DOI: 10.1002/sam.11404

Non-uniform space filling (NUSF) designs
journal, February 2020


Applications of the Fractional-Random-Weight Bootstrap
journal, April 2020


Bootstrap Methods: Another Look at the Jackknife
journal, January 1979

  • Efron, B.
  • The Annals of Statistics, Vol. 7, Issue 1
  • DOI: 10.1214/aos/1176344552