Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions
Abstract
Data competitions have become a popular and cost-effective approach for crowdsourcing versatile solutions from diverse expertise. Current practice relies on the simple leaderboard scoring based on a given set of competition data for ranking competitors and distributing the prize. However, a disadvantage of this practice in many competitions is that a slight difference in the scores due to the natural variability of the observed data could result in a much larger difference in the prize amounts. In this article, we propose a new strategy to quantify the uncertainty in the rankings and scores from using different data sets that share common characteristics with the provided competition data. By using a bootstrap approach to generate many comparable data sets, the new method has four advantages over current practice. During the competition, it provides a mechanism for competitors to get feedback about the uncertainty in their relative ranking. After the competition, it allows the host to gain a deeper understanding of the algorithms' performance and robustness across representative data sets. It also offers a transparent mechanism for prize distribution to reward competitors with superior and robust performance more fairly. Finally, it has the additional advantage of being able to explore what results might have looked like if competition goals evolved from their original choices. The implementation of the strategy is illustrated with a real data competition hosted by Topcoder on urban radiation search.
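As a concrete illustration of the kind of uncertainty quantification the abstract describes, the sketch below applies a fractional random-weight bootstrap (one of this record's subject keywords, implemented here via Dirichlet weights over test cases) to estimate the distribution of leaderboard ranks. The competitor names, score distributions, and loss structure are invented for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-case losses for three competitors on a shared test set
# (lower is better). In a real competition these would come from scoring
# each submission against the held-out competition data.
n_cases = 200
losses = {
    "alice": rng.normal(1.00, 0.30, n_cases),
    "bob":   rng.normal(1.02, 0.30, n_cases),  # nearly tied with alice
    "carol": rng.normal(1.50, 0.30, n_cases),  # clearly worse
}

def rank_distribution(losses, n_boot=2000, rng=rng):
    """Fractional random-weight bootstrap of leaderboard ranks.

    Instead of resampling test cases with replacement, each replicate
    draws continuous Dirichlet weights over the cases, recomputes each
    competitor's weighted mean loss, and ranks the competitors.
    Returns a dict mapping competitor -> array of ranks (1 = best).
    """
    names = list(losses)
    loss_mat = np.vstack([losses[n] for n in names])  # shape (k, n_cases)
    n = loss_mat.shape[1]
    ranks = {name: np.empty(n_boot, dtype=int) for name in names}
    for b in range(n_boot):
        w = rng.dirichlet(np.ones(n))   # fractional weights summing to 1
        scores = loss_mat @ w           # weighted mean loss per competitor
        order = scores.argsort()        # lowest (best) score first
        for r, idx in enumerate(order, start=1):
            ranks[names[idx]][b] = r
    return ranks

ranks = rank_distribution(losses)
for name, r in ranks.items():
    print(name, {k: round(float(np.mean(r == k)), 3) for k in (1, 2, 3)})
```

In this toy setup the bootstrap typically shows the near-tied pair splitting ranks 1 and 2 with substantial probability each, while the weaker competitor is ranked last in almost every replicate, which is the kind of feedback about ranking uncertainty the proposed strategy would report on a leaderboard.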
- Authors:
- Lu, Lu
- Univ. of South Florida, Tampa, FL (United States)
- Anderson-Cook, Christine Michaela
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Publication Date:
- Wed Oct 14, 2020
- Research Org.:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA). Office of Defense Nuclear Nonproliferation R&D
- OSTI Identifier:
- 1774438
- Report Number(s):
- LA-UR-20-22405
Journal ID: ISSN 0898-2112
- Grant/Contract Number:
- 89233218CNA000001
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Quality Engineering
- Additional Journal Information:
- Journal Volume: 33; Journal Issue: 2; Journal ID: ISSN 0898-2112
- Publisher:
- American Society for Quality Control
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 42 ENGINEERING; fractional random-weight bootstrap; ordinary bootstrap; prize allocation; private leaderboard; public leaderboard; relative ranking; resampling
Citation Formats
Lu, Lu, and Anderson-Cook, Christine Michaela. Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions. United States: N. p., 2020.
Web. doi:10.1080/08982112.2020.1808222.
Lu, Lu, & Anderson-Cook, Christine Michaela. Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions. United States. https://doi.org/10.1080/08982112.2020.1808222
Lu, Lu, and Anderson-Cook, Christine Michaela. Wed Oct 14, 2020. "Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions". United States. https://doi.org/10.1080/08982112.2020.1808222. https://www.osti.gov/servlets/purl/1774438.
@article{osti_1774438,
title = {Incorporating uncertainty for enhanced leaderboard scoring and ranking in data competitions},
author = {Lu, Lu and Anderson-Cook, Christine Michaela},
abstractNote = {Data competitions have become a popular and cost-effective approach for crowdsourcing versatile solutions from diverse expertise. Current practice relies on the simple leaderboard scoring based on a given set of competition data for ranking competitors and distributing the prize. However, a disadvantage of this practice in many competitions is that a slight difference in the scores due to the natural variability of the observed data could result in a much larger difference in the prize amounts. In this article, we propose a new strategy to quantify the uncertainty in the rankings and scores from using different data sets that share common characteristics with the provided competition data. By using a bootstrap approach to generate many comparable data sets, the new method has four advantages over current practice. Furthermore, during the competition, it provides a mechanism for competitors to get feedback about the uncertainty in their relative ranking. After the competition, it allows the host to gain a deeper understanding of the algorithm performance and their robustness across representative data sets. It also offers a transparent mechanism for prize distribution to reward the competitors more fairly with superior and robust performance. Finally, it has the additional advantage of being able to explore what results might have looked like if competition goals evolved from their original choices. The implementation of the strategy is illustrated with a real data competition hosted by Topcoder on urban radiation search.},
doi = {10.1080/08982112.2020.1808222},
journal = {Quality Engineering},
number = 2,
volume = 33,
place = {United States},
year = {2020},
month = {10}
}
Works referenced in this record:
Improved learning from data competitions through strategic design of training and test data sets
journal, May 2019
- Anderson-Cook, Christine M.; Lu, Lu; Myers, Kary L.
- Quality Engineering, Vol. 31, Issue 4
How to Host An Effective Data Competition: Statistical Advice for Competition Design and Analysis
journal, February 2019
- Anderson-Cook, Christine M.; Myers, Kary L.; Lu, Lu
- Statistical Analysis and Data Mining: The ASA Data Science Journal, Vol. 12, Issue 4
Non-uniform space filling (NUSF) designs
journal, February 2020
- Lu, Lu; Anderson-Cook, Christine M.; Ahmed, Towfiq
- Journal of Quality Technology
Applications of the Fractional-Random-Weight Bootstrap
journal, April 2020
- Xu, Li; Gotwalt, Chris; Hong, Yili
- The American Statistician, Vol. 74, Issue 4
Bootstrap Methods: Another Look at the Jackknife
journal, January 1979
- Efron, B.
- The Annals of Statistics, Vol. 7, Issue 1