Improve Learning from Crowds via Generative Augmentation

Chu, Zhendong; Wang, Hongning

doi:10.1145/3447548.3467409


Conference · August 14, 2021 · Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

DOI: https://doi.org/10.1145/3447548.3467409 · OSTI ID: 1822655

Chu, Zhendong [1]; Wang, Hongning [2]

[1] University of Virginia, Charlottesville, VA, USA; Computer Science, University of Virginia
[2] University of Virginia, Charlottesville, VA, USA

Crowdsourcing provides an efficient label collection scheme for supervised machine learning. However, to control annotation cost, each instance in the crowdsourced data is typically annotated by only a small number of annotators. This creates a sparsity issue and limits the quality of machine learning models trained on such data. In this paper, we study how to handle sparsity in crowdsourced data using data augmentation. Specifically, we propose to directly learn a classifier by augmenting the raw sparse annotations. We implement two principles of high-quality augmentation using Generative Adversarial Networks: 1) the generated annotations should follow the distribution of authentic ones, which is measured by a discriminator; 2) the generated annotations should have high mutual information with the ground-truth labels, which is measured by an auxiliary network. Extensive experiments and comparisons against an array of state-of-the-art learning-from-crowds methods on three real-world datasets demonstrate the effectiveness of our data augmentation framework. The results also show the potential of our algorithm for low-budget crowdsourcing in general.
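The two augmentation principles in the abstract can be read as two terms in a generator objective. The sketch below is only an illustration under assumptions, not the paper's actual loss: it uses an InfoGAN-style formulation in which the mutual-information term is approximated by the cross-entropy of an auxiliary network that tries to recover the conditioning label from a generated annotation. The function name `generator_objective`, its arguments, and the weight `lam` are all hypothetical.

```python
import numpy as np

def generator_objective(disc_scores, aux_probs, labels, lam=1.0):
    """Hypothetical two-term generator loss (lower is better).

    disc_scores: discriminator outputs in (0, 1) for generated annotations;
                 values near 1 mean "looks authentic".
    aux_probs:   auxiliary-network class probabilities, shape (n, n_classes),
                 predicted from the generated annotations.
    labels:      integer class labels the generator was conditioned on.
    lam:         assumed weight on the mutual-information term.
    """
    eps = 1e-12  # guards against log(0)
    disc_scores = np.asarray(disc_scores, dtype=float)
    aux_probs = np.asarray(aux_probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Principle 1: generated annotations should look authentic.
    # Non-saturating adversarial loss: small when the discriminator is fooled.
    adv_loss = -np.mean(np.log(disc_scores + eps))

    # Principle 2: generated annotations should be informative about the label.
    # Cross-entropy of the auxiliary network on the conditioning label;
    # minimizing it raises a variational lower bound on mutual information.
    picked = aux_probs[np.arange(len(labels)), labels]
    mi_loss = -np.mean(np.log(picked + eps))

    return adv_loss + lam * mi_loss, adv_loss, mi_loss

# Usage: two generated annotations, binary labels.
total, adv, mi = generator_objective(
    disc_scores=[0.9, 0.8],
    aux_probs=[[0.7, 0.3], [0.2, 0.8]],
    labels=[0, 1],
)
```

In this reading, a generator that only fools the discriminator could still emit annotations unrelated to the label; the second term penalizes that case, and `lam` trades the two goals off against each other.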

Research Organization: University of Virginia

Sponsoring Organization: U.S. Department of Energy

DOE Contract Number: EE0008227

OSTI ID: 1822655

Report Number(s): DOE-UVA-0008227-5; 1718216,1553568

Conference Information: Journal Name: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

Country of Publication: United States

Language: English



Related Subjects

96 KNOWLEDGE MANAGEMENT AND PRESERVATION
Crowdsourcing
generative adversarial nets
label noise
