OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine Learning to Improve Retrieval by Category in Big Volunteered Geodata

Abstract

Volunteered Geographic Information (VGI) is now commonly used in research and practical applications. However, quality assurance of such geographic data remains a problem. In this study we use machine learning and natural language processing to improve record retrieval by category (e.g., restaurant, museum) from Wikimapia Points of Interest data. We evaluate how well the textual information contained in VGI records determines the category label. The performance of the trained classifier is evaluated on the complete dataset and then compared with its performance on regional subsets. Preliminary analysis shows a significant difference in classifier performance across regions. Such geographic differences will have a significant effect on data enrichment efforts such as labeling entities with missing categories.
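
The evaluation design described in the abstract (train a text classifier on category labels, then compare accuracy on the complete set against regional subsets) can be illustrated with a minimal sketch. The abstract does not name the classifier or the features, so the bag-of-words Naive Bayes model below is an assumption chosen for brevity, not the authors' method. The records, the region labels "A" and "B", and all function names are hypothetical toy values, not Wikimapia data.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: whitespace tokenization of lowercased text.
    return text.lower().split()

def train_nb(records):
    """Fit a multinomial Naive Bayes model on (text, category, region) records."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category, _region in records:
        class_counts[category] += 1
        for tok in tokenize(text):
            word_counts[category][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the category with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for cat in sorted(class_counts):  # sorted order makes ties deterministic
        score = math.log(class_counts[cat] / total)
        denom = sum(word_counts[cat].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[cat][tok] + 1) / denom)
        if score > best_score:
            best, best_score = cat, score
    return best

def accuracy_by_region(model, records):
    """Accuracy on the complete set (key "all") and on each regional subset."""
    hits, totals = Counter(), Counter()
    for text, category, region in records:
        correct = predict(model, text) == category
        for key in (region, "all"):
            totals[key] += 1
            hits[key] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}

# Hypothetical toy records: (description text, category, region).
train_records = [
    ("pizza pasta restaurant", "restaurant", "A"),
    ("art history museum exhibits", "museum", "A"),
    ("cafe restaurant lunch menu", "restaurant", "B"),
    ("museum gallery paintings", "museum", "B"),
]
held_out = [
    ("pasta lunch menu", "restaurant", "A"),
    ("gallery exhibits", "museum", "A"),
    ("history paintings", "museum", "B"),
    ("old town hall", "restaurant", "B"),
]

model = train_nb(train_records)
scores = accuracy_by_region(model, held_out)
for key in sorted(scores):
    print(key, scores[key])
```

In this toy setup the last region "B" record contains no words seen in training, so the per-region scores diverge from the overall score. That is the kind of gap the regional comparison is meant to expose.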

Authors:
Sorokine, Alexandre [1]; Thakur, Gautam [1]; Palumbo, Rachel L. [1]
  1. ORNL
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1490594
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: ACM SIGSPATIAL 2018: 12th Workshop on Geographic Information Retrieval - Seattle, Washington, United States of America - 11/6/2018
Country of Publication:
United States
Language:
English

Citation Formats

Sorokine, Alexandre, Thakur, Gautam, and Palumbo, Rachel L. Machine Learning to Improve Retrieval by Category in Big Volunteered Geodata. United States: N. p., 2018. Web. doi:10.1145/3281354.3281358.
Sorokine, Alexandre, Thakur, Gautam, & Palumbo, Rachel L. Machine Learning to Improve Retrieval by Category in Big Volunteered Geodata. United States. doi:10.1145/3281354.3281358.
Sorokine, Alexandre, Thakur, Gautam, and Palumbo, Rachel L. 2018. "Machine Learning to Improve Retrieval by Category in Big Volunteered Geodata". United States. doi:10.1145/3281354.3281358. https://www.osti.gov/servlets/purl/1490594.
@article{osti_1490594,
title = {Machine Learning to Improve Retrieval by Category in Big Volunteered Geodata},
author = {Sorokine, Alexandre and Thakur, Gautam and Palumbo, Rachel L.},
abstractNote = {Volunteered Geographic Information (VGI) is now commonly used in research and practical applications. However, quality assurance of such geographic data remains a problem. In this study we use machine learning and natural language processing to improve record retrieval by category (e.g., restaurant, museum) from Wikimapia Points of Interest data. We evaluate how well the textual information contained in VGI records determines the category label. The performance of the trained classifier is evaluated on the complete dataset and then compared with its performance on regional subsets. Preliminary analysis shows a significant difference in classifier performance across regions. Such geographic differences will have a significant effect on data enrichment efforts such as labeling entities with missing categories.},
doi = {10.1145/3281354.3281358},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2018},
month = {11}
}

