skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Assessing Spatial and Attribute Errors of Input Data in Large National Datasets for use in Population Distribution Models

Journal Article · · GeoJournal

Geospatial technologies and digital data have developed and disseminated rapidly in conjunction with increasing computing performance and internet availability. The ability to store and transmit large datasets has encouraged the development of national datasets in geospatial format. National datasets are used by numerous agencies for analysis and modeling purposes because these datasets are standardized, and are considered to be of acceptable accuracy. At Oak Ridge National Laboratory, a national population model incorporating multiple ancillary variables was developed and one of the inputs required is a school database. This paper examines inaccuracies present within two national school datasets, TeleAtlas North America (TANA) and National Center of Education Statistics (NCES). Schools are an important component of the population model, because they serve as locations containing dense clusters of vulnerable populations. It is therefore essential to validate the quality of the school input data, which was made possible by increasing national coverage of high resolution imagery. Schools were also chosen since a 'real-world' representation of K-12 schools for the Philadelphia School District was produced; thereby enabling 'ground-truthing' of the national datasets. Analyses found the national datasets not standardized and incomplete, containing 76 to 90% of existing schools. The temporal accuracy of enrollment values of updating national datasets resulted in 89% inaccuracy to match 2003 data. Spatial rectification was required for 87% of the NCES points, of which 58% of the errors were attributed to the geocoding process. Lastly, it was found that by combining the two national datasets together, the resultant dataset provided a more useful and accurate solution. Acknowledgment Prepared by Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, Tennessee 37831-6285, managed by UT-Battelle, LLC for the U. S. Department of Energy undercontract no. DEAC05-00OR22725. Copyright This manuscript has been authored by employees of UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department of Energy. Accordingly, the United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
Sponsoring Organization:
Work for Others (WFO)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
978759
Journal Information:
GeoJournal, Vol. 69, Issue 1-2; ISSN 0343-2521
Country of Publication:
United States
Language:
English