DOE PAGES

Title: Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata

Techniques such as Iterative Proportional Fitting have been previously suggested as a means to generate new data with the demographic granularity of individual surveys and the spatial granularity of small area tabulations of censuses and surveys. This article explores internal and external validation approaches for synthetic, small area, household- and individual-level microdata using a case study for Bangladesh. Using data from the Bangladesh Census 2011 and the Demographic and Health Survey, we produce estimates of infant mortality rate and other household attributes for small areas using a variation of an iterative proportional fitting method called P-MEDM. We conduct an internal validation to determine: whether the model accurately recreates the spatial variation of the input data, how each of the variables performed overall, and how the estimates compare to the published population totals. We conduct an external validation by comparing the estimates with indicators from the 2009 Multiple Indicator Cluster Survey (MICS) for Bangladesh to benchmark how well the estimates compared to a known dataset which was not used in the original model. The results indicate that the estimation process is viable for regions that are better represented in the microdata sample, but also revealed the possibility of strong overfitting in sparsely sampled sub-populations.
Authors:
 Rose, Amy N. [1]; Nagle, Nicholas N. [2]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computational Sciences and Engineering Division
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computational Sciences and Engineering Division; Univ. of Tennessee, Knoxville, TN (United States). Dept. of Geography
Publication Date:
Grant/Contract Number:
AC05-00OR22725
Type:
Accepted Manuscript
Journal Name:
Computers, Environment and Urban Systems
Additional Journal Information:
Journal Volume: 63; Journal Issue: C; Journal ID: ISSN 0198-9715
Publisher:
Elsevier
Research Org:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org:
USDOE; Work for Others (WFO)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 42 ENGINEERING; population; small area estimation; P-MEDM; IPF; microdata; validation; Reweighting; DHS
OSTI Identifier:
1349598

Rose, Amy N., and Nagle, Nicholas N. Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata. United States: N. p., Web. doi:10.1016/j.compenvurbsys.2016.07.006.
Rose, Amy N., & Nagle, Nicholas N. Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata. United States. doi:10.1016/j.compenvurbsys.2016.07.006.
Rose, Amy N., and Nagle, Nicholas N. 2016. "Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata". United States. doi:10.1016/j.compenvurbsys.2016.07.006. https://www.osti.gov/servlets/purl/1349598.
@article{osti_1349598,
title = {Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata},
author = {Rose, Amy N. and Nagle, Nicholas N.},
abstractNote = {Techniques such as Iterative Proportional Fitting have been previously suggested as a means to generate new data with the demographic granularity of individual surveys and the spatial granularity of small area tabulations of censuses and surveys. This article explores internal and external validation approaches for synthetic, small area, household- and individual-level microdata using a case study for Bangladesh. Using data from the Bangladesh Census 2011 and the Demographic and Health Survey, we produce estimates of infant mortality rate and other household attributes for small areas using a variation of an iterative proportional fitting method called P-MEDM. We conduct an internal validation to determine: whether the model accurately recreates the spatial variation of the input data, how each of the variables performed overall, and how the estimates compare to the published population totals. We conduct an external validation by comparing the estimates with indicators from the 2009 Multiple Indicator Cluster Survey (MICS) for Bangladesh to benchmark how well the estimates compared to a known dataset which was not used in the original model. The results indicate that the estimation process is viable for regions that are better represented in the microdata sample, but also revealed the possibility of strong overfitting in sparsely sampled sub-populations.},
doi = {10.1016/j.compenvurbsys.2016.07.006},
journal = {Computers, Environment and Urban Systems},
number = {C},
volume = 63,
place = {United States},
year = {2016},
month = {8}
}