OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A model independent safeguard against background mismodeling for statistical inference

Abstract

We propose a safeguard procedure for statistical inference that provides universal protection against mismodeling of the background. The method quantifies and incorporates the signal-like residuals of the background model into the likelihood function, using information available in a calibration dataset. This prevents possible false discovery claims that may arise through unknown mismodeling, and corrects the bias in limit setting created by overestimated or underestimated background. We demonstrate how the method removes the bias created by an incomplete background model using three realistic case studies.
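The idea in the abstract can be illustrated with a minimal toy sketch. It assumes, without confirmation from this record, that the safeguard takes the form of a signal-shaped component with fraction `eps` mixed into the nominal background model, and that `eps` is constrained by a likelihood term built from the calibration data. The function names, the toy shapes `f_b` and `f_s`, the grid scan, and the deterministic "calibration sample" are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math

def safeguarded_pdf(x, eps, f_b, f_s):
    """Nominal background pdf with a signal-shaped fraction eps mixed in (assumed form)."""
    return (1.0 - eps) * f_b(x) + eps * f_s(x)

def calibration_nll(eps, calib, f_b, f_s):
    """Unbinned negative log-likelihood of background-only calibration events."""
    total = 0.0
    for x in calib:
        p = safeguarded_pdf(x, eps, f_b, f_s)
        if p <= 0.0:
            return math.inf  # eps value makes the pdf unphysical here
        total -= math.log(p)
    return total

def combined_nll(mu_s, mu_b, eps, science, calib, f_b, f_s):
    """Extended NLL of the science data plus the calibration constraint on eps."""
    total = mu_s + mu_b
    for x in science:
        rate = mu_s * f_s(x) + mu_b * safeguarded_pdf(x, eps, f_b, f_s)
        if rate <= 0.0:
            return math.inf
        total -= math.log(rate)
    return total + calibration_nll(eps, calib, f_b, f_s)

def fit_eps(calib, f_b, f_s, lo=-0.5, hi=0.5, steps=201):
    """Grid scan for the eps that minimizes the calibration NLL."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda e: calibration_nll(e, calib, f_b, f_s))

# Toy setup on x in [0, 1] (hypothetical shapes).
f_b = lambda x: 1.0              # nominal (incomplete) background model: flat
f_s = lambda x: 2.0 * (1.0 - x)  # signal model: falling spectrum

def true_background_quantile(u):
    """Inverse CDF of the true background, 0.9 * flat + 0.1 * signal-like."""
    return (1.1 - math.sqrt(1.21 - 0.4 * u)) / 0.2

# Deterministic stand-in for a calibration sample: evenly spaced quantiles.
calib = [true_background_quantile((i + 0.5) / 500) for i in range(500)]

eps_hat = fit_eps(calib, f_b, f_s)
print(f"fitted safeguard fraction: {eps_hat:.3f}")
```

In this toy, the true background contains a 10% signal-like contribution that the nominal flat model misses. The fitted `eps` absorbs that residual, so a subsequent fit of `mu_s` with `combined_nll` would not attribute it to signal. Negative values of `eps` are allowed in the scan so that an overestimated signal-like background could be corrected as well.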

Authors:
Priel, Nadav; Landsman, Hagar; Manfredini, Alessandro; Budnik, Ranny [1]; Rauch, Ludwig [2]
  1. Department of Particle Physics and Astrophysics, Weizmann Institute of Science, Herzl St. 234, Rehovot (Israel)
  2. Teilchen- und Astroteilchenphysik, Max-Planck-Institut für Kernphysik, Saupfercheckweg 1, 69117 Heidelberg (Germany)
Publication Date:
2017-05-01
OSTI Identifier:
22676228
Resource Type:
Journal Article
Resource Relation:
Journal Name: Journal of Cosmology and Astroparticle Physics; Journal Volume: 2017; Journal Issue: 05; Other Information: Country of input: International Atomic Energy Agency (IAEA)
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; CALIBRATION; DATASETS; FIELD THEORIES; SIMULATION

Citation Formats

Priel, Nadav, Landsman, Hagar, Manfredini, Alessandro, Budnik, Ranny, and Rauch, Ludwig. A model independent safeguard against background mismodeling for statistical inference. United States: N. p., 2017. Web. doi:10.1088/1475-7516/2017/05/013.
Priel, Nadav, Landsman, Hagar, Manfredini, Alessandro, Budnik, Ranny, & Rauch, Ludwig. A model independent safeguard against background mismodeling for statistical inference. United States. doi:10.1088/1475-7516/2017/05/013.
Priel, Nadav, Landsman, Hagar, Manfredini, Alessandro, Budnik, Ranny, and Rauch, Ludwig. 2017. "A model independent safeguard against background mismodeling for statistical inference". United States. doi:10.1088/1475-7516/2017/05/013.
@article{osti_22676228,
title = {A model independent safeguard against background mismodeling for statistical inference},
author = {Priel, Nadav and Landsman, Hagar and Manfredini, Alessandro and Budnik, Ranny and Rauch, Ludwig},
abstractNote = {We propose a safeguard procedure for statistical inference that provides universal protection against mismodeling of the background. The method quantifies and incorporates the signal-like residuals of the background model into the likelihood function, using information available in a calibration dataset. This prevents possible false discovery claims that may arise through unknown mismodeling, and corrects the bias in limit setting created by overestimated or underestimated background. We demonstrate how the method removes the bias created by an incomplete background model using three realistic case studies.},
doi = {10.1088/1475-7516/2017/05/013},
journal = {Journal of Cosmology and Astroparticle Physics},
number = 05,
volume = 2017,
place = {United States},
year = {2017},
month = {May}
}
  • We examine the absorption of cosmic microwave background (CMB) photons by formaldehyde (H₂CO) over cosmic time. The K-doublet rotational transitions of H₂CO become 'refrigerated' (their excitation temperatures are driven below the CMB temperature) via collisional pumping by molecular hydrogen (H₂). 'Anti-inverted' H₂CO line ratios thus provide an accurate measurement of the H₂ density in molecular clouds. Using a radiative transfer model, we demonstrate that H₂CO centimeter-wavelength line excitation and detectability are nearly independent of redshift or gas kinetic temperature. Since the H₂CO K-doublet lines absorb CMB light, and since the CMB lies behind every galaxy and provides an exceptionally uniform extended illumination source, H₂CO is a distance-independent, extinction-free, molecular gas mass-limited tracer of dense gas in galaxies. A Formaldehyde Deep Field could map the history of cosmic star formation in a uniquely unbiased fashion and may be possible with large-bandwidth, wide-field radio interferometers, whereby the silhouettes of star-forming galaxies would be detected across the epoch of galaxy evolution. We also examine the possibility that H₂CO lines may provide a standardizable galaxy ruler for cosmology, similar to the Sunyaev-Zel'dovich effect in galaxy clusters but applicable to much higher redshifts and larger samples. Finally, we explore how anti-inverted meter-wave H₂CO lines in galaxies during the peak of cosmic star formation may contaminate H I 21 cm tomography of the Epoch of Reionization.
  • Indicator cokriging is an alternative to disjunctive kriging for estimation of spatial distributions. One way to determine which of these techniques is more accurate for estimation of spatial distributions is to apply each to a particular type of data. A procedure is developed for evaluation of disjunctive kriging and indicator cokriging for such an application. Application of this procedure to earthquake ground motion data found disjunctive kriging to be at least as accurate as indicator cokriging for estimation of spatial distributions for peak horizontal acceleration. Indicator cokriging was superior for all other types of earthquake ground motion data.
  • We describe an approximate statistical model for the sample variance distribution of the nonlinear matter power spectrum that can be calibrated from limited numbers of simulations. Our model retains the common assumption of a multivariate normal distribution for the power spectrum band powers but takes full account of the (parameter-dependent) power spectrum covariance. The model is calibrated using an extension of the framework in Habib et al. (2007) to train Gaussian processes for the power spectrum mean and covariance given a set of simulation runs over a hypercube in parameter space. We demonstrate the performance of this machinery by estimating the parameters of a power-law model for the power spectrum. Within this framework, our calibrated sample variance distribution is robust to errors in the estimated covariance and shows rapid convergence of the posterior parameter constraints with the number of training simulations.
  • Models represent our primary method for integration of small-scale, process-level phenomena into a comprehensive description of forest-stand or ecosystem function. They also represent a key method for testing hypotheses about the response of forest ecosystems to multiple changing environmental conditions. This paper describes the evaluation of 13 stand-level models varying in their spatial, mechanistic, and temporal complexity for their ability to capture intra- and interannual components of the water and carbon cycle for an upland, oak-dominated forest of eastern Tennessee. Comparisons between model simulations and observations were conducted for hourly, daily, and annual time steps. Data for the comparisons were obtained from a wide range of methods including eddy covariance, sapflow, chamber-based soil respiration, biometric estimates of stand-level net primary production and growth, and soil water content by time or frequency domain reflectometry. Response surfaces of carbon and water flux as a function of environmental drivers, and a variety of goodness-of-fit statistics (bias, absolute bias, and model efficiency), were used to judge model performance. No single model consistently performed best at all time steps or for all variables considered. Intermodel comparisons showed good agreement for water cycle fluxes, but considerable disagreement among models for predicted carbon fluxes. The mean of all model outputs, however, was nearly always the best fit to the observations. Not surprisingly, models missing key forest components or processes, such as roots or modeled soil water content, were unable to provide accurate predictions of ecosystem responses to short-term drought phenomena. Nevertheless, an inability to correctly capture short-term physiological processes under drought was not necessarily an indicator of poor annual water and carbon budget simulations. This is possible because droughts in the subject ecosystem were of short duration and therefore had a small cumulative impact. Models using hourly time steps and detailed mechanistic processes, and having a realistic spatial representation of the forest ecosystem, provided the best predictions of observed data. Predictive ability of all models deteriorated under drought conditions, suggesting that further work is needed to evaluate and improve ecosystem model performance under unusual conditions, such as drought, that are a common focus of environmental change discussions.