OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Stochastic Engine Final Report: Applying Markov Chain Monte Carlo Methods with Importance Sampling to Large-Scale Data-Driven Simulation

Technical Report
DOI: https://doi.org/10.2172/15009813 · OSTI ID: 15009813

Accurate prediction of complex phenomena can be greatly enhanced by using data and observations to update simulations. The ability to create these data-driven simulations is limited by error and uncertainty in both the data and the simulation. The stochastic engine project addressed this problem through the development and application of a family of Markov Chain Monte Carlo methods utilizing importance sampling driven by forward simulators to minimize the time spent searching very large state spaces. The stochastic engine rapidly chooses among a very large number of hypothesized states and selects those that are consistent (within error) with all the information at hand. Predicted measurements from the simulator are used to estimate the likelihood of actual measurements, which in turn reduces the uncertainty in the original sample space via a conditional probability method called Bayesian inference. This highly efficient, staged Metropolis-type search algorithm allows us to address extremely complex problems and opens the door to solving many data-driven, nonlinear, multidimensional problems. A key challenge has been developing representation methods that integrate the local details of real data with the global physics of the simulations, enabling supercomputers to efficiently solve the problem. Development focused on large-scale problems and on examining the mathematical robustness of the approach in diverse applications. Multiple data types were combined with large-scale simulations to evaluate systems with approximately 10^20,000 possible states (detecting underground leaks at the Hanford waste tanks). The probable uses of chemical process facilities were assessed using an evidence-tree representation and in-process updating.
Other applications included tracing contaminant flow paths at the Savannah River Site, locating structural flaws in buildings, improving seismic travel-time models for systems used to monitor nuclear proliferation, characterizing the source of indistinct atmospheric plumes, and improving flash radiography. In the course of developing these applications, we also developed new methods to cluster and analyze the results of the state-space searches, as well as a number of algorithms to improve the search speed and efficiency. Our generalized solution contributes a means both to make more informed predictions of the behavior of very complex systems and to improve those predictions as events unfold, using new data in real time.
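
The core loop described in the abstract (propose a state, run a forward simulator, compare predicted with actual measurements, accept or reject) can be illustrated with a minimal sketch. This is a plain single-stage Metropolis sampler and not the report's staged, importance-sampling variant. The 1-D grid, the toy travel-time forward model, the Gaussian error model, the uniform prior, and every function and parameter name below are assumptions made for illustration only.

```python
import math
import random


def forward_simulator(state, sensors):
    """Toy forward model (hypothetical): predicted travel time from a source
    in cell `state` to each sensor, at unit speed on a 1-D grid."""
    return [abs(state - s) for s in sensors]


def log_likelihood(predicted, observed, sigma):
    """Gaussian log-likelihood of the observations given the predictions,
    up to an additive constant (assumed error model)."""
    return -0.5 * sum(((p - o) / sigma) ** 2 for p, o in zip(predicted, observed))


def metropolis_search(observed, sensors, n_states, n_steps, sigma=1.0, seed=0):
    """Sample states 0..n_states-1 under a uniform prior and return visit
    counts, which approximate the posterior over states."""
    rng = random.Random(seed)
    state = rng.randrange(n_states)
    log_l = log_likelihood(forward_simulator(state, sensors), observed, sigma)
    counts = [0] * n_states
    for _ in range(n_steps):
        # Symmetric random-walk proposal: move 1 to 3 cells left or right.
        candidate = state + rng.choice([-3, -2, -1, 1, 2, 3])
        if 0 <= candidate < n_states:
            cand_log_l = log_likelihood(
                forward_simulator(candidate, sensors), observed, sigma)
            # Metropolis rule: accept with probability min(1, L_new / L_old).
            if rng.random() < math.exp(min(0.0, cand_log_l - log_l)):
                state, log_l = candidate, cand_log_l
        # Out-of-range proposals are rejected, so the chain stays put.
        counts[state] += 1
    return counts


if __name__ == "__main__":
    sensors = [5, 20, 35, 45]
    observed = forward_simulator(17, sensors)  # noise-free synthetic data
    counts = metropolis_search(observed, sensors, n_states=50, n_steps=5000)
    best = max(range(len(counts)), key=counts.__getitem__)
    print("most visited state:", best)
```

States whose predicted measurements disagree with the observations are visited rarely, so the visit counts concentrate on hypotheses consistent with the data. In a realistic setting the forward simulator dominates the cost, which is presumably why the abstract stresses minimizing the time spent searching the state space.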

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
US Department of Energy (US)
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
15009813
Report Number(s):
UCRL-TR-202878; TRN: US200430%%1334
Resource Relation:
Other Information: PBD: 11 Mar 2004
Country of Publication:
United States
Language:
English