Title: Where Big Data and Prediction Meet

Our ability to assemble and analyze massive data sets, often referred to as “big data”, is an increasingly important tool for shaping national policy. This in turn has introduced issues ranging from privacy concerns to cyber security. But as IBM’s John Kelly emphasized in the last Innovation, making sense of the vast arrays of data will require radically new computing tools. In the past, technologies and tools for analysis of big data were viewed as quite different from the traditional realm of high performance computing (HPC), with its huge models of phenomena such as global climate or those supporting the nuclear test moratorium. Looking ahead, this will change, with very positive benefits for both worlds. Societal issues such as global security, economic planning and genetic analysis demand increased understanding that goes beyond existing data analysis and reduction. The modeling world often produces simulations that are complex compositions of mathematical models and experimental data. This has resulted in outstanding successes such as the annual assessment of the state of the US nuclear weapons stockpile without underground nuclear testing. Ironically, while many tests were conducted historically, this body of data provides only modest insight into the underlying physics of the system. A great deal of emphasis was thus placed on the level of confidence we can develop for the predictions. As data analytics and simulation come together, there is a growing need to assess the confidence levels in both the data being gathered and the complex models used to make predictions. An example of this is assuring the security or optimizing the performance of critical infrastructure systems such as the power grid. If one wants to understand the vulnerabilities of the system or the impacts of predicted threats, full-scale tests of the grid against threat scenarios are unlikely. Preventive measures would need to be predicated on well-defined margins of confidence in order to take mitigating actions that could have wide-ranging impacts. There is a rich opportunity for interaction and exchange between the HPC simulation and data analytics communities.
 [1] ;  [2] ;  [3] ;  [4] ;  [5]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  3. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  4. USDOE, Washington, DC (United States)
  5. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
OSTI Identifier:
Report Number(s):
DOE Contract Number:
Resource Type:
Technical Report
Research Org:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org:
Country of Publication:
United States