OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Accelerating Big Data Infrastructure and Applications (Ongoing collaboration)

Authors:
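Brown, K; Xu, T; Iwabuchi, K; Sato, K; Moody, A; Mohror, K; Jain, N; Bhatele, A; Schulz, M; Pearce, R; Gokhale, M; Matsuoka, S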
Publication Date:
2017-03-21
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1363858
Report Number(s):
LLNL-CONF-727471
DOE Contract Number:
AC52-07NA27344
Resource Type:
Conference
Resource Relation:
Conference: Presented at: The 1st US-Japan Workshop on Collaborative Global Research on Applying Information Technology, Atlanta, GA, United States, Jun 05 - Jun 06, 2017
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Brown, K, Xu, T, Iwabuchi, K, Sato, K, Moody, A, Mohror, K, Jain, N, Bhatele, A, Schulz, M, Pearce, R, Gokhale, M, and Matsuoka, S. Accelerating Big Data Infrastructure and Applications (Ongoing collaboration). United States: N. p., 2017. Web.
Brown, K, Xu, T, Iwabuchi, K, Sato, K, Moody, A, Mohror, K, Jain, N, Bhatele, A, Schulz, M, Pearce, R, Gokhale, M, & Matsuoka, S. Accelerating Big Data Infrastructure and Applications (Ongoing collaboration). United States.
Brown, K, Xu, T, Iwabuchi, K, Sato, K, Moody, A, Mohror, K, Jain, N, Bhatele, A, Schulz, M, Pearce, R, Gokhale, M, and Matsuoka, S. 2017. "Accelerating Big Data Infrastructure and Applications (Ongoing collaboration)". United States. https://www.osti.gov/servlets/purl/1363858.
@article{osti_1363858,
title = {Accelerating Big Data Infrastructure and Applications (Ongoing collaboration)},
author = {Brown, K and Xu, T and Iwabuchi, K and Sato, K and Moody, A and Mohror, K and Jain, N and Bhatele, A and Schulz, M and Pearce, R and Gokhale, M and Matsuoka, S},
place = {United States},
year = {2017},
month = {3}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

  • The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns, and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety, and complexity of enterprise data keep increasing, next-generation analysts face several challenges in the knowledge extraction process. To address these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, among many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage in making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.
  • In the last three decades, there has been exponential growth in information technology serving the information processing needs of data-driven businesses in government, science, and private industry: capturing, staging, integrating, conveying, analyzing, and transferring data that help knowledge workers and decision makers make sound business decisions. Data integration across enterprise warehouses is one of the most challenging steps in the big data analytics strategy. Several levels of data integration have been identified across enterprise warehouses: data accessibility, common data platform, and consolidated data model. Each level of integration has its own set of complexities that requires a certain amount of time, budget, and resources to implement. Such levels of integration are designed to address the technical challenges inherent in consolidating the disparate data sources. In this paper, we present a methodology based on industry best practices to measure the readiness of an organization and its data sets against the different levels of data integration. We introduce a new Integration Level Model (ILM) tool, which is used for quantifying an organization's and data system's readiness to share data at a certain level of data integration. It is based largely on the established and accepted framework provided in the Data Management Association (DAMA-DMBOK). It comprises several key data management functions and supporting activities, together with several environmental elements that describe and apply to each function. The proposed model scores the maturity of a system's data governance processes and provides a pragmatic methodology for evaluating integration risks. The higher the computed scores, the better managed the source data system and the greater the likelihood that the data system can be brought in at a higher level of integration.
  • Critical infrastructure systems (CIs) such as energy, water, transportation, and communication are highly interconnected and mutually dependent in complex ways. Robust modeling of CI interconnections is crucial to identify vulnerabilities in the CIs. We present here a national-scale Infrastructure Vulnerability Analysis System (IVAS) vision leveraging Semantic Big Data (SBD) tools, Big Data, and Geographical Information Systems (GIS) tools. We survey existing approaches to vulnerability analysis of critical infrastructures and discuss relevant systems and tools aligned with our vision. Next, we present a generic system architecture and discuss challenges including: (1) constructing and managing a CI network-of-networks graph, (2) performing analytic operations at scale, and (3) interactive visualization of analytic output to generate meaningful insights. We argue that this architecture acts as a baseline to realize a national-scale, network-based vulnerability analysis system.
  • On the occasion of the restructuring of the committees at the Nuclear Energy Agency (OECD, Paris), the newly formed Nuclear Energy Agency Nuclear Science Committee (NEA-NSC) took over some of the activities of the former Nuclear Energy Agency Nuclear Data Committee (NEA-NDC). Among these activities were two Interlaboratory Collaborations, one on an important standard, the ¹⁰B(n,α) cross-section, the other on measurements of activation cross-sections. Progress of these two NEA-NSC Interlaboratory Collaborations is reported.
  • In this paper, we address the problem of data confidentiality in big data analytics. In many fields, many useful patterns can be extracted by applying machine learning techniques to big data. However, data confidentiality must be protected. In many scenarios, data confidentiality could well be a prerequisite for data to be shared. We present a scheme to provide provably secure data confidentiality and discuss various techniques to optimize the performance of such a system.