OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Data Foundry: Data Warehousing and Integration for Scientific Data Management

Abstract

Data warehousing is an approach for managing data from multiple sources by representing them with a single, coherent point of view. Commercial data warehousing products have been produced by companies such as RedBrick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Other companies have chosen to develop their own in-house data warehousing solutions using relational databases, such as those sold by Oracle, IBM, Informix, and Sybase. The typical approaches include federated systems and mediated data warehouses, each of which, to some extent, makes use of a series of source-specific wrapper and mediator layers to integrate the data into a consistent format, which is then presented to users as a single virtual data store. These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, relatively little work is required to maintain that connection. However, that is not the case for all data sources. Data sources from scientific domains tend to regularly change their data model, format, and interface. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse interfaces to properly read, interpret, and represent the modified data source. Furthermore, the data that scientists require to carry out research is continuously changing as their understanding of a research question develops, or as their research objectives evolve. The difficulty and cost of these updates effectively limit the number of sources that can be integrated into a single data warehouse, or make an approach based on warehousing too expensive to consider.

Authors:
Musick, R; Critchlow, T; Ganesh, M; Fidelis, Z; Zemla, A; Slezak, T
Publication Date:
29 Feb 2000
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE Office of Defense Programs (DP) (US)
OSTI Identifier:
793555
Report Number(s):
UCRL-ID-127593
TRN: US200222%%324
DOE Contract Number:  
W-7405-Eng-48
Resource Type:
Technical Report
Resource Relation:
Other Information: PBD: 29 Feb 2000
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; BUSINESS; MANAGEMENT; DATA; DATA BASE MANAGEMENT

Citation Formats

Musick, R, Critchlow, T, Ganesh, M, Fidelis, Z, Zemla, A, and Slezak, T. Data Foundry: Data Warehousing and Integration for Scientific Data Management. United States: N. p., 2000. Web. doi:10.2172/793555.
Musick, R, Critchlow, T, Ganesh, M, Fidelis, Z, Zemla, A, & Slezak, T. Data Foundry: Data Warehousing and Integration for Scientific Data Management. United States. https://doi.org/10.2172/793555
Musick, R, Critchlow, T, Ganesh, M, Fidelis, Z, Zemla, A, and Slezak, T. 2000. "Data Foundry: Data Warehousing and Integration for Scientific Data Management". United States. https://doi.org/10.2172/793555. https://www.osti.gov/servlets/purl/793555.
@techreport{osti_793555,
title = {Data Foundry: Data Warehousing and Integration for Scientific Data Management},
author = {Musick, R and Critchlow, T and Ganesh, M and Fidelis, Z and Zemla, A and Slezak, T},
abstractNote = {Data warehousing is an approach for managing data from multiple sources by representing them with a single, coherent point of view. Commercial data warehousing products have been produced by companies such as RedBrick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Other companies have chosen to develop their own in-house data warehousing solutions using relational databases, such as those sold by Oracle, IBM, Informix, and Sybase. The typical approaches include federated systems and mediated data warehouses, each of which, to some extent, makes use of a series of source-specific wrapper and mediator layers to integrate the data into a consistent format, which is then presented to users as a single virtual data store. These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, relatively little work is required to maintain that connection. However, that is not the case for all data sources. Data sources from scientific domains tend to regularly change their data model, format, and interface. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse interfaces to properly read, interpret, and represent the modified data source. Furthermore, the data that scientists require to carry out research is continuously changing as their understanding of a research question develops, or as their research objectives evolve. The difficulty and cost of these updates effectively limit the number of sources that can be integrated into a single data warehouse, or make an approach based on warehousing too expensive to consider.},
doi = {10.2172/793555},
url = {https://www.osti.gov/biblio/793555},
institution = {Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)},
place = {United States},
year = {2000},
month = feb
}