OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Data Foundry: Data Warehousing and Integration for Scientific Data Management

Technical Report · DOI: https://doi.org/10.2172/793555 · OSTI ID: 793555

Data warehousing is an approach for managing data from multiple sources by representing them with a single, coherent point of view. Commercial data warehousing products have been produced by companies such as Red Brick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Other companies have chosen to develop their own in-house data warehousing solutions using relational databases, such as those sold by Oracle, IBM, Informix and Sybase. The typical approaches include federated systems and mediated data warehouses, each of which, to some extent, makes use of a series of source-specific wrapper and mediator layers to integrate the data into a consistent format, which is then presented to users as a single virtual data store. These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, relatively little work is required to maintain that connection. However, that is not the case for all data sources. Data sources from scientific domains tend to change their data model, format and interface regularly. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse interfaces to properly read, interpret, and represent the modified data source. Furthermore, the data that scientists require to carry out research changes continuously as their understanding of a research question develops, or as their research objectives evolve. The difficulty and cost of these updates effectively limit the number of sources that can be integrated into a single data warehouse, or make an approach based on warehousing too expensive to consider.
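
The wrapper and mediator layering described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report: the Record schema, the two sources, their field names, and the Mediator class are all hypothetical and chosen only to show how source-specific wrappers feed a single virtual view.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    """Common warehouse format (hypothetical schema, not from the report)."""
    source: str
    identifier: str
    description: str


# A wrapper translates one source's native rows into Record objects.
Wrapper = Callable[[Iterable[dict]], List[Record]]
Fetch = Callable[[], Iterable[dict]]


def wrap_source_a(rows: Iterable[dict]) -> List[Record]:
    # Hypothetical source A exposes the keys "id" and "desc".
    return [Record("A", str(r["id"]), r["desc"]) for r in rows]


def wrap_source_b(rows: Iterable[dict]) -> List[Record]:
    # Hypothetical source B exposes the keys "accession" and "summary".
    return [Record("B", str(r["accession"]), r["summary"]) for r in rows]


class Mediator:
    """Presents all registered sources as one virtual data store."""

    def __init__(self) -> None:
        self._sources: Dict[str, Tuple[Wrapper, Fetch]] = {}

    def register(self, name: str, wrapper: Wrapper, fetch: Fetch) -> None:
        self._sources[name] = (wrapper, fetch)

    def query(self, keyword: str) -> List[Record]:
        # Case-insensitive keyword match over every source's wrapped records.
        needle = keyword.lower()
        hits: List[Record] = []
        for wrapper, fetch in self._sources.values():
            hits.extend(r for r in wrapper(fetch()) if needle in r.description.lower())
        return hits


mediator = Mediator()
mediator.register("A", wrap_source_a, lambda: [{"id": 1, "desc": "protein kinase"}])
mediator.register("B", wrap_source_b, lambda: [{"accession": "X9", "summary": "Kinase inhibitor"}])
hits = mediator.query("kinase")
```

In this sketch, a source that renames a field (for example, if source B replaced "accession" with another key) would cause its wrapper to raise a KeyError until someone rewrites it by hand, which is the maintenance burden the abstract attributes to frequently changing scientific sources.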

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE Office of Defense Programs (DP) (US)
DOE Contract Number:
W-7405-Eng-48
OSTI ID:
793555
Report Number(s):
UCRL-ID-127593; TRN: US200222%%324
Resource Relation:
Other Information: PBD: 29 Feb 2000
Country of Publication:
United States
Language:
English

Similar Records

Data warehousing leads to improved business performance
Journal Article · 01 Sep 1995 · World Oil

BioWarehouse: a bioinformatics database warehouse toolkit
Journal Article · 23 Mar 2006 · BMC Bioinformatics

Automatic generation of warehouse mediators using an ontology engine
Conference · 04 Mar 1998