OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Scenario Driven Data Modelling: A Method for Integrating Diverse Sources of Data and Data Streams

Conference · OSTI ID: 1078152

Background: Biology is rapidly becoming a data-intensive, data-driven science. It is essential that data is represented and connected in ways that best represent its full conceptual content and allow both automated integration and data-driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web, make it possible to deal with complicated heterogeneous data in new and interesting ways.

Results: This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created an RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Semantic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and the graph traversals needed to meet them.

Conclusions: We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario-derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and the resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. Approaches like SDDM will be critical to the future of data-intensive, data-driven science because they automate the process of converting massive data streams into usable knowledge.
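The steps named in the abstract (a multi-relational directed graph, an RDFizer, a data stream, and a graph traversal that answers a user requirement) can be illustrated with a small sketch. This is not the authors' software: it uses plain Python tuples in place of real RDF, and every identifier, predicate name, and record field below (for example `rdfize`, `monitor`, `confers_resistance_to`) is a hypothetical placeholder chosen for illustration.

```python
from collections import defaultdict

# Hypothetical background knowledge, written as (subject, predicate, object)
# triples. The names are illustrative placeholders, not data from the paper.
BACKGROUND_TRIPLES = [
    ("gene:NDM-1", "confers_resistance_to", "drug:carbapenem"),
    ("gene:NDM-1", "found_in", "organism:K_pneumoniae"),
]


def build_graph(triples):
    """Index triples as subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def rdfize(record):
    """Toy 'RDFizer': convert a media-report dict (assumed shape) to triples."""
    subject = "report:" + record["id"]
    triples = [(subject, "located_in", "place:" + record["place"])]
    for term in record["mentions"]:
        triples.append((subject, "mentions", term))
    return triples


def traverse(graph, start, predicates):
    """Follow a fixed sequence of predicates from start; return nodes reached."""
    frontier = {start}
    for predicate in predicates:
        frontier = {
            obj
            for node in frontier
            for edge_predicate, obj in graph.get(node, [])
            if edge_predicate == predicate
        }
    return frontier


def monitor(graph, stream, predicates):
    """Add each streamed record to the graph; yield reports the traversal hits."""
    for record in stream:
        new_triples = rdfize(record)
        for subject, predicate, obj in new_triples:
            graph[subject].append((predicate, obj))
        report = new_triples[0][0]
        reached = traverse(graph, report, predicates)
        if reached:
            yield report, reached


graph = build_graph(BACKGROUND_TRIPLES)
stream = [
    {"id": "001", "place": "New_Delhi", "mentions": ["gene:NDM-1"]},
    {"id": "002", "place": "Elsewhere", "mentions": []},
]
# User requirement (assumed): which drugs are affected by genes a report mentions?
alerts = list(monitor(graph, stream, ["mentions", "confers_resistance_to"]))
```

Here the traversal is a fixed path of predicates, which is the simplest way to show how a graph defines which questions can be answered. A real Semantic Web implementation would more likely store RDF in a triple store and express the traversal as a query.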

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1078152
Resource Relation:
Conference: MCBIOS, College Station, TX, USA, March 23, 2011
Country of Publication:
United States
Language:
English

Similar Records

Scenario driven data modelling: a method for integrating diverse sources of data and data streams
Journal Article · October 18, 2011 · BMC Bioinformatics · OSTI ID: 1078152

Publication and Retrieval of Computational Chemical-Physical Data Via the Semantic Web. Final Technical Report
Technical Report · July 20, 2017 · OSTI ID: 1078152

Soda Pop: A Time-Series Clustering, Alarming and Disease Forecasting Application
Journal Article · May 2, 2017 · Online Journal of Public Health Informatics · OSTI ID: 1078152

Related Subjects