EpiK: A Knowledge Base for Epidemiological Modeling and Analytics of Infectious Diseases
Abstract
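Computational epidemiology seeks to develop computational methods to study the distribution and determinants of health-related states or events (including disease), and the application of this study to the control of diseases and other health problems. Recent advances in computing and data sciences have led to the development of innovative modeling environments to support this important goal. The datasets used to drive the dynamic models, as well as the data produced by these models, present unique challenges owing to their size, heterogeneity, and diversity. These datasets form the basis of effective and easy-to-use decision support and analytical environments. As a result, it is important to develop scalable data management systems to store, manage, and integrate these datasets. In this paper, we develop EpiK, a knowledge base that facilitates the development of decision support and analytical environments to support epidemic science. An important goal is to develop a framework that links the input and output datasets to facilitate the effective spatio-temporal and social reasoning that is critical in planning and intervention analysis before and during an epidemic. The data management framework links modeling workflow data and its metadata using a controlled vocabulary. The metadata captures information about storage, the mapping between the linked model and the physical layout, and relationships to support services. EpiK is designed to support agent-based modeling and analytics frameworks; aggregate models can be seen as special cases and are thus supported. We use semantic web technologies to create a representation of the datasets that encapsulates both the location and the schema heterogeneity. The choice of the Resource Description Framework (RDF) as a representation language is motivated by the diversity and growth of the datasets that need to be integrated. A query bank is developed; its queries capture a broad range of questions that can be posed and answered during a typical case study pertaining to disease outbreaks. The queries are constructed using the SPARQL Protocol and RDF Query Language (SPARQL) over EpiK. EpiK can hide schema and location heterogeneity while efficiently supporting queries that span the computational epidemiology modeling pipeline, from model construction to simulation output. We also show that the performance of benchmark queries varies significantly with the choice of hardware underlying the database and RDF engine.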
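As a rough illustration of the idea in the abstract (linking workflow datasets to their storage metadata through a shared vocabulary, then querying across them), the following minimal Python sketch mimics subject-predicate-object triples and a two-pattern join of the kind a SPARQL query would express. It is not EpiK's implementation: the `epik:` prefix, the predicate names, the dataset identifiers, and the storage locations are all hypothetical, and a plain in-memory list stands in for a real RDF engine.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples: every identifier, predicate, and location here is
# invented for illustration and does not come from the EpiK vocabulary.
TRIPLES: List[Triple] = [
    ("epik:run42", "epik:usesInput", "epik:contactNetwork_region1"),
    ("epik:run42", "epik:producesOutput", "epik:infections_run42"),
    ("epik:contactNetwork_region1", "epik:storedAt", "postgres://hostA/networks"),
    ("epik:infections_run42", "epik:storedAt", "file:///data/out/run42.csv"),
    ("epik:run7", "epik:producesOutput", "epik:infections_run7"),
    ("epik:infections_run7", "epik:storedAt", "file:///data/out/run7.csv"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def dataset_locations(triples: List[Triple], run: str) -> List[Triple]:
    """Join two patterns: (run, ?link, ?dataset) and (?dataset, storedAt, ?location).

    Returns sorted (link, dataset, location) tuples for the given run, so the
    caller never needs to know which backend holds each dataset.
    """
    links = ("epik:usesInput", "epik:producesOutput")
    results = []
    for _, link, dataset in match(triples, s=run):
        if link not in links:
            continue
        for _, _, location in match(triples, s=dataset, p="epik:storedAt"):
            results.append((link, dataset, location))
    return sorted(results)


if __name__ == "__main__":
    for row in dataset_locations(TRIPLES, "epik:run42"):
        print(row)
```

The point of the sketch is only that a single query can span both input and output datasets even though they live in different stores; a production system would express the same join in SPARQL against an RDF engine.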
- Authors:
- Hasan, S. M. Shamimul; Fox, Edward A.; Bisset, Keith; Marathe, Madhav V.
- Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)
- Publication Date:
- 2017-11-06
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1454390
- Grant/Contract Number:
- AC05-00OR22725
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Healthcare Informatics Research
- Additional Journal Information:
- Journal Volume: 1; Journal Issue: 2; Journal ID: ISSN 2509-4971
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 60 APPLIED LIFE SCIENCES; 96 KNOWLEDGE MANAGEMENT AND PRESERVATION; Computational epidemiology; Knowledge base; Social contact networks; Mapping; RDF; SPARQL
Citation Formats
Hasan, S. M. Shamimul, Fox, Edward A., Bisset, Keith, and Marathe, Madhav V. EpiK: A Knowledge Base for Epidemiological Modeling and Analytics of Infectious Diseases. United States: N. p., 2017. Web. doi:10.1007/s41666-017-0010-9.
Hasan, S. M. Shamimul, Fox, Edward A., Bisset, Keith, & Marathe, Madhav V. EpiK: A Knowledge Base for Epidemiological Modeling and Analytics of Infectious Diseases. United States. https://doi.org/10.1007/s41666-017-0010-9
Hasan, S. M. Shamimul, Fox, Edward A., Bisset, Keith, and Marathe, Madhav V. 2017. "EpiK: A Knowledge Base for Epidemiological Modeling and Analytics of Infectious Diseases". United States. https://doi.org/10.1007/s41666-017-0010-9. https://www.osti.gov/servlets/purl/1454390.
@article{osti_1454390,
title = {EpiK: A Knowledge Base for Epidemiological Modeling and Analytics of Infectious Diseases},
author = {Hasan, S. M. Shamimul and Fox, Edward A. and Bisset, Keith and Marathe, Madhav V.},
abstractNote = {Computational epidemiology seeks to develop computational methods to study the distribution and determinants of health-related states or events (including disease), and the application of this study to the control of diseases and other health problems. Recent advances in computing and data sciences have led to the development of innovative modeling environments to support this important goal. The datasets used to drive the dynamic models, as well as the data produced by these models, present unique challenges owing to their size, heterogeneity, and diversity. These datasets form the basis of effective and easy-to-use decision support and analytical environments. As a result, it is important to develop scalable data management systems to store, manage, and integrate these datasets. In this paper, we develop EpiK, a knowledge base that facilitates the development of decision support and analytical environments to support epidemic science. An important goal is to develop a framework that links the input and output datasets to facilitate the effective spatio-temporal and social reasoning that is critical in planning and intervention analysis before and during an epidemic. The data management framework links modeling workflow data and its metadata using a controlled vocabulary. The metadata captures information about storage, the mapping between the linked model and the physical layout, and relationships to support services. EpiK is designed to support agent-based modeling and analytics frameworks; aggregate models can be seen as special cases and are thus supported. We use semantic web technologies to create a representation of the datasets that encapsulates both the location and the schema heterogeneity. The choice of the Resource Description Framework (RDF) as a representation language is motivated by the diversity and growth of the datasets that need to be integrated. A query bank is developed; its queries capture a broad range of questions that can be posed and answered during a typical case study pertaining to disease outbreaks. The queries are constructed using the SPARQL Protocol and RDF Query Language (SPARQL) over EpiK. EpiK can hide schema and location heterogeneity while efficiently supporting queries that span the computational epidemiology modeling pipeline, from model construction to simulation output. We also show that the performance of benchmark queries varies significantly with the choice of hardware underlying the database and RDF engine.},
doi = {10.1007/s41666-017-0010-9},
journal = {Journal of Healthcare Informatics Research},
number = 2,
volume = 1,
place = {United States},
year = {2017},
month = {11}
}