U.S. Department of Energy
Office of Scientific and Technical Information

An open source knowledge graph ecosystem for the life sciences

Journal Article · Scientific Data
 [1]; [2]; [3]; [4]; [5]; [3]; [6]; [7]; [8]; [9]; [10]; [9]; [11]; [12]; [8]; [13]; [8]; [14]; [14]; [15]; [14]; [16]; [16]; [17]; [18]; [19]; [20]; [21]; [19]; [19]; [19]; [16]
  1. Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus; Columbia Univ., New York, NY (United States). Irving Medical Center
  2. Univ. of Colorado, Boulder, CO (United States)
  3. Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus
  4. Univ. degli Studi di Milano (Italy)
  5. Univ. of Pittsburgh, PA (United States)
  6. Univ. degli Studi di Milano (Italy); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
  7. Semanticly, Athens (Greece)
  8. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
  9. Univ. of Pittsburgh, PA (United States). School of Medicine
  10. Harvard Medical School, Boston, MA (United States)
  11. Univ. of New Mexico, Albuquerque, NM (United States). School of Medicine
  12. SIB Swiss Inst. of Bioinformatics, Basel (Switzerland)
  13. Berlin Institute of Health at Charité-Universitätsmedizin, Berlin (Germany)
  14. Univ. di Milano (Italy)
  15. Univ. di Milano (Italy); European Laboratory for Learning and Intelligent Systems (ELLIS), Milan (Italy)
  16. Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus and School of Medicine
  17. Critical Path Institute, Tucson, AZ (United States). Data Collaboration Center
  18. King Abdullah University of Science and Technology (KAUST), Thuwal (Saudi Arabia)
  19. Univ. of Colorado, Aurora, CO (United States). School of Medicine
  20. Janssen Research and Development, Raritan, NJ (United States)
  21. Columbia Univ., New York, NY (United States). Irving Medical Center
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
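To make the idea of configurable knowledge representation concrete, the following is a minimal sketch and not PheKnowLator's actual API. It assumes one illustrative modeling choice, namely whether inverse relations are written into the graph, and shows the same source edges yielding different triple sets depending on that choice. The names `build_triples`, `INVERSE_OF`, and `add_inverse`, and all identifiers in the example data, are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical inverse-relation lookup; relation names are illustrative only.
INVERSE_OF: Dict[str, str] = {
    "causes": "caused_by",
    "interacts_with": "interacts_with",  # symmetric relation
}


def build_triples(edges: Iterable[Triple], add_inverse: bool = False) -> Set[Triple]:
    """Turn (subject, relation, object) edges into a set of triples.

    With add_inverse=True, every edge whose relation has a known inverse
    also contributes the reversed triple.
    """
    triples: Set[Triple] = set()
    for subj, rel, obj in edges:
        triples.add((subj, rel, obj))
        if add_inverse and rel in INVERSE_OF:
            triples.add((obj, INVERSE_OF[rel], subj))
    return triples


# Made-up example edges, not drawn from any real data source.
edges = [
    ("gene:A", "causes", "disease:X"),
    ("gene:A", "interacts_with", "gene:B"),
]

standard = build_triples(edges)
with_inverse = build_triples(edges, add_inverse=True)
```

Here `standard` holds only the edges as given, while `with_inverse` also holds the reversed triples. A real build would additionally ground nodes and relations in ontology terms, which this sketch leaves out.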
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2375473
Journal Information:
Scientific Data, Vol. 11, Issue 1; ISSN 2052-4463
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
