An open source knowledge graph ecosystem for the life sciences
- Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus; Columbia Univ., New York, NY (United States). Irving Medical Center
- Univ. of Colorado, Boulder, CO (United States)
- Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus
- Univ. degli Studi di Milano (Italy)
- Univ. of Pittsburgh, PA (United States)
- Univ. degli Studi di Milano (Italy); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Semanticly, Athens (Greece)
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Univ. of Pittsburgh, PA (United States). School of Medicine
- Harvard Medical School, Boston, MA (United States)
- Univ. of New Mexico, Albuquerque, NM (United States). School of Medicine
- SIB Swiss Inst. of Bioinformatics, Basel (Switzerland)
- Berlin Institute of Health at Charité-Universitätsmedizin, Berlin (Germany)
- Univ. di Milano (Italy)
- Univ. di Milano (Italy); European Laboratory for Learning and Intelligent Systems (ELLIS), Milan (Italy)
- Univ. of Colorado, Aurora, CO (United States). Anschutz Medical Campus and School of Medicine
- Critical Path Institute, Tucson, AZ (United States). Data Collaboration Center
- King Abdullah University of Science and Technology (KAUST), Thuwal (Saudi Arabia)
- Univ. of Colorado, Aurora, CO (United States). School of Medicine
- Janssen Research and Development, Raritan, NJ (United States)
- Columbia Univ., New York, NY (United States). Irving Medical Center
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Basic Energy Sciences (BES); USDOE Office of Science (SC), Biological and Environmental Research (BER)
- Grant/Contract Number:
- AC02-05CH11231
- OSTI ID:
- 2375473
- Journal Information:
- Scientific Data, Vol. 11, Issue 1; ISSN 2052-4463
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United States
- Language:
- English
Similar Records
Materials Data Science Ontology (MDS-Onto): Unifying Domain Knowledge in Materials and Applied Data Science
Journal Article · Mon Apr 14 20:00:00 EDT 2025 · Scientific Data (Online) · OSTI ID: 2569621
An ontology-based knowledge graph for representing interactions involving RNA molecules
Journal Article · Wed Aug 21 20:00:00 EDT 2024 · Scientific Data · OSTI ID: 2440393