DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: K-RET: knowledgeable biomedical relation extraction system

Journal Article · Bioinformatics

Abstract

Motivation: Relation extraction (RE) is a crucial process to deal with the amount of text published daily, e.g. to find missing associations in a database. RE is a text mining task for which the state-of-the-art approaches use bidirectional encoders, namely, BERT. However, state-of-the-art performance may be limited by the lack of efficient external knowledge injection approaches, with a larger impact in the biomedical area given the widespread usage and high quality of biomedical ontologies. This knowledge can propel these systems forward by aiding them in predicting more explainable biomedical associations. With this in mind, we developed K-RET, a novel, knowledgeable biomedical RE system that, for the first time, injects knowledge by handling different types of associations, multiple sources and where to apply it, and multi-token entities.

Results: We tested K-RET on three independent and open-access corpora (DDI, BC5CDR, and PGR) using four biomedical ontologies handling different entities. K-RET improved state-of-the-art results by 2.68% on average, with the DDI Corpus yielding the most significant boost in performance, from 79.30% to 87.19% in F-measure, representing a P-value of 2.91 × 10^-12.

Availability and implementation: https://github.com/lasigeBioTM/K-RET

Sponsoring Organization:
USDOE Office of Nuclear Energy (NE), Nuclear Fuel Cycle and Supply Chain
Grant/Contract Number:
PTDC/CCIBIO/28685/2017
OSTI ID:
1970408
Journal Information:
Bioinformatics, Vol. 39, Issue 4; ISSN 1367-4811
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English
