U.S. Department of Energy
Office of Scientific and Technical Information

KEBLM: Knowledge-Enhanced Biomedical Language Models

Journal Article · Journal of Biomedical Informatics
Pretrained language models (PLMs) have demonstrated strong performance on many natural language processing (NLP) tasks. Despite their great success, these PLMs are typically pretrained only on unstructured free texts without leveraging existing structured knowledge bases that are readily available for many domains, especially scientific domains. As a result, these PLMs may not achieve satisfactory performance on knowledge-intensive tasks such as biomedical NLP. Comprehending a complex biomedical document without domain-specific knowledge is challenging, even for humans. Inspired by this observation, we propose a general framework for incorporating various types of domain knowledge from multiple sources into biomedical PLMs. We encode domain knowledge using lightweight adapter modules: bottleneck feed-forward networks that are inserted into different locations of a backbone PLM. For each knowledge source of interest, we pretrain an adapter module to capture the knowledge in a self-supervised way. We design a wide range of self-supervised objectives to accommodate diverse types of knowledge, ranging from entity relations to description sentences. Once a set of pretrained adapters is available, we employ fusion layers to combine the knowledge encoded within these adapters for downstream tasks. Each fusion layer is a parameterized mixer of the available trained adapters that can identify and activate the most useful adapters for a given input. Our method diverges from prior work by including a knowledge consolidation phase, during which we teach the fusion layers to effectively combine knowledge from both the original PLM and newly acquired external knowledge using a large collection of unannotated texts. After the consolidation phase, the complete knowledge-enhanced model can be fine-tuned for any downstream task of interest to achieve optimal performance.
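The two components the abstract names — bottleneck adapters and parameterized fusion layers — can be illustrated with a minimal NumPy sketch. All dimensions, the ReLU nonlinearity, and the attention-style scoring below are illustrative assumptions for a toy hidden vector, not the paper's exact architecture (which operates inside a Transformer PLM):

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, BOTTLENECK = 8, 2  # toy sizes; real adapters use the PLM's hidden width

class BottleneckAdapter:
    """Bottleneck feed-forward adapter: down-project, nonlinearity,
    up-project, then add a residual connection back to the input."""
    def __init__(self):
        self.W_down = rng.normal(scale=0.1, size=(HIDDEN, BOTTLENECK))
        self.W_up = rng.normal(scale=0.1, size=(BOTTLENECK, HIDDEN))

    def __call__(self, h):
        z = np.maximum(0.0, h @ self.W_down)  # ReLU in the bottleneck
        return h + z @ self.W_up              # residual keeps the PLM signal

def fuse(h, adapter_outputs, W_q, W_k):
    """Fusion layer sketch: score each adapter's output against the
    current hidden state and return the softmax-weighted mixture, so the
    most useful adapters for this input get the largest weights."""
    q = h @ W_q
    scores = np.array([ (o @ W_k) @ q for o in adapter_outputs ])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return sum(wi * oi for wi, oi in zip(w, adapter_outputs))

# One adapter per knowledge source (e.g. relations, descriptions, synonyms).
adapters = [BottleneckAdapter() for _ in range(3)]
h = rng.normal(size=HIDDEN)                  # a hidden state from the backbone
outs = [a(h) for a in adapters]
W_q = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
W_k = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
fused = fuse(h, outs, W_q, W_k)              # same shape as h, fed onward
```

In the consolidation phase described above, the fusion parameters (here `W_q`, `W_k`) would be trained on unannotated text while the adapters and backbone stay frozen, teaching the mixer when to rely on each knowledge source.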
Extensive experiments on many biomedical NLP datasets show that our proposed framework consistently improves the performance of the underlying PLMs on various downstream tasks such as natural language inference, question answering, and entity linking. These results demonstrate both the benefit of using multiple sources of external knowledge to enhance PLMs and the effectiveness of the framework for incorporating knowledge into PLMs. Finally, although this work focuses primarily on the biomedical domain, our framework is highly adaptable and can easily be applied to other domains, such as the bioenergy sector.
Research Organization:
Univ. of Illinois at Urbana-Champaign, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
SC0018420
OSTI ID:
2420838
Journal Information:
Journal of Biomedical Informatics, Vol. 143; ISSN 1532-0464
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Similar Records

Towards a semantic lexicon for biological language processing
Conference · 2003 · OSTI ID: 977640

Active Learning for Language Modeling
Technical Report · 2022 · OSTI ID: 1890039