Title: BioADAPT-MRC: adversarial learning-based domain adaptation improves biomedical machine reading comprehension task

Journal Article · Bioinformatics

ABSTRACT

Motivation: Biomedical machine reading comprehension (biomedical-MRC) aims to comprehend complex biomedical narratives and assist healthcare professionals in retrieving information from them. The high performance of modern neural network-based MRC systems depends on high-quality, large-scale, human-annotated training datasets. In the biomedical domain, a crucial challenge in creating such datasets is the requirement for domain knowledge, which makes labeled data scarce and creates a need for transfer learning from the labeled general-purpose (source) domain to the biomedical (target) domain. However, the marginal distributions of the general-purpose and biomedical domains differ because the two domains cover different topics. Therefore, directly transferring learned representations from a model trained on a general-purpose domain to the biomedical domain can hurt the model's performance.

Results: We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method to address the discrepancies in the marginal distributions between the general and biomedical domain datasets. BioADAPT-MRC relaxes the need to generate pseudo labels for training a well-performing biomedical-MRC model. We extensively evaluate the performance of BioADAPT-MRC by comparing it with the best existing methods on three widely used benchmark biomedical-MRC datasets: BioASQ-7b, BioASQ-8b and BioASQ-9b. Our results suggest that without using any synthetic or human-annotated data from the biomedical domain, BioADAPT-MRC can achieve state-of-the-art performance on these datasets.

Availability and implementation: BioADAPT-MRC is freely available as an open-source project at https://github.com/mmahbub/BioADAPT-MRC.

Supplementary information: Supplementary data are available at Bioinformatics online.
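The discrepancy in marginal distributions that the abstract refers to can be made concrete with a small, purely illustrative sketch. The code below is not part of BioADAPT-MRC and is not the authors' method; it assumes a simple bag-of-words view of each domain and uses the Jensen-Shannon divergence between unigram distributions as one possible measure of how far apart two domains are. The helper names (`unigram_distribution`, `js_divergence`) and the toy sentences are hypothetical.

```python
import math
from collections import Counter


def unigram_distribution(texts):
    """Return a word -> probability mapping for a list of strings."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    vocab = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint vocabulary.
    mix = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl(a):
        # KL(a || mix); words with zero probability under `a` contribute nothing.
        return sum(a[w] * math.log2(a[w] / mix[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy, made-up sentences standing in for the two domains (assumption).
general = ["the team won the final match", "the city opened a new museum"]
biomedical = ["the gene regulates protein expression", "the drug inhibits the enzyme"]

p_general = unigram_distribution(general)
p_biomedical = unigram_distribution(biomedical)

same = js_divergence(p_general, p_general)
cross = js_divergence(p_general, p_biomedical)
print(f"same-domain JSD:  {same:.3f}")
print(f"cross-domain JSD: {cross:.3f}")
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (no shared vocabulary), so a larger cross-domain value indicates a larger shift in word usage. An adversarial domain adaptation method operates on learned representations rather than raw word counts, so this sketch only illustrates the notion of distribution shift, not the adaptation step itself.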

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; US Department of Veterans Affairs, Office of Information Technology
Grant/Contract Number:
AC05-00OR22725; VA118-16-M-1062
OSTI ID:
1887628
Alternate ID(s):
OSTI ID: 1878696
Journal Information:
Journal Name: Bioinformatics; Journal Volume: 38; Journal Issue: 18; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English
