Selective Information Extraction Strategies for Cancer Pathology Reports with Convolutional Neural Networks

Yoon, Hong-Jun; Qiu, John X.; Christian, Blair; Hinkle, Jacob; Alamudun, Folami; Tourassi, Georgia

doi:10.1007/978-3-030-16841-4_9

Conference · April 1, 2019

DOI: https://doi.org/10.1007/978-3-030-16841-4_9 · OSTI ID: 1509553

ORNL

To trust a model's predictions, new data scored by the model must come from the same population as the data used to train it. When the model scores data that differ from its training data, neither the predictions nor the model's performance metrics can be trusted. Identifying and excluding such anomalous data points is therefore an important task when deploying models in the real world. Traditional machine learning algorithms and classifiers cannot abstain in this case. Here we propose a data-novelty detection algorithm for the Convolutional Neural Network classifier that yields a rejection score for each new data point scored. It is a post-modeling procedure that examines the distribution of convolution filters to determine whether a prediction should be trusted. We apply the algorithm to an information extraction model for a natural language text corpus and evaluate its performance using a primary cancer site classification model applied to cancer pathology reports. Results demonstrate that the algorithm is an effective way to exclude cancer pathology reports from model scoring when they do not contain the information needed to accurately classify the primary cancer type.
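The abstract does not spell out how the rejection score is computed, so the following is only an illustrative sketch of the general idea of a post-modeling abstention step, not the authors' algorithm. It assumes each document has already been reduced to a vector of max-pooled convolution filter activations, and it swaps in a simple stand-in score (the mean absolute z-score of those activations against training statistics). All names (`fit_filter_stats`, `rejection_score`, `predict_or_abstain`, `toy_predict`), the toy numbers, and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def fit_filter_stats(train_activations):
    """Per-filter (mean, std) of activations over the training documents.

    train_activations: list of documents, each a list with one value per filter.
    """
    columns = list(zip(*train_activations))
    # Guard against zero spread so the z-score below never divides by zero.
    return [(mean(col), pstdev(col) or 1e-9) for col in columns]


def rejection_score(activations, stats):
    """Mean absolute z-score across filters (assumed stand-in score).

    Larger values mean the document looks less like the training data.
    """
    return mean(abs(a - m) / s for a, (m, s) in zip(activations, stats))


def predict_or_abstain(activations, stats, threshold, predict):
    """Return (label, score); label is None when the model abstains."""
    score = rejection_score(activations, stats)
    if score > threshold:
        return None, score
    return predict(activations), score


def toy_predict(activations):
    """Hypothetical stand-in classifier: picks the most active filter."""
    return f"label-{activations.index(max(activations))}"


# Toy activations for four training documents and three filters (made up).
train = [
    [0.90, 0.10, 0.5],
    [1.10, 0.20, 0.4],
    [1.00, 0.15, 0.6],
    [0.95, 0.12, 0.5],
]
stats = fit_filter_stats(train)

in_dist = [1.0, 0.14, 0.5]  # resembles the training documents
novel = [0.0, 0.90, 0.0]    # activation pattern unlike anything in training

print(predict_or_abstain(in_dist, stats, 3.0, toy_predict))
print(predict_or_abstain(novel, stats, 3.0, toy_predict))
```

In this sketch the threshold trades coverage against reliability: a lower value rejects more documents, and in practice it would be chosen on held-out data rather than fixed by hand.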

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-00OR22725

OSTI ID: 1509553

Resource Relation: Conference: INNS Big Data and Deep Learning 2019, Genoa, Italy, April 16-18, 2019

Country of Publication: United States

Language: English
