
Title: Development of message passing-based graph convolutional networks for classifying cancer pathology reports

Journal Article · BMC Medical Informatics and Decision Making (Online)

Abstract

Background: Graph convolutional networks (GCNs) applied to the classification of free-form natural language texts using graph-of-words features (TextGCN) have been studied and confirmed to be an effective means of describing complex natural language texts. However, text classification models based on TextGCN have weaknesses in terms of memory consumption and model dissemination and distribution. In this paper, we present a fast message passing network (FastMPN), which implements a GCN with a message passing architecture that provides versatility and flexibility by allowing trainable node embeddings and edge weights, helping the GCN model find a better solution. We applied the FastMPN model to the task of clinical information extraction from cancer pathology reports, extracting the following six properties: main site, subsite, laterality, histology, behavior, and grade.

Results: We evaluated the clinical task performance of the FastMPN models in terms of micro- and macro-averaged F1 scores. A comparison was performed with the multi-task convolutional neural network (MT-CNN) model. Results show that the FastMPN model is equivalent to or better than the MT-CNN.

Conclusions: Our implementation revealed that our FastMPN model, which is based on the PyTorch platform, can train on a large corpus (667,290 training samples) with 202,373 unique words in less than 3 minutes per epoch using one NVIDIA V100 hardware accelerator. Our experiments demonstrated that, using this implementation, the clinical task performance scores of tumor-related information extraction from cancer pathology reports were highly competitive.
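As an illustration of the two evaluation metrics named in the abstract, the following minimal sketch computes micro- and macro-averaged F1 scores for a single-label, multi-class task. It is not taken from the paper: the function name `f1_scores` and the toy laterality labels are hypothetical, chosen only to show why the two averages can differ when a rare class is misclassified.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy data (not from the paper): one rare class is missed.
y_true = ["left", "left", "left", "left", "right", "bilateral"]
y_pred = ["left", "left", "left", "left", "left", "bilateral"]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy case the micro average stays relatively high because most samples belong to the frequent class, while the macro average drops because the missed rare class contributes an F1 of zero with equal weight.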

Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2447021
Journal Information:
BMC Medical Informatics and Decision Making (Online), Vol. 24, Issue S5; ISSN 1472-6947
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
