
Title: Improving lung cancer prognosis assessment by incorporating synthetic minority oversampling technique and score fusion method

Journal Article · Medical Physics
DOI: https://doi.org/10.1118/1.4948499 · OSTI ID: 22685100
 [1];  [2];  [3];  [4]
  1. School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China and School of Electrical and Computer Engineering, University of Oklahoma, Norman, Oklahoma 73019 (United States)
  2. Department of Electrical and Computer Engineering, University of Texas, El Paso, Texas 79968 and Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang 110819 (China)
  3. Department of Radiology, Guangzhou Medical University, Guangzhou 510182 (China)
  4. School of Electrical and Computer Engineering, University of Oklahoma, Norman, Oklahoma 73019 (United States)

Purpose: This study investigates whether integrating oversampling, feature selection, and score fusion techniques can improve the prediction of lung cancer recurrence risk for stage I non-small cell lung cancer (NSCLC) patients, and aims to develop an optimal prediction model.

Methods: A dataset of 94 early-stage lung cancer patients was retrospectively assembled, comprising CT images, nine clinical and biological (CB) markers, and the outcome of 3-yr disease-free survival (DFS) after surgery. Of the 94 patients, 74 remained disease-free and 20 had cancer recurrence. Using a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) classifiers were built, one on the QI features and one on the CB markers. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst-based feature selection method to optimize the classifiers, and also tested fusion methods for combining the QI-based and CB-based prediction results.

Results: Using leave-one-case-out cross-validation (that is, K-fold cross-validation with K equal to the number of cases), the areas under the receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061 for the QI-based and CB-based classifiers, respectively. Fusing the scores generated by the two classifiers significantly increased the AUC to 0.859 ± 0.052 (p < 0.05), with an overall prediction accuracy of 89.4%.

Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers, and performing SMOTE prior to feature selection in classifier training, enabled the RBFN-based classifier to yield improved prediction accuracy.
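The three building blocks named in the abstract (SMOTE, score fusion, and AUC) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the neighbor count `k=5`, the equal fusion weight `w_qi=0.5`, and the toy data are all assumptions made for illustration, and the abstract does not state which fusion rule performed best. With 74 disease-free and 20 recurrence cases, balancing the two classes would take 74 − 20 = 54 synthetic recurrence cases.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points, each interpolated between a minority-class
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def fuse_scores(qi_scores, cb_scores, w_qi=0.5):
    """Weighted-average fusion of two classifiers' scores (the weight is an assumption)."""
    return [w_qi * q + (1.0 - w_qi) * c for q, c in zip(qi_scores, cb_scores)]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    n_dfs, n_recurrence = 74, 20
    n_needed = n_dfs - n_recurrence  # 54 synthetic cases would balance the classes

    # Toy 2-D feature vectors standing in for the 20 recurrence cases (made-up data).
    rng = random.Random(42)
    recurrence_cases = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_recurrence)]
    synthetic_cases = smote(recurrence_cases, n_needed)
    print(len(recurrence_cases) + len(synthetic_cases))  # 74, matching the majority class
```

In a cross-validation setting such as leave-one-case-out, synthetic cases would normally be generated from the training cases of each fold only, so that the held-out case never influences the oversampled training set.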

OSTI ID: 22685100
Journal Information: Medical Physics, Vol. 43, Issue 6; Other Information: (c) 2016 American Association of Physicists in Medicine; Country of input: International Atomic Energy Agency (IAEA); ISSN 0094-2405
Country of Publication: United States
Language: English