OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: SU-D-204-06: Integration of Machine Learning and Bioinformatics Methods to Analyze Genome-Wide Association Study Data for Rectal Bleeding and Erectile Dysfunction Following Radiotherapy in Prostate Cancer

Abstract

Purpose: We investigated whether integration of machine learning and bioinformatics techniques on genome-wide association study (GWAS) data can improve the performance of predictive models in predicting the risk of developing radiation-induced late rectal bleeding and erectile dysfunction in prostate cancer patients.
Methods: We analyzed a GWAS dataset generated from 385 prostate cancer patients treated with radiotherapy. Using genotype information from these patients, we designed a machine learning-based predictive model of late radiation-induced toxicities: rectal bleeding and erectile dysfunction. The model building process was performed using 2/3 of samples (training) and the predictive model was tested with 1/3 of samples (validation). To identify important single nucleotide polymorphisms (SNPs), we computed the SNP importance score, resulting from our random forest regression model. We performed gene ontology (GO) enrichment analysis for nearby genes of the important SNPs.
Results: After univariate analysis on the training dataset, we filtered out many SNPs with p>0.001, resulting in 749 and 367 SNPs that were used in the model building process for rectal bleeding and erectile dysfunction, respectively. On the validation dataset, our random forest regression model achieved the area under the curve (AUC)=0.70 and 0.62 for rectal bleeding and erectile dysfunction, respectively. We performed GO enrichment analysis for the top 25%, 50%, 75%, and 100% SNPs out of the select SNPs in the univariate analysis. When we used the top 50% SNPs, more plausible biological processes were obtained for both toxicities. An additional test with the top 50% SNPs improved predictive power with AUC=0.71 and 0.65 for rectal bleeding and erectile dysfunction. A better performance was achieved with AUC=0.67 when age and androgen deprivation therapy were added to the model for erectile dysfunction.
Conclusion: Our approach that combines machine learning and bioinformatics techniques enabled designing better models and identifying more plausible biological processes associated with the outcomes.
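
The modeling steps in the abstract (2/3 training and 1/3 validation split, univariate filtering at p ≤ 0.001, random forest regression, importance ranking, and a refit on the top 50% of SNPs) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the genotype data are synthetic, the ANOVA F-test stands in for the unspecified univariate test, the forest settings are arbitrary, and all variable and function names are hypothetical. The GO enrichment step is not covered.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the GWAS data (NOT the study data): genotypes are
# minor-allele counts 0/1/2; the first n_causal SNPs drive a binary toxicity.
n_patients, n_snps, n_causal = 385, 2000, 5
maf = rng.uniform(0.1, 0.5, size=n_snps)
X = rng.binomial(2, maf, size=(n_patients, n_snps)).astype(float)
logit = (X[:, :n_causal] - 2 * maf[:n_causal]).sum(axis=1)
y = (rng.random(n_patients) < 1 / (1 + np.exp(-logit))).astype(int)

# 2/3 training, 1/3 validation, stratified so both classes appear in each part.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=1 / 3, stratify=y, random_state=0
)

# Univariate filter on the training set only: keep SNPs with p <= 0.001.
# (The ANOVA F-test is an assumed stand-in for the unspecified univariate test.)
_, p_values = f_classif(X_tr, y_tr)
keep = np.flatnonzero(p_values <= 0.001)
if keep.size < 2:  # fallback so the sketch always runs on synthetic data
    keep = np.argsort(p_values)[:20]


def fit_and_score(cols):
    """Fit a random forest regressor on `cols`; return (model, validation AUC)."""
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X_tr[:, cols], y_tr)
    return rf, roc_auc_score(y_te, rf.predict(X_te[:, cols]))


rf_all, auc_all = fit_and_score(keep)

# Rank the filtered SNPs by importance score and refit on the top 50%.
order = np.argsort(rf_all.feature_importances_)[::-1]
top_half = keep[order[: max(1, keep.size // 2)]]
_, auc_top = fit_and_score(top_half)

print(f"{keep.size} SNPs kept, validation AUC = {auc_all:.2f}")
print(f"{top_half.size} top SNPs, validation AUC = {auc_top:.2f}")
```

Both the filter and the importance ranking use only the training samples, so the validation AUC is not contaminated by feature selection. The numbers printed for synthetic data say nothing about the AUC values reported in the abstract.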

Authors:
Oh, J; Deasy, J [1]; Kerns, S [2]; Ostrer, H [3]; Rosenstein, B [4]
  1. Memorial Sloan Kettering Cancer Center, New York, NY (United States)
  2. University of Rochester Medical Center, Rochester, NY (United States)
  3. Albert Einstein College of Medicine, Bronx, NY (United States)
  4. Mount Sinai School of Medicine, New York, NY (United States)
Publication Date:
2016-06-15
OSTI Identifier:
22624364
Resource Type:
Journal Article
Resource Relation:
Journal Name: Medical Physics; Journal Volume: 43; Journal Issue: 6; Other Information: (c) 2016 American Association of Physicists in Medicine; Country of input: International Atomic Energy Agency (IAEA)
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; 61 RADIATION PROTECTION AND DOSIMETRY; ANDROGENS; DATASETS; GENOTYPE; HAZARDS; LEARNING; NEOPLASMS; NUCLEOTIDES; PATIENTS; PROSTATE; RADIOTHERAPY; RECTUM; TOXICITY; TRAINING; VALIDATION

Citation Formats

Oh, J, Deasy, J, Kerns, S, Ostrer, H, and Rosenstein, B. SU-D-204-06: Integration of Machine Learning and Bioinformatics Methods to Analyze Genome-Wide Association Study Data for Rectal Bleeding and Erectile Dysfunction Following Radiotherapy in Prostate Cancer. United States: N. p., 2016. Web. doi:10.1118/1.4955611.
Oh, J, Deasy, J, Kerns, S, Ostrer, H, & Rosenstein, B. SU-D-204-06: Integration of Machine Learning and Bioinformatics Methods to Analyze Genome-Wide Association Study Data for Rectal Bleeding and Erectile Dysfunction Following Radiotherapy in Prostate Cancer. United States. doi:10.1118/1.4955611.
Oh, J, Deasy, J, Kerns, S, Ostrer, H, and Rosenstein, B. 2016. "SU-D-204-06: Integration of Machine Learning and Bioinformatics Methods to Analyze Genome-Wide Association Study Data for Rectal Bleeding and Erectile Dysfunction Following Radiotherapy in Prostate Cancer". United States. doi:10.1118/1.4955611.
@article{osti_22624364,
title = {SU-D-204-06: Integration of Machine Learning and Bioinformatics Methods to Analyze Genome-Wide Association Study Data for Rectal Bleeding and Erectile Dysfunction Following Radiotherapy in Prostate Cancer},
author = {Oh, J and Deasy, J and Kerns, S and Ostrer, H and Rosenstein, B},
abstractNote = {Purpose: We investigated whether integration of machine learning and bioinformatics techniques on genome-wide association study (GWAS) data can improve the performance of predictive models in predicting the risk of developing radiation-induced late rectal bleeding and erectile dysfunction in prostate cancer patients. Methods: We analyzed a GWAS dataset generated from 385 prostate cancer patients treated with radiotherapy. Using genotype information from these patients, we designed a machine learning-based predictive model of late radiation-induced toxicities: rectal bleeding and erectile dysfunction. The model building process was performed using 2/3 of samples (training) and the predictive model was tested with 1/3 of samples (validation). To identify important single nucleotide polymorphisms (SNPs), we computed the SNP importance score, resulting from our random forest regression model. We performed gene ontology (GO) enrichment analysis for nearby genes of the important SNPs. Results: After univariate analysis on the training dataset, we filtered out many SNPs with p>0.001, resulting in 749 and 367 SNPs that were used in the model building process for rectal bleeding and erectile dysfunction, respectively. On the validation dataset, our random forest regression model achieved the area under the curve (AUC)=0.70 and 0.62 for rectal bleeding and erectile dysfunction, respectively. We performed GO enrichment analysis for the top 25%, 50%, 75%, and 100% SNPs out of the select SNPs in the univariate analysis. When we used the top 50% SNPs, more plausible biological processes were obtained for both toxicities. An additional test with the top 50% SNPs improved predictive power with AUC=0.71 and 0.65 for rectal bleeding and erectile dysfunction. A better performance was achieved with AUC=0.67 when age and androgen deprivation therapy were added to the model for erectile dysfunction. 
Conclusion: Our approach that combines machine learning and bioinformatics techniques enabled designing better models and identifying more plausible biological processes associated with the outcomes.},
doi = {10.1118/1.4955611},
journal = {Medical Physics},
number = 6,
volume = 43,
place = {United States},
year = {2016},
month = jun
}
  • Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 × 10⁻⁸, Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value < 10⁻⁶. Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates the feasibility of a genome-wide approach to investigate genetic predisposition to radiation injury.
  • Purpose: To identify single nucleotide polymorphisms (SNPs) associated with development of erectile dysfunction (ED) among prostate cancer patients treated with radiation therapy. Methods and Materials: A 2-stage genome-wide association study was performed. Patients were split randomly into a stage I discovery cohort (132 cases, 103 controls) and a stage II replication cohort (128 cases, 102 controls). The discovery cohort was genotyped using Affymetrix 6.0 genome-wide arrays. The 940 top ranking SNPs selected from the discovery cohort were genotyped in the replication cohort using Illumina iSelect custom SNP arrays. Results: Twelve SNPs identified in the discovery cohort and validated in the replication cohort were associated with development of ED following radiation therapy (Fisher combined P values 2.1 × 10⁻⁵ to 6.2 × 10⁻⁴). Notably, these 12 SNPs lie in or near genes involved in erectile function or other normal cellular functions (adhesion and signaling) rather than DNA damage repair. In a multivariable model including nongenetic risk factors, the odds ratios for these SNPs ranged from 1.6 to 5.6 in the pooled cohort. There was a striking relationship between the cumulative number of SNP risk alleles an individual possessed and ED status (Somers' D P value = 1.7 × 10⁻²⁹). A 1-allele increase in cumulative SNP score increased the odds for developing ED by a factor of 2.2 (P value = 2.1 × 10⁻¹⁹). The cumulative SNP score model had a sensitivity of 84% and specificity of 75% for prediction of developing ED at the radiation therapy planning stage. Conclusions: This genome-wide association study identified a set of SNPs that are associated with development of ED following radiation therapy. These candidate genetic predictors warrant more definitive validation in an independent cohort.
  • Purpose: To determine if differences in patient positioning methods have an impact on the incidence and modeling of grade ≥2 acute rectal toxicity in prostate cancer patients who were treated with Intensity Modulated Radiation Therapy (IMRT). Methods: We compared two databases of patients treated with radiation therapy for prostate cancer: a database of 79 patients who were treated with 7 field IMRT and daily image guided positioning based on implanted gold markers (IGRTdb), and a database of 302 patients who were treated with 5 field IMRT and daily positioning using a trans-abdominal ultrasound system (USdb). Complete planning dosimetry was available for IGRTdb patients while limited planning dosimetry, recorded at the time of planning, was available for USdb patients. We fit a Lyman-Kutcher-Burman (LKB) model to IGRTdb only, and a Univariate Logistic Regression (ULR) normal tissue complication probability (NTCP) model to both databases. We performed Receiver Operating Characteristics analysis to determine the predictive power of the NTCP models. Results: The incidence of grade ≥2 acute rectal toxicity in IGRTdb was 20%, while the incidence in USdb was 54%. Fits of both LKB and ULR models yielded predictive NTCP models for IGRTdb patients with Area Under the Curve (AUC) in the 0.63 – 0.67 range. Extrapolation of the ULR model from IGRTdb to planning dosimetry in USdb predicts that the incidence of acute rectal toxicity in USdb should not exceed 40%. Fits of the ULR model to the USdb do not yield predictive NTCP models and their AUC is consistent with AUC = 0.5. Conclusion: Accuracy of a patient positioning system affects clinically observed toxicity rates and the quality of NTCP models that can be derived from toxicity data. Poor correlation between planned and clinically delivered dosimetry may lead to erroneous or poorly performing NTCP models, even if the number of patients in a database is large.
  • Purpose: To determine the rectal tolerance to Grade 2 rectal bleeding after I-125 seed brachytherapy combined with external beam radiotherapy (EBRT), based on the rectal dose-volume histogram. Methods and Materials: A total of 458 consecutive patients with stages T1 to T3 prostate cancer received combined modality treatment consisting of I-125 seed implantation followed by EBRT to the prostate and seminal vesicles. The prescribed doses of brachytherapy and EBRT were 100 Gy and 45 Gy in 25 fractions, respectively. The rectal dosimetric factors were analyzed for rectal volumes receiving >100 Gy and >150 Gy (R100 and R150) during brachytherapy and for rectal volumes receiving >30 Gy to 40 Gy (V30-V40) during EBRT therapy in 373 patients for whom datasets were available. The patients were followed from 21 to 72 months (median, 45 months) after the I-125 seed implantation. Results: Forty-four patients (9.7%) developed Grade 2 rectal bleeding. On multivariate analysis, age (p = 0.014), R100 (p = 0.002), and V30 (p = 0.001) were identified as risk factors for Grade 2 rectal bleeding. The rectal bleeding rate increased as the R100 increased: 5.0% (2/40 patients) for 0 ml; 7.5% (20/267 patients) for >0 to 0.5 ml; 11.0% (11/100 patients) for >0.5 to 1 ml; 17.9% (5/28 patients) for >1 to 1.5 ml; and 27.3% (6/22 patients) for >1.5 ml (p = 0.014). Grade 2 rectal bleeding developed in 6.4% (12/188) of patients with a V30 ≤35% and in 14.1% (26/185) of patients with a V30 >35% (p = 0.02). When these dose-volume parameters were considered in combination, the Grade 2 rectal bleeding rate was 4.2% (5/120 patients) for a R100 ≤0.5 ml and a V30 ≤35%, whereas it was 22.4% (13/58 patients) for R100 of >0.5 ml and V30 of >35%. Conclusion: The risk of rectal bleeding was found to be significantly volume-dependent in patients with prostate cancer who received combined modality treatment. Rectal dose-volume analysis is a practical method for predicting the risk of development of Grade 2 rectal bleeding.
  • Purpose: To evaluate the incidence of Grade 2 or worse rectal bleeding after high-dose-rate (HDR) brachytherapy combined with hypofractionated external-beam radiotherapy (EBRT), with special emphasis on the relationship between the incidence of rectal bleeding and the rectal dose from HDR brachytherapy. Methods and Materials: The records of 100 patients who were treated by HDR brachytherapy combined with EBRT for ≥12 months were analyzed. The fractionation schema for HDR brachytherapy was prospectively changed, and the total radiation dose for EBRT was fixed at 51 Gy. The distribution of the fractionation schema used in the patients was as follows: 5 Gy × 5 in 13 patients; 7 Gy × 3 in 19 patients; and 9 Gy × 2 in 68 patients. Results: Ten patients (10%) developed Grade 2 or worse rectal bleeding. Regarding the correlation with dosimetric factors, no significant differences were found in the average percentage of the entire rectal volume receiving 30%, 50%, 80%, and 90% of the prescribed radiation dose from EBRT between those with bleeding and those without. The average percentage of the entire rectal volume receiving 10%, 30%, 50%, 80%, and 90% of the prescribed radiation dose from HDR brachytherapy in those who developed rectal bleeding was 77.9%, 28.6%, 9.0%, 1.5%, and 0.3%, respectively, and was 69.2%, 22.2%, 6.6%, 0.9%, and 0.4%, respectively, in those without bleeding. The differences in the percentages of the entire rectal volume receiving 10%, 30%, and 50% between those with and without bleeding were statistically significant. Conclusions: The rectal dose from HDR brachytherapy for patients with prostate cancer may have a significant impact on the incidence of Grade 2 or worse rectal bleeding.