Predictive Big Data Analytics: A Study of Parkinson’s Disease Using Large, Complex, Heterogeneous, Incongruent, Multi-Source and Incomplete Observations

Dinov, Ivo D.; Heavner, Ben; Tang, Ming; Glusman, Gustavo; Chard, Kyle; Darcy, Mike; Madduri, Ravi; Pa, Judy; Spino, Cathie; Kesselman, Carl; Foster, Ian; Deutsch, Eric W.; Price, Nathan D.; Van Horn, John D.; Ames, Joseph; Clark, Kristi; Hood, Leroy; Hampstead, Benjamin M.; Dauer, William; Toga, Arthur W.; Draganski, Bogdan

doi:10.1371/journal.pone.0157077

Journal Article · August 5, 2016 · PLoS ONE

DOI: https://doi.org/10.1371/journal.pone.0157077 · OSTI ID: 1627792

Dinov, Ivo D. ^[1]; Heavner, Ben ^[2]; Tang, Ming ^[3]; Glusman, Gustavo ^[2]; Chard, Kyle ^[4]; Darcy, Mike ^[5]; Madduri, Ravi ^[4]; Pa, Judy ^[6]; Spino, Cathie ^[7]; Kesselman, Carl ^[5]; Foster, Ian ^[4]; Deutsch, Eric W. ^[2]; Price, Nathan D. ^[2]; Van Horn, John D. ^[6]; Ames, Joseph ^[6]; Clark, Kristi ^[6]; Hood, Leroy ^[2]; Hampstead, Benjamin M. ^[8]; Dauer, William ^[7]; Toga, Arthur W. ^[6]; Draganski, Bogdan

[1] Univ. of Michigan, Ann Arbor, MI (United States). Statistics Online Computational Resource, School of Nursing, Michigan Institute for Data Science; Univ. of Southern California, Los Angeles, CA (United States). Stevens Neuroimaging and Informatics Institute; Univ. of Michigan, Ann Arbor, MI (United States). Udall Center of Excellence for Parkinson’s Disease Research
[2] Institute for Systems Biology, Seattle, WA (United States)
[3] Univ. of Michigan, Ann Arbor, MI (United States). Statistics Online Computational Resource, School of Nursing, Michigan Institute for Data Science
[4] Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States). Computation Institute
[5] Univ. of Southern California, Los Angeles, CA (United States). Information Sciences Institute
[6] Univ. of Southern California, Los Angeles, CA (United States). Stevens Neuroimaging and Informatics Institute
[7] Univ. of Michigan, Ann Arbor, MI (United States). Udall Center of Excellence for Parkinson’s Disease Research
[8] Univ. of Michigan, Ann Arbor, MI (United States). Department of Psychiatry and Michigan Alzheimer’s Disease Center; Veterans Affairs Ann Arbor Healthcare System, Ann Arbor, MI (United States)

Background: The Parkinson’s Progression Markers Initiative (PPMI) collects, manages and disseminates a unique archive of Big Data on Parkinson’s disease (PD). Integrating such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression, and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of PD risk to trauma, genetics, environment, co-morbidities, or lifestyle. The defining characteristics of Big Data (large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources) all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction. To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data.

Methods and Findings: Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning-based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning-based classification methods indicated significant power to predict Parkinson’s disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson’s Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting.

Conclusions: Model-free Big Data machine-learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson’s disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine-learning-based classification is over 80%. The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer’s, Huntington’s, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications.
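
The abstract names three ingredients: rebalancing of imbalanced cohorts, model-free classification, and n-fold cross-validation summarized by accuracy, sensitivity, and specificity. The following is a minimal Python sketch of how such pieces can fit together. It is not the authors' protocol. Everything in it is an illustrative assumption: the SMOTE-style interpolation used for rebalancing, the 1-nearest-neighbour classifier standing in for the boosting and support vector machine methods mentioned above, and all function names, labels, and parameters.

```python
import math
import random


def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.
    (Hypothetical helper; a SMOTE-style sketch, not the paper's code.)"""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


def predict_1nn(train_x, train_y, x):
    """Label x with the class of its nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[nearest]


def confusion_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return accuracy, sensitivity, specificity


def cross_validate(xs, ys, n_folds=5, minority_label=0, seed=0):
    """n-fold cross-validation for two classes. Rebalancing is applied to
    the training folds only, so no synthetic point leaks into a test fold."""
    rng = random.Random(seed)
    order = list(range(len(xs)))
    rng.shuffle(order)
    y_true, y_pred = [], []
    for fold in range(n_folds):
        test_idx = set(order[fold::n_folds])
        train_x = [xs[i] for i in order if i not in test_idx]
        train_y = [ys[i] for i in order if i not in test_idx]
        minority = [x for x, y in zip(train_x, train_y) if y == minority_label]
        deficit = len(train_y) - 2 * len(minority)  # majority minus minority
        if deficit > 0 and len(minority) >= 2:
            extra = smote_like(minority, deficit, rng=rng)
            train_x += extra
            train_y += [minority_label] * len(extra)
        for i in sorted(test_idx):
            y_true.append(ys[i])
            y_pred.append(predict_1nn(train_x, train_y, xs[i]))
    return confusion_metrics(y_true, y_pred)
```

Calling `cross_validate(xs, ys)` with feature tuples `xs` and binary labels `ys` returns the pooled accuracy, sensitivity, and specificity over all held-out folds. Oversampling inside each training fold, rather than once on the full data set, is a deliberate choice in this sketch: synthetic points derived from a test subject would otherwise inflate the reported performance.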

Sponsoring Organization: USDOE; National Science Foundation (NSF); National Institutes of Health (NIH)

Grant/Contract Number: AC02-06CH11357; 1023115; 1022560; 1022636; 0089377; 9652870; 0442992; 0442630; 0333672; 0716055; P20 NR015331; P50 NS091856; P30 DK089503; U54 EB02040

Journal Information: PLoS ONE, Vol. 11, Issue 8; ISSN 1932-6203

Publisher: Public Library of Science

Country of Publication: United States

Language: English
