Machine learning prediction of incidence of Alzheimer’s disease using large-scale administrative health data

Park, Ji Hwan; Cho, Han Eol; Kim, Jong Hun; Wall, Melanie M.; Stern, Yaakov; Lim, Hyunsun; Yoo, Shinjae; Kim, Hyoung Seop; Cha, Jiook

doi:10.1038/s41746-020-0256-0

Journal Article · March 26, 2020 · npj Digital Medicine

DOI: https://doi.org/10.1038/s41746-020-0256-0 · OSTI ID: 1618403

Park, Ji Hwan ^[1]; Cho, Han Eol ^[2]; Kim, Jong Hun ^[3]; Wall, Melanie M. ^[4]; Stern, Yaakov ^[4]; Lim, Hyunsun ^[3]; Yoo, Shinjae ^[1]; Kim, Hyoung Seop ^[3]; Cha, Jiook ^[5]

^[1] Brookhaven National Lab. (BNL), Upton, NY (United States)
^[2] Yonsei Univ. College of Medicine, Seoul (Korea)
^[3] National Health Insurance Service Ilsan Hospital, Goyang (Korea)
^[4] Columbia Univ., New York, NY (United States)
^[5] Columbia Univ., New York, NY (United States); Seoul National Univ. (Korea)

Nationwide population-based cohorts provide a new opportunity to build automated risk prediction models that draw on individuals’ history of health and healthcare, going beyond existing risk prediction models. We tested whether machine learning models can predict the future incidence of Alzheimer’s disease (AD) from large-scale administrative health data. From the Korean National Health Insurance Service database between 2002 and 2010, we obtained de-identified health data for adults older than 65 years (N = 40,736) containing 4,894 unique clinical features, including ICD-10 codes, medication codes, laboratory values, history of personal and family illness, and socio-demographics. To define incident AD, we considered two operational definitions: “definite AD”, based on diagnostic codes plus dementia medication (n = 614), and “probable AD”, based on diagnosis only (n = 2026). We trained and validated random forest, support vector machine, and logistic regression models to predict incident AD in the 1, 2, 3, and 4 subsequent years. In balanced samples (bootstrapping), the machine learning models showed reasonable performance in 1-year prediction, with AUCs of 0.775 and 0.759 for the “definite AD” and “probable AD” outcomes, respectively; in 2-year prediction, 0.730 and 0.693; in 3-year, 0.677 and 0.644; and in 4-year, 0.725 and 0.683. The results were similar when the entire (unbalanced) samples were used. Important clinical features selected in logistic regression included hemoglobin level, age, and urine protein level. This study may shed light on the utility of data-driven machine learning models based on large-scale administrative health data for AD risk prediction, which may enable better selection of individuals at risk for AD in clinical trials or earlier detection in clinical settings.
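The abstract reports AUC values computed on balanced samples. As a rough illustration of what that kind of evaluation involves, the sketch below computes AUC from predicted risk scores and averages it over repeated balanced resamples. It is not the authors' code: the function names (`roc_auc`, `balanced_sample`, `balanced_auc`), the choice to balance by randomly down-sampling the majority class, the number of rounds, and the toy labels and scores are all assumptions made for illustration only.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_sample(labels, rng):
    """Indices of a class-balanced subsample: an equal number of cases
    drawn at random (without replacement) from each class.
    Assumption: balancing is done by down-sampling the larger class."""
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos_idx), len(neg_idx))
    return rng.sample(pos_idx, k) + rng.sample(neg_idx, k)


def balanced_auc(labels, scores, n_rounds=100, seed=0):
    """Mean AUC over repeated balanced subsamples (n_rounds is hypothetical)."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_rounds):
        idx = balanced_sample(labels, rng)
        aucs.append(roc_auc([labels[i] for i in idx],
                            [scores[i] for i in idx]))
    return sum(aucs) / len(aucs)


# Toy data only: 1 = incident case, 0 = no incident case; scores are made up.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75: 3 of 4 pairs ranked correctly
```

Because AUC depends only on how cases are ranked relative to non-cases, it is largely insensitive to class prevalence, which is consistent with the abstract's remark that results were similar in the balanced and the entire (unbalanced) samples.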

Research Organization: Brookhaven National Laboratory (BNL), Upton, NY (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21); Seoul National University; NHS Ilsan Hospital Research Support Program; National Institutes of Health (NIH); Brain Behavior Research Foundation Young Investigator Award; Korean Scientists and Engineers Association Young Investigator Grant; National Research Foundation of Korea (NRF); Ministry of Science

Grant/Contract Number: SC0012704; K01-MH109836

OSTI ID: 1618403

Report Number(s): BNL-215918-2020-JAAM

Journal Information: npj Digital Medicine, Vol. 3, Issue 1; ISSN 2398-6352

Publisher: Springer Nature

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
Alzheimer's disease
predictive markers
