Identifying schools at high-risk for elevated lead in drinking water using only publicly available data
Abstract
Estimating the risk of lead contamination of schools' drinking water at the State level is a complex, important, and unexplored challenge. Variable water quality among water systems and changes in water chemistry during distribution affect lead dissolution rates from pipes and fittings. In addition, the locations of lead-bearing plumbing materials are uncertain. We tested the capability of six machine learning models to predict the likelihood of lead contamination of drinking water at the schools' taps using only publicly available datasets. The predictive features used in the models correspond to those with a proven correlation to the dominant, but commonly unavailable, factors that govern lead leaching: the presence of lead-bearing plumbing materials and water quality conducive to lead corrosion. By combining water chemistry data from public reports, socioeconomic information from the US census, and spatial features using Geographic Information Systems, we trained and tested models to estimate the likelihood of lead contaminated tap water in over 8,000 schools across California and Massachusetts. Our best-performing model was a Random Forest, with a 10-fold cross validation score of 0.88 for Massachusetts and 0.78 for California using the average Area Under the Receiver Operating Characteristic Curve (ROC AUC) metric. The model was then used to assign a lead leaching risk category to half of the schools across California (the other half was used for training). There was good agreement between the modeled risk categories and the actual lead leaching outcomes for every school; however, the model overestimated the lead leaching risk in up to 17% of the schools. This model is the first of its kind to offer a tool to predict the risk of lead leaching in schools at the State level. Further use of this model can help deploy limited resources more effectively to prevent childhood lead exposure from school drinking water.
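The evaluation metric named in the abstract, the ROC AUC averaged over 10 cross-validation folds, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the function names are hypothetical, and a simple nearest-centroid scorer stands in for the Random Forest so that the example needs only the Python standard library. It is not the authors' pipeline and does not reproduce the reported scores.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """ROC AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with a roughly equal class ratio in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_centroids(X, y):
    """Placeholder model (NOT a Random Forest): store the per-class feature means."""
    def centroid(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [mean(col) for col in zip(*rows)]
    return centroid(0), centroid(1)


def predict_score(model, x):
    """Higher score means x is closer to the positive centroid than to the negative one."""
    c0, c1 = model
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return d0 - d1


def cross_val_auc(X, y, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the k ROC AUC values."""
    fold_aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [predict_score(model, X[i]) for i in test_idx]
        fold_aucs.append(roc_auc([y[i] for i in test_idx], scores))
    return mean(fold_aucs)


# Synthetic stand-in data: 400 "schools", 25% labeled positive,
# one weakly informative feature and one pure-noise feature.
rng = random.Random(42)
y = [1 if i % 4 == 0 else 0 for i in range(400)]
X = [[rng.gauss(1.0 if label else 0.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]

mean_auc = cross_val_auc(X, y, k=10)
print(f"mean 10-fold ROC AUC on synthetic data: {mean_auc:.2f}")
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.88 and 0.78 values quoted above. With scikit-learn available, the same loop would more commonly be written with a `RandomForestClassifier` passed to `cross_val_score(..., cv=10, scoring="roc_auc")`.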
- Authors:
- Lobo, G. P.; Laraway, J.; Gadgil, A. J.
- Univ. of California, Berkeley, CA (United States)
- Publication Date:
- 2021-09-03
- Research Org.:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC); National Science Foundation (NSF)
- OSTI Identifier:
- 1821159
- Grant/Contract Number:
- AC02-05CH11231; DGE-1633740
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Science of the Total Environment
- Additional Journal Information:
- Journal Volume: 803; Journal ID: ISSN 0048-9697
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 63 RADIATION, THERMAL, AND OTHER ENVIRON. POLLUTANT EFFECTS ON LIVING ORGS. AND BIOL. MAT.; lead in school drinking water; lead leaching; machine learning; environmental justice; public data mining
Citation Formats
Lobo, G. P., Laraway, J., and Gadgil, A. J. Identifying schools at high-risk for elevated lead in drinking water using only publicly available data. United States: N. p., 2021. Web. doi:10.1016/j.scitotenv.2021.150046.
Lobo, G. P., Laraway, J., & Gadgil, A. J. Identifying schools at high-risk for elevated lead in drinking water using only publicly available data. United States. https://doi.org/10.1016/j.scitotenv.2021.150046
Lobo, G. P., Laraway, J., and Gadgil, A. J. 2021. "Identifying schools at high-risk for elevated lead in drinking water using only publicly available data". United States. https://doi.org/10.1016/j.scitotenv.2021.150046. https://www.osti.gov/servlets/purl/1821159.
@article{osti_1821159,
title = {Identifying schools at high-risk for elevated lead in drinking water using only publicly available data},
author = {Lobo, G. P. and Laraway, J. and Gadgil, A. J.},
abstractNote = {Estimating the risk of lead contamination of schools' drinking water at the State level is a complex, important, and unexplored challenge. Variable water quality among water systems and changes in water chemistry during distribution affect lead dissolution rates from pipes and fittings. In addition, the locations of lead-bearing plumbing materials are uncertain. We tested the capability of six machine learning models to predict the likelihood of lead contamination of drinking water at the schools' taps using only publicly available datasets. The predictive features used in the models correspond to those with a proven correlation to the dominant, but commonly unavailable, factors that govern lead leaching: the presence of lead-bearing plumbing materials and water quality conducive to lead corrosion. By combining water chemistry data from public reports, socioeconomic information from the US census, and spatial features using Geographic Information Systems, we trained and tested models to estimate the likelihood of lead contaminated tap water in over 8,000 schools across California and Massachusetts. Our best-performing model was a Random Forest, with a 10-fold cross validation score of 0.88 for Massachusetts and 0.78 for California using the average Area Under the Receiver Operating Characteristic Curve (ROC AUC) metric. The model was then used to assign a lead leaching risk category to half of the schools across California (the other half was used for training). There was good agreement between the modeled risk categories and the actual lead leaching outcomes for every school; however, the model overestimated the lead leaching risk in up to 17% of the schools. This model is the first of its kind to offer a tool to predict the risk of lead leaching in schools at the State level. Further use of this model can help deploy limited resources more effectively to prevent childhood lead exposure from school drinking water.},
doi = {10.1016/j.scitotenv.2021.150046},
journal = {Science of the Total Environment},
volume = 803,
place = {United States},
year = {2021},
month = {9}
}