DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using machine learning to derive cloud condensation nuclei number concentrations from commonly available measurements

Abstract

Cloud condensation nuclei (CCN) number concentrations are an important aspect of aerosol–cloud interactions and the subsequent climate effects; however, their measurements are very limited. We use a machine learning tool, random decision forests, to develop a random forest regression model (RFRM) to derive CCN at 0.4 % supersaturation ([CCN0.4]) from commonly available measurements. The RFRM is trained on the long-term simulations in a global size-resolved particle microphysics model. Using atmospheric state and composition variables as predictors, through associations of their variabilities, the RFRM is able to learn the underlying dependence of [CCN0.4] on these predictors, which are as follows: eight fractions of PM2.5 (NH4, SO4, NO3, secondary organic aerosol (SOA), black carbon (BC), primary organic carbon (POC), dust, and salt), seven gaseous species (NOx, NH3, O3, SO2, OH, isoprene, and monoterpene), and four meteorological variables (temperature (T), relative humidity (RH), precipitation, and solar radiation). The RFRM is highly robust: it has a median mean fractional bias (MFB) of 4.4 % with ≈96.33 % of the derived [CCN0.4] within a good agreement range of −60 % < MFB < +60 % and strong correlation of Kendall's τ coefficient ≈0.88. The RFRM demonstrates its robustness over 4 orders of magnitude of [CCN0.4] over varying spatial (such as continental to oceanic, clean to polluted, and near-surface to upper troposphere) and temporal (from the hourly to the decadal) scales. At the Atmospheric Radiation Measurement Southern Great Plains observatory (ARM SGP) in Lamont, Oklahoma, United States, long-term measurements for PM2.5 speciation (NH4, SO4, NO3, and organic carbon (OC)), NOx, O3, SO2, T, and RH, as well as [CCN0.4] are available. We modify, optimize, and retrain the developed RFRM to make predictions from 19 to 9 of these available predictors. This retrained RFRM (RFRM-ShortVars) shows a reduction in performance due to the unavailability and sparsity of measurements (predictors); it captures the [CCN0.4] variability and magnitude at SGP with ≈67.02 % of the derived values in the good agreement range. This work shows the potential of using the more commonly available measurements of PM2.5 speciation to alleviate the sparsity of CCN number concentrations' measurements.
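
The three evaluation statistics quoted in the abstract (mean fractional bias, the share of derived values inside the ±60 % agreement range, and Kendall's τ) can be illustrated with a minimal Python sketch. The abstract does not spell out the formulas, so this assumes the commonly used fractional bias 2(d − r)/(d + r) for a derived value d and a reference value r, and the simple tau-a form of Kendall's coefficient with no tie correction. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def fractional_bias(derived, reference):
    """Per-sample fractional bias 2*(d - r)/(d + r) (assumed definition)."""
    return [2.0 * (d - r) / (d + r) for d, r in zip(derived, reference)]


def mean_fractional_bias(derived, reference):
    """Average of the per-sample fractional biases."""
    fb = fractional_bias(derived, reference)
    return sum(fb) / len(fb)


def fraction_within(derived, reference, limit=0.60):
    """Share of samples whose fractional bias lies within +/- limit."""
    fb = fractional_bias(derived, reference)
    return sum(1 for v in fb if abs(v) <= limit) / len(fb)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)


# Made-up concentrations in cm^-3, for illustration only.
reference = [50.0, 120.0, 400.0, 900.0, 2500.0]
derived = [60.0, 100.0, 450.0, 2600.0, 2300.0]

print("MFB:", mean_fractional_bias(derived, reference))
print("fraction within +/-60 %:", fraction_within(derived, reference))
print("Kendall tau-a:", kendall_tau_a(derived, reference))
```

In this toy sample, four of the five derived values fall inside the agreement range and one pair of points is ranked in the opposite order, so both the within-range fraction and tau-a come out at 0.8. For large data sets a library routine such as scipy.stats.kendalltau would be the more practical choice than the quadratic loop used here.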

Authors:
Nair, Arshad Arjunan [1]; Yu, Fangqun [1]
  1. State Univ. of New York (SUNY), Albany, NY (United States)
Publication Date:
2020-11-05
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Atmospheric Radiation Measurement (ARM) Data Center
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF); National Aeronautics and Space Administration (NASA); New York State Energy Research and Development Authority
OSTI Identifier:
1725776
Grant/Contract Number:  
AC05-76RL01830; AGS-1550816; NNX17AG35G; 137487
Resource Type:
Accepted Manuscript
Journal Name:
Atmospheric Chemistry and Physics (Online)
Additional Journal Information:
Journal Name: Atmospheric Chemistry and Physics (Online); Journal Volume: 20; Journal Issue: 21; Journal ID: ISSN 1680-7324
Publisher:
Copernicus Publications, EGU
Country of Publication:
United States
Language:
English
Subject:
54 ENVIRONMENTAL SCIENCES

Citation Formats

Nair, Arshad Arjunan, and Yu, Fangqun. Using machine learning to derive cloud condensation nuclei number concentrations from commonly available measurements. United States: N. p., 2020. Web. doi:10.5194/acp-20-12853-2020.
Nair, Arshad Arjunan, & Yu, Fangqun. (2020). Using machine learning to derive cloud condensation nuclei number concentrations from commonly available measurements. United States. https://doi.org/10.5194/acp-20-12853-2020
Nair, Arshad Arjunan, and Yu, Fangqun. 2020. "Using machine learning to derive cloud condensation nuclei number concentrations from commonly available measurements". United States. https://doi.org/10.5194/acp-20-12853-2020. https://www.osti.gov/servlets/purl/1725776.
@article{osti_1725776,
title = {Using machine learning to derive cloud condensation nuclei number concentrations from commonly available measurements},
author = {Nair, Arshad Arjunan and Yu, Fangqun},
doi = {10.5194/acp-20-12853-2020},
journal = {Atmospheric Chemistry and Physics (Online)},
number = 21,
volume = 20,
place = {United States},
year = {2020},
month = {11}
}
