OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Comparison of Machine Learning Methods to Forecast Tropospheric Ozone Levels in Delhi

Journal Article · Atmosphere (Basel)

Ground-level ozone is a pollutant that is harmful to urban populations, particularly in developing countries where it is present in significant quantities. It greatly increases the risk of heart and lung diseases and harms agricultural crops. This study hypothesized that, as a secondary pollutant, ground-level ozone is amenable to 24 h forecasting based on measurements of weather conditions and primary pollutants such as nitrogen oxides and volatile organic compounds. We developed software to analyze hourly records of 12 air pollutants and 5 weather variables over the course of one year in Delhi, India. To determine the best predictive model, eight machine learning algorithms were tuned, trained, tested, and compared using cross-validation with hourly data for a full year. Ranked by R² for full-year training, seven of the algorithms were XGBoost (0.61), Random Forest (0.61), K-Nearest Neighbor Regression (0.55), Support Vector Regression (0.48), Decision Trees (0.43), AdaBoost (0.39), and linear regression (0.39). The eighth, Bidirectional Long Short-Term Memory, was the least accurate model for annual training but gave some of the best predictions for seasonal training. When the models were trained on separate seasons across five years, the predictive capabilities of all models increased, with a maximum R² of 0.75 during winter. Out of five air quality index categories, the XGBoost model predicted the correct category 24 h in advance 90% of the time when trained with full-year data. When the data were separated by season, winter was the most predictable (97.3%), followed by post-monsoon (92.8%), monsoon (90.3%), and summer (88.9%). These results show the importance of training machine learning methods with season-specific data sets and of comparing a large number of methods for specific applications.
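The two scores quoted in the abstract, R² and the share of correctly predicted air quality index categories, can be illustrated with a minimal Python sketch. This is not the authors' software. The function names, the 24-step pairing helper, and especially the category breakpoints are assumptions made for illustration; the abstract does not state the index boundaries that were used.

```python
from typing import List, Sequence, Tuple


def make_24h_pairs(features: Sequence[Sequence[float]],
                   ozone: Sequence[float],
                   horizon: int = 24) -> Tuple[List[Sequence[float]], List[float]]:
    """Pair the hourly features at hour t with the ozone value at hour t + horizon."""
    n = len(ozone) - horizon
    inputs = [features[t] for t in range(n)]
    targets = [ozone[t + horizon] for t in range(n)]
    return inputs, targets


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical upper bounds separating five categories (assumption:
# the real index breakpoints are not given in the abstract).
BREAKPOINTS = (50.0, 100.0, 150.0, 200.0)


def category(value: float, breakpoints: Sequence[float] = BREAKPOINTS) -> int:
    """Return the 0-based index of the category that contains value."""
    return sum(value > b for b in breakpoints)


def category_accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of forecasts that fall in the same category as the observation."""
    hits = sum(category(t) == category(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


if __name__ == "__main__":
    observed = [40.0, 60.0, 120.0, 180.0]   # made-up ozone values
    forecast = [45.0, 55.0, 140.0, 210.0]   # made-up 24 h forecasts
    print(r2_score(observed, forecast))
    print(category_accuracy(observed, forecast))
```

A forecast can land in the correct category while still missing the exact concentration, which is one plausible reason the category hit rate in the abstract (about 90%) is much higher than the R² values (0.61 at best for full-year training).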

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
89233218CNA000001
OSTI ID:
1853933
Report Number(s):
LA-UR-21-31571
Journal Information:
Atmosphere (Basel), Vol. 13, Issue 1; ISSN 2073-4433
Publisher:
MDPI
Country of Publication:
United States
Language:
English
