Title: Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon: Assessing potential of airborne and spaceborne optical soil sensing

Abstract

Soil organic carbon (SOC) is a key variable to determine soil functioning, ecosystem services, and global carbon cycles. Spectroscopy, particularly optical hyperspectral reflectance coupled with machine learning, can provide rapid, efficient, and cost-effective quantification of SOC. However, how to exploit soil hyperspectral reflectance to predict SOC concentration, and the potential performance of airborne and satellite data for predicting surface SOC at large scales remain relatively underknown. Here, this study utilized a continental-scale soil laboratory spectral library (37,540 full-pedon 350–2500 nm reflectance spectra with SOC concentration of 0–780 g·kg⁻¹ across the US) to thoroughly evaluate seven machine learning algorithms including Partial-Least Squares Regression (PLSR), Random Forest (RF), K-Nearest Neighbors (KNN), Ridge, Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) along with four preprocessed spectra, i.e. original, vector normalization, continuum removal, and first-order derivative, to quantify SOC concentration. Furthermore, by using the coupled soil-vegetation-atmosphere radiative transfer model, we simulated twelve airborne and spaceborne hyper/multi-spectral remote sensing data from surface bare soil laboratory spectra to evaluate their potential for estimating SOC concentration of surface bare soils. Results show that LSTM achieved best predictive performance of quantifying SOC concentration for the whole data sets (R² = 0.96, RMSE = 30.81 g·kg⁻¹), mineral soils (SOC ≤ 120 g·kg⁻¹, R² = 0.71, RMSE = 10.60 g·kg⁻¹), and organic soils (SOC > 120 g·kg⁻¹, R² = 0.78, RMSE = 62.31 g·kg⁻¹). Spectral data preprocessing, particularly the first-order derivative, improved the performance of PLSR, RF, Ridge, KNN, and ANN, but not LSTM or CNN. We found that the SOC models of mineral and organic soils should be distinguished given their distinct spectral signatures. Finally, we identified that the shortwave infrared is vital for airborne and spaceborne hyperspectral sensors to monitor surface SOC. This study highlights the high accuracy of LSTM with hyperspectral/multispectral data to mitigate a certain level of noise (soil moisture < 0.4 m³·m⁻³, green leaf area < 0.3 m²·m⁻², plant residue < 0.4 m²·m⁻²) for quantifying surface SOC concentration. Forthcoming satellite hyperspectral missions like Surface Biology and Geology (SBG) have a high potential for future global soil carbon monitoring, while high-resolution satellite multispectral fusion data can be an alternative.
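
Two of the preprocessing steps and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic spectra, the 10 nm sampling step, and the use of NumPy finite differences for the first-order derivative are all assumptions made for illustration. Only the 350–2500 nm range, the 120 g/kg mineral/organic threshold, and the R² and RMSE metrics come from the abstract.

```python
import numpy as np


def vector_normalize(spectra):
    """Scale each spectrum (one row per sample) to unit Euclidean length."""
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    return spectra / norms


def first_derivative(spectra, wavelengths):
    """Finite-difference derivative of reflectance with respect to wavelength."""
    return np.gradient(spectra, wavelengths, axis=1)


def r2_rmse(y_true, y_pred):
    """Return (coefficient of determination, root-mean-square error)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return 1.0 - ss_res / ss_tot, rmse


def split_mineral_organic(soc, threshold=120.0):
    """Boolean masks for mineral (SOC <= threshold) and organic (SOC > threshold) soils."""
    soc = np.asarray(soc, dtype=float)
    mineral = soc <= threshold
    return mineral, ~mineral


# Synthetic stand-in data (hypothetical): 5 spectra sampled every 10 nm over 350-2500 nm.
wavelengths = np.arange(350.0, 2501.0, 10.0)
rng = np.random.default_rng(0)
spectra = 0.2 + 0.1 * rng.random((5, wavelengths.size))
soc = np.array([5.0, 80.0, 120.0, 300.0, 650.0])  # g/kg, made-up values

normalized = vector_normalize(spectra)
derivative = first_derivative(spectra, wavelengths)
mineral_mask, organic_mask = split_mineral_organic(soc)
```

In a workflow of this kind, the preprocessed arrays would be passed to a regression model fitted separately on the mineral and organic subsets, with `r2_rmse` applied to held-out predictions for each subset.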

Authors:
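Wang, Sheng [1]; Guan, Kaiyu [1]; Zhang, Chenhui [1]; Lee, DoKyoung [1]; Margenot, Andrew J. [1]; Ge, Yufeng [2]; Peng, Jian [1]; Zhou, Wang [1]; Zhou, Qu [1]; Huang, Yizhi [1]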
  1. Univ. of Illinois at Urbana-Champaign, IL (United States)
  2. Univ. of Nebraska, Lincoln, NE (United States)
Publication Date:
2022-02-02
Research Org.:
Univ. of Illinois at Urbana-Champaign, IL (United States)
Sponsoring Org.:
USDOE Advanced Research Projects Agency - Energy (ARPA-E); USDA
OSTI Identifier:
1977612
Grant/Contract Number:  
AR0001382
Resource Type:
Accepted Manuscript
Journal Name:
Remote Sensing of Environment
Additional Journal Information:
Journal Volume: 271; Journal Issue: C; Journal ID: ISSN 0034-4257
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
54 ENVIRONMENTAL SCIENCES; Spectroscopy; Soil organic carbon; Hyperspectral reflectance; Radiative transfer modeling; Machine learning; Long short-term memory; SBG

Citation Formats

Wang, Sheng, Guan, Kaiyu, Zhang, Chenhui, Lee, DoKyoung, Margenot, Andrew J., Ge, Yufeng, Peng, Jian, Zhou, Wang, Zhou, Qu, and Huang, Yizhi. Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon: Assessing potential of airborne and spaceborne optical soil sensing. United States: N. p., 2022. Web. doi:10.1016/j.rse.2022.112914.
Wang, Sheng, Guan, Kaiyu, Zhang, Chenhui, Lee, DoKyoung, Margenot, Andrew J., Ge, Yufeng, Peng, Jian, Zhou, Wang, Zhou, Qu, & Huang, Yizhi. Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon: Assessing potential of airborne and spaceborne optical soil sensing. United States. https://doi.org/10.1016/j.rse.2022.112914
Wang, Sheng, Guan, Kaiyu, Zhang, Chenhui, Lee, DoKyoung, Margenot, Andrew J., Ge, Yufeng, Peng, Jian, Zhou, Wang, Zhou, Qu, and Huang, Yizhi. 2022. "Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon: Assessing potential of airborne and spaceborne optical soil sensing". United States. https://doi.org/10.1016/j.rse.2022.112914. https://www.osti.gov/servlets/purl/1977612.
@article{osti_1977612,
title = {Using soil library hyperspectral reflectance and machine learning to predict soil organic carbon: Assessing potential of airborne and spaceborne optical soil sensing},
author = {Wang, Sheng and Guan, Kaiyu and Zhang, Chenhui and Lee, DoKyoung and Margenot, Andrew J. and Ge, Yufeng and Peng, Jian and Zhou, Wang and Zhou, Qu and Huang, Yizhi},
doi = {10.1016/j.rse.2022.112914},
journal = {Remote Sensing of Environment},
number = {C},
volume = {271},
place = {United States},
year = {2022},
month = feb
}
