OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine Learning-Based Classification of Lignocellulosic Biomass from Pyrolysis-Molecular Beam Mass Spectrometry Data

Journal Article · International Journal of Molecular Sciences (Online)
DOI: https://doi.org/10.3390/ijms22084107 · OSTI ID: 1777643

High-throughput analysis of biomass is necessary to ensure consistent and uniform feedstocks for agricultural and bioenergy applications and is needed to inform genomics and systems biology models. Pyrolysis followed by mass spectrometry, such as pyrolysis-molecular beam mass spectrometry (py-MBMS), is becoming increasingly popular for the rapid analysis of biomass cell wall composition and typically requires different data analysis tools depending on the need and application. Here, the authors report the py-MBMS analysis of several types of lignocellulosic biomass to gain an understanding of spectral patterns and their variation with biomass composition, and use machine learning approaches to classify, differentiate, and predict biomass types on the basis of py-MBMS spectra. Py-MBMS spectra were also corrected for instrumental variance using generalized linear modeling (GLM), with the relative abundances of select ions serving as spike-in controls. Machine learning classification algorithms, e.g., random forest, k-nearest neighbor, decision tree, Gaussian Naïve Bayes, gradient boosting, and multilayer perceptron classifiers, were used. The k-nearest neighbors (k-NN) classifier generally performed the best for classifications using raw spectral data, and the decision tree classifier performed the worst. After normalization of spectra to account for instrumental variance, all the classifiers had comparable and generally acceptable performance for predicting the biomass types, although the k-NN and decision tree classifiers were not as accurate for prediction of specific sample types. Gaussian Naïve Bayes (GNB) and extreme gradient boosting (XGB) classifiers performed better than the k-NN and the decision tree classifiers for the prediction of biomass mixtures.
The data analysis workflow reported here could be applied and extended to compare biomass samples of varying types, species, phenotypes, and/or genotypes, or samples subjected to different treatments, environments, etc., to further elucidate the sources of spectral variance and patterns and to infer compositional information from spectral analysis, particularly for data without a priori knowledge of the feedstock composition or identity.
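
The normalize-then-classify idea in the abstract can be illustrated with a minimal sketch. It is not the authors' workflow: the GLM-based correction is replaced here by a much simpler division by a single control-ion intensity, only a hand-written k-NN classifier is shown, and the spectra, class names (`type_A`, `type_B`), ion profiles, and gain range are all synthetic assumptions made up for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical relative ion intensities for two synthetic sample types.
# Index 0 plays the role of a control ion with the same abundance in both.
PROFILES = {
    "type_A": [1.0, 0.8, 0.3, 0.5],
    "type_B": [1.0, 0.4, 0.7, 0.5],
}
CONTROL_IDX = 0


def make_spectrum(profile, rng):
    """Synthetic spectrum: class profile scaled by a random instrument gain, plus 2% noise."""
    gain = rng.uniform(0.5, 2.0)
    return [gain * p * (1.0 + rng.gauss(0.0, 0.02)) for p in profile]


def normalize_by_control(spectrum, control_idx=CONTROL_IDX):
    """Simplified stand-in for the GLM correction: divide by the control-ion intensity."""
    ref = spectrum[control_idx]
    return [x / ref for x in spectrum]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training spectra nearest in Euclidean distance."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test spectra whose predicted label matches the true label."""
    train_X = [x for x, _ in train]
    train_y = [y for _, y in train]
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in test)
    return hits / len(test)


def evaluate(seed=0, n_train=20, n_test=10):
    """Return k-NN accuracy on raw and on control-normalized synthetic spectra."""
    rng = random.Random(seed)
    train, test = [], []
    for label, profile in PROFILES.items():
        train += [(make_spectrum(profile, rng), label) for _ in range(n_train)]
        test += [(make_spectrum(profile, rng), label) for _ in range(n_test)]

    def normalize(data):
        return [(normalize_by_control(x), y) for x, y in data]

    return accuracy(train, test), accuracy(normalize(train), normalize(test))


raw_acc, norm_acc = evaluate()
print(f"raw: {raw_acc:.2f}  normalized: {norm_acc:.2f}")
```

In this toy setup the random gain shifts whole spectra up or down, so distances between raw spectra mix instrument effects with compositional differences; dividing by the control ion removes the gain before the distance is computed. How closely this mirrors the behavior of real py-MBMS data is not something the sketch can show.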

Research Organization:
National Renewable Energy Laboratory (NREL), Golden, CO (United States); USDOE Bioenergy Research Centers (BRC) (United States). Center for Bioenergy Innovation (CBI)
Sponsoring Organization:
USDOE Office of Energy Efficiency and Renewable Energy (EERE); USDOE Office of Energy Efficiency and Renewable Energy (EERE), Transportation Office. Bioenergy Technologies Office; USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC36-08GO28308; NA
OSTI ID:
1777643
Alternate ID(s):
OSTI ID: 1781625
Report Number(s):
NREL/JA-2800-79514; IJMCFK; PII: ijms22084107
Journal Information:
Journal Name: International Journal of Molecular Sciences (Online); Journal Volume: 22; Journal Issue: 8; ISSN 1422-0067
Publisher:
MDPI AG
Country of Publication:
Switzerland
Language:
English
