Title: Nonnegative tensor factorization for contaminant source identification

Abstract

Unsupervised Machine Learning (ML) is becoming increasingly popular for solving various types of data analytics problems including feature extraction, blind source separation, exploratory analyses, model diagnostics, etc. In this work, we have developed a new unsupervised ML method based on Nonnegative Tensor Factorization (NTF) for identification of the original groundwater types (including contaminant sources) present in geochemical mixtures observed in an aquifer. Frequently, groundwater types with different geochemical signatures are related to different background and/or contamination sources. The characterization of groundwater mixing processes is a challenging but very important task critical for any environmental management project aiming to characterize the fate and transport of contaminants in the subsurface and perform contaminant remediation. This task typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models, which further complicates the analyses. Additionally, the application of inverse methods may introduce biases in the analyses through the assumptions made in the model development process. Here, we substitute the model inversion with unsupervised ML analysis. The ML analysis does not make any assumptions about underlying physical and geochemical processes occurring in the aquifer.
Our ML methodology, called NTFk, is capable of identifying (1) the unknown number of groundwater types (contaminant sources) present in the aquifer, (2) the original geochemical concentrations (signatures) of these groundwater types and (3) spatial and temporal dynamics in the mixing of these groundwater types. These results are obtained only from the measured geochemical data without any additional site information. In general, the NTFk methodology allows for interpretation of large high-dimensional datasets representing diverse spatial and temporal components such as state variables and velocities. NTFk has been tested on synthetic and real-world site three-dimensional datasets. Finally, the NTFk algorithm is designed to work with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).
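
The three outputs listed in the abstract (number of sources, source signatures, and mixing in space and time) can be illustrated with a small nonnegative tensor factorization. The sketch below is not the NTFk implementation: it swaps in a plain nonnegative CP (PARAFAC) model fitted with multiplicative updates, whereas the record's subject terms point to a Tucker decomposition. The tensor layout (wells x constituents x times), the synthetic data, and all function names (`unfold`, `khatri_rao`, `nonneg_cp`, `reconstruct`, `relative_error`, `synthetic_mixtures`) are assumptions made for illustration. Comparing reconstruction error across candidate ranks is only a crude stand-in for estimating the number of sources; this record does not describe how NTFk makes that choice.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest (C order)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), shape (I*J x R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])


def nonneg_cp(tensor, rank, n_iter=500, seed=0, eps=1e-12):
    """Nonnegative CP factorization of a 3-way tensor via multiplicative updates.

    Assumes `tensor` is nonnegative. Returns one factor matrix per mode.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            numer = unfold(tensor, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            # Multiplying by a nonnegative ratio keeps the factor nonnegative.
            factors[mode] *= numer / denom
    return factors


def reconstruct(factors):
    """Rebuild the 3-way tensor from its three CP factor matrices."""
    return np.einsum("ir,jr,kr->ijk", *factors)


def relative_error(tensor, factors):
    """Frobenius-norm misfit of the reconstruction, relative to the data norm."""
    return np.linalg.norm(tensor - reconstruct(factors)) / np.linalg.norm(tensor)


def synthetic_mixtures(n_wells=6, n_constituents=5, n_times=8, n_sources=2, seed=1):
    """Hypothetical noise-free data: a nonnegative tensor built from n_sources."""
    rng = np.random.default_rng(seed)
    true = [rng.random((n, n_sources)) for n in (n_wells, n_constituents, n_times)]
    return reconstruct(true)


# Sweep candidate numbers of sources and report the misfit for each.
data = synthetic_mixtures()
for rank in (1, 2, 3):
    fitted = nonneg_cp(data, rank)
    print(rank, relative_error(data, fitted))
```

In this layout the second factor matrix plays the role of the source signatures (one column per source across constituents), while the first and third describe how each source contributes across wells and over time. CP factors are only determined up to scaling and ordering of the components, so the columns need normalization before they can be read as concentrations.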

Authors:
Vesselinov, Velimir Valentinov [1]; Alexandrov, Boian S. [1]; O'Malley, Daniel [1]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1489953
Report Number(s):
LA-UR-18-22259
Journal ID: ISSN 0169-7722
Grant/Contract Number:  
89233218CNA000001
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Contaminant Hydrology
Additional Journal Information:
Journal Volume: 220; Journal ID: ISSN 0169-7722
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
58 GEOSCIENCES; 97 MATHEMATICS AND COMPUTING; Earth Sciences; Mathematics; Nonnegative tensor factorization; Tucker decomposition; Optimal Tucker decomposition; Feature Extraction; Exploratory analysis; Blind Source Separation; Robustness analysis; Semi-supervised learning

Citation Formats

Vesselinov, Velimir Valentinov, Alexandrov, Boian S., and O'Malley, Daniel. Nonnegative tensor factorization for contaminant source identification. United States: N. p., 2018. Web. doi:10.1016/j.jconhyd.2018.11.010.
Vesselinov, Velimir Valentinov, Alexandrov, Boian S., & O'Malley, Daniel. Nonnegative tensor factorization for contaminant source identification. United States. doi:10.1016/j.jconhyd.2018.11.010.
Vesselinov, Velimir Valentinov, Alexandrov, Boian S., and O'Malley, Daniel. "Nonnegative tensor factorization for contaminant source identification". United States. doi:10.1016/j.jconhyd.2018.11.010. https://www.osti.gov/servlets/purl/1489953.
@article{osti_1489953,
title = {Nonnegative tensor factorization for contaminant source identification},
author = {Vesselinov, Velimir Valentinov and Alexandrov, Boian S. and O'Malley, Daniel},
abstractNote = {Unsupervised Machine Learning (ML) is becoming increasingly popular for solving various types of data analytics problems including feature extraction, blind source separation, exploratory analyses, model diagnostics, etc. In this work, we have developed a new unsupervised ML method based on Nonnegative Tensor Factorization (NTF) for identification of the original groundwater types (including contaminant sources) present in geochemical mixtures observed in an aquifer. Frequently, groundwater types with different geochemical signatures are related to different background and/or contamination sources. The characterization of groundwater mixing processes is a challenging but very important task critical for any environmental management project aiming to characterize the fate and transport of contaminants in the subsurface and perform contaminant remediation. This task typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. Additionally, the application of inverse methods may introduce biases in the analyses through the assumptions made in the model development process. Here, we substitute the model inversion with unsupervised ML analysis. The ML analysis does not make any assumptions about underlying physical and geochemical processes occurring in the aquifer. 
Our ML methodology, called NTFk, is capable of identifying (1) the unknown number of groundwater types (contaminant sources) present in the aquifer, (2) the original geochemical concentrations (signatures) of these groundwater types and (3) spatial and temporal dynamics in the mixing of these groundwater types. These results are obtained only from the measured geochemical data without any additional site information. In general, the NTFk methodology allows for interpretation of large high-dimensional datasets representing diverse spatial and temporal components such as state variables and velocities. NTFk has been tested on synthetic and real-world site three-dimensional datasets. Finally, the NTFk algorithm is designed to work with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).},
doi = {10.1016/j.jconhyd.2018.11.010},
journal = {Journal of Contaminant Hydrology},
volume = 220,
place = {United States},
year = {2018},
month = {12}
}
