This content will become publicly available on November 8, 2018

Title: Contaminant source identification using semi-supervised machine learning

Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous geochemical constituents and processes may need to be simulated in these models, which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that decomposes the observed mixtures using the Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentrations of the contaminant sources from measured geochemical mixtures with unknown mixing ratios, without any additional site information. NMFk is tested on synthetic and real-world site data. Finally, the NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).
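The decomposition idea in the abstract can be illustrated with a minimal sketch. This is not the authors' NMFk implementation: it assumes a plain NMF with standard multiplicative updates, omits the semi-supervised clustering step entirely, and uses synthetic data. The function names (`nmf`, `relative_error`), the problem sizes, and the choice of reconstruction error as the only diagnostic are all illustrative assumptions.

```python
import numpy as np

def nmf(X, k, n_iter=2000, seed=0, eps=1e-9):
    """Factor X (samples x constituents) as W @ H with W, H >= 0,
    using multiplicative updates for the Frobenius-norm objective.
    Hypothetical helper for illustration only."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps   # mixing coefficients per sample
    H = rng.random((k, m)) + eps   # source signatures (concentrations)
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(X, W, H):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Synthetic example (assumed sizes): 3 sources, 6 constituents, 20 wells.
rng = np.random.default_rng(42)
true_sources = rng.random((3, 6)) * 10.0        # source concentrations
mixing = rng.dirichlet(np.ones(3), size=20)     # mixing ratios sum to 1
X = mixing @ true_sources                       # observed mixtures

# Try several candidate numbers of sources and record the misfit.
errors = {}
for k in range(1, 6):
    W, H = nmf(X, k)
    errors[k] = relative_error(X, W, H)

for k, err in errors.items():
    print(f"k={k}: relative reconstruction error = {err:.4f}")
```

In this sketch the reconstruction error is the only guide to the number of sources. The abstract states that NMFk couples the factorization with a custom semi-supervised clustering algorithm to identify that number, which this example does not attempt to reproduce.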
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
Report Number(s):
Journal ID: ISSN 0169-7722; TRN: US1703077
Grant/Contract Number:
Accepted Manuscript
Journal Name:
Journal of Contaminant Hydrology
Additional Journal Information:
Journal Volume: 212; Journal ID: ISSN 0169-7722
Research Org:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org:
USDOE Office of Environmental Management (EM)
Country of Publication:
United States
Subject:
54 ENVIRONMENTAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Earth Sciences; Mathematics; Non-negative matrix factorization; Feature Extraction; Blind Source Separation; Robustness analysis; Semi-supervised learning; Groundwater contamination; Source identification; Advection-diffusion transport; Geochemical signatures
OSTI Identifier: