U.S. Department of Energy
Office of Scientific and Technical Information

Chaconne: A Statistical Approach to Nonlocal Compression for Supervised Learning, Semi-Supervised Learning, and Anomaly Detection

Technical Report · DOI: https://doi.org/10.2172/2430240 · OSTI ID: 2430240
 [1];  [1];  [1];  [1];  [1];  [2];  [2]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  2. Univ. of Illinois at Urbana-Champaign, IL (United States)
This project developed a novel statistical understanding of compression analytics (CA) that challenged and clarified some core assumptions about CA and enabled new techniques addressing vital national security challenges. Specifically, the project yielded the following capabilities:
  1. Principled metrics for model selection in CA.
  2. Techniques for deriving and applying optimal classification rules and decision theory in supervised CA, including the proper handling of class imbalance and differing costs of misclassification.
  3. Two techniques for handling nonlocal information in CA.
  4. A technique for unsupervised CA that is agnostic to the underlying compression algorithm.
  5. A framework for semi-supervised CA when only a small number of labels are known in an otherwise large unlabeled dataset.
  6. An exemplar-based Bayesian technique for estimating variable-length Markov models (closely related to prediction by partial matching [PPM] compression techniques), developed through the academic alliance component of the project.
We developed examples illustrating the application of this work to text, video, genetic sequences, and unstructured cybersecurity log files.
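To make the idea of supervised CA with class imbalance and unequal misclassification costs more concrete, the following is a minimal illustrative sketch, not the report's method. It assumes zlib as a stand-in compressor, treats the extra bits needed to compress a sample after a class corpus as an approximate negative log-likelihood, and combines that with class priors and a cost matrix in a standard minimum-expected-cost decision rule. The function names (`compressed_bits`, `conditional_bits`, `classify`) and the `costs[true][pred]` layout are hypothetical choices made for this example.

```python
import math
import zlib


def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib-compressed data (zlib is an assumed stand-in)."""
    return 8 * len(zlib.compress(data, 9))


def conditional_bits(sample: bytes, corpus: bytes) -> int:
    """Approximate code length of `sample` given `corpus`: C(corpus + sample) - C(corpus)."""
    return compressed_bits(corpus + sample) - compressed_bits(corpus)


def classify(sample, corpora, priors, costs):
    """Return (label, posterior) using a minimum-expected-cost rule.

    corpora: {label: bytes} training text per class
    priors:  {label: float} class prior probabilities, all > 0
    costs:   {true_label: {predicted_label: float}} misclassification costs
    """
    # Treat 2^(-bits) as an unnormalized likelihood and add the log prior.
    log_post = {
        label: -conditional_bits(sample, corpus) + math.log2(priors[label])
        for label, corpus in corpora.items()
    }
    # Normalize in a numerically stable way (subtract the maximum first).
    top = max(log_post.values())
    weights = {label: 2.0 ** (value - top) for label, value in log_post.items()}
    total = sum(weights.values())
    posterior = {label: w / total for label, w in weights.items()}
    # Expected cost of each possible prediction under the posterior.
    risk = {
        pred: sum(posterior[true] * costs[true][pred] for true in posterior)
        for pred in posterior
    }
    return min(risk, key=risk.get), posterior
```

Under these assumptions, class imbalance enters only through `priors` and asymmetric error costs only through `costs`, so the compressor itself can be swapped out without changing the decision rule.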
Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program
DOE Contract Number:
NA0003525
OSTI ID:
2430240
Report Number(s):
SAND--2023-10771R
Country of Publication:
United States
Language:
English
