U.S. Department of Energy
Office of Scientific and Technical Information

MalGen: Malware Generation with Specific Behaviors to Improve Machine Learning-based Detectors

Technical Report · DOI: https://doi.org/10.2172/1893244 · OSTI ID: 1893244
In recent years, infections and damage caused by malware have increased at exponential rates. At the same time, machine learning (ML) techniques have shown tremendous promise in many domains, often outperforming human efforts by learning from large amounts of data. Results in the open literature suggest that ML can provide similar results for malware detection, achieving greater than 99% classification accuracy [49]. However, the same detection rates have not been achieved in deployed settings. Malware is distinct from many other domains in which ML has shown success in that (1) it purposefully tries to hide, leading to noisy labels, and (2) its behavior is often similar to that of benign software, differing only in intent, among other complicating factors. This report details the reasons why novel malware is difficult to detect with ML methods and offers solutions to improve its detection.
Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States); Sandia National Laboratories (SNL-CA), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
NA0003525
OSTI ID:
1893244
Report Number(s):
SAND2022-14321; 710967
Country of Publication:
United States
Language:
English

Similar Records

Tensor Text-Mining Methods for Malware Identification and Detection, Malware Dynamics Characterization, and Hosts Ranking
Technical Report · Mon Oct 11 00:00:00 EDT 2021 · OSTI ID:1826495

Malware analysis and recovery
Patent · Mon Feb 22 23:00:00 EST 2021 · OSTI ID:1805551

Deep PDF parsing to extract features for detecting embedded malware.
Technical Report · Thu Sep 01 00:00:00 EDT 2011 · OSTI ID:1030303

Related Subjects