OSTI.GOV
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Establishing Malware Attribution and Binary Provenance Using Multicompilation Techniques

Abstract

Malware is a serious problem for computer systems: it costs businesses and customers billions of dollars a year and compromises their private information. Detecting malware is particularly difficult because the same malware source code can be compiled in many different ways, each producing a different digital signature, which causes problems for most anti-malware programs that rely on static signature detection. Our project uses a convolutional neural network to identify malware programs, but such networks require large amounts of training data to be effective. To that end, we gather thousands of source code files from publicly available programming contest sites and compile them with several different compilers and flags. Building on current research, we then transform the resulting binaries into image representations and use them to train a long-term recurrent convolutional neural network that will eventually be used to identify how a malware binary was compiled. This information will include the compiler, the compiler version, and the compilation options, all of which can be critical in determining where a malware program came from and even who authored it.
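
The data-preparation steps named in the abstract (compiling each source file under several compiler and flag combinations, then turning each binary into an image) might be sketched as follows. This is only an illustrative sketch, not the report's implementation: the compiler list, the flag list, the output naming scheme, the image width of 256, and the function names `build_compile_commands` and `bytes_to_image` are all assumptions, since the abstract does not specify them.

```python
from itertools import product
from typing import List, Sequence

# Hypothetical compiler and flag sets; the abstract does not say which were used.
COMPILERS = ["gcc", "clang"]
OPT_FLAGS = ["-O0", "-O1", "-O2", "-O3"]


def build_compile_commands(
    source: str,
    compilers: Sequence[str] = COMPILERS,
    flags: Sequence[str] = OPT_FLAGS,
) -> List[List[str]]:
    """Return one compile command per (compiler, flag) pair for one source file."""
    stem = source.rsplit(".", 1)[0]
    commands = []
    for compiler, flag in product(compilers, flags):
        # The output name records the compiler and flag, which become the label.
        output = f"{stem}.{compiler}{flag}.bin"
        commands.append([compiler, flag, source, "-o", output])
    return commands


def bytes_to_image(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay raw bytes out as rows of `width` grayscale pixels (0-255).

    The last row is zero-padded so every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]
```

Each command list could be passed to `subprocess.run`, and each resulting binary read with `open(path, "rb").read()` before calling `bytes_to_image`. The (compiler, flag) pair that produced a binary would then serve as the training label for its image.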

Authors:
Ramshaw, M. J. [1]
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Publication Date:
2017-07-28
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1390004
Report Number(s):
LLNL-TR-737549
DOE Contract Number:  
AC52-07NA27344
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Ramshaw, M. J. Establishing Malware Attribution and Binary Provenance Using Multicompilation Techniques. United States: N. p., 2017. Web. doi:10.2172/1390004.
Ramshaw, M. J. (2017). Establishing Malware Attribution and Binary Provenance Using Multicompilation Techniques. United States. doi:10.2172/1390004.
Ramshaw, M. J. 2017. "Establishing Malware Attribution and Binary Provenance Using Multicompilation Techniques". United States. doi:10.2172/1390004. https://www.osti.gov/servlets/purl/1390004.
@techreport{osti_1390004,
title = {Establishing Malware Attribution and Binary Provenance Using Multicompilation Techniques},
author = {Ramshaw, M. J.},
abstractNote = {Malware is a serious problem for computer systems: it costs businesses and customers billions of dollars a year and compromises their private information. Detecting malware is particularly difficult because the same malware source code can be compiled in many different ways, each producing a different digital signature, which causes problems for most anti-malware programs that rely on static signature detection. Our project uses a convolutional neural network to identify malware programs, but such networks require large amounts of training data to be effective. To that end, we gather thousands of source code files from publicly available programming contest sites and compile them with several different compilers and flags. Building on current research, we then transform the resulting binaries into image representations and use them to train a long-term recurrent convolutional neural network that will eventually be used to identify how a malware binary was compiled. This information will include the compiler, the compiler version, and the compilation options, all of which can be critical in determining where a malware program came from and even who authored it.},
doi = {10.2172/1390004},
institution = {Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)},
number = {LLNL-TR-737549},
place = {United States},
year = {2017},
month = {7}
}
