OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Biosequence-based Approach to Software Characterization

Abstract

For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.

Authors:
Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.; Curtis, Darren S.
Publication Date:
2016-08-04
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1339045
Report Number(s):
PNNL-SA-115860
453040300
DOE Contract Number:
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: IEEE Security and Privacy Workshops (SPW 2016), May 22-26, 2016, San Jose, California, 118-125
Country of Publication:
United States
Language:
English
Subject:
software analysis; sequence analysis; cyber security

Citation Formats

Oehmen, Christopher S., Peterson, Elena S., Phillips, Aaron R., and Curtis, Darren S. A Biosequence-based Approach to Software Characterization. United States: N. p., 2016. Web. doi:10.1109/SPW.2016.43.
Oehmen, Christopher S., Peterson, Elena S., Phillips, Aaron R., & Curtis, Darren S. (2016). A Biosequence-based Approach to Software Characterization. United States. doi:10.1109/SPW.2016.43.
Oehmen, Christopher S., Peterson, Elena S., Phillips, Aaron R., and Curtis, Darren S. 2016. "A Biosequence-based Approach to Software Characterization". United States. doi:10.1109/SPW.2016.43.
@article{osti_1339045,
title = {A Biosequence-based Approach to Software Characterization},
author = {Oehmen, Christopher S. and Peterson, Elena S. and Phillips, Aaron R. and Curtis, Darren S.},
abstractNote = {For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.},
doi = {10.1109/SPW.2016.43},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2016},
month = {8}
}


  • Traditional software development consists of many knowledge intensive and intellectual activities related to understanding a problem to be solved and designing a solution to that problem. These activities are informal, subjective, and undocumented and are the same for original development and subsequent support. Since 1982, the USAF Rome Laboratory has been developing the Knowledge-Based Software Assistant (KBSA), a revolutionary new paradigm for software development that will achieve orders of magnitude improvement in productivity and quality. KBSA does not pursue the improvement of traditional technologies or methodologies such as new programming languages and management procedures to fulfill this objective, but has instead adopted a revolutionary new approach. KBSA is a knowledge-based, computer-mediated paradigm for the evolutionary definition, specification, development, and long-term support of software. The computer becomes an "intelligent partner" and "corporate memory" in this paradigm, formally capturing the appropriate knowledge and actively using this knowledge to provide assistance and automation. The productivity of developers will dramatically improve because of the increased assistance, automation and re-utilization of domain and programming knowledge. The quality of software, both correctness and satisfying requirements, will also improve because the development process is formal and easier to use.
  • The purpose of this paper is to demonstrate how transformation can be used to derive a high integrity implementation of a train controller from an algorithmic specification. The paper begins with a general discussion of high consequence systems (e.g., software systems) and describes how rewrite-based transformation systems can be used in the development of such systems. The authors then discuss how such transformations can be used to derive a high assurance controller for the Bay Area Rapid Transit (BART) system from an algorithmic specification.
  • Assumptions about the economics of making a system safe are usually not explicitly stated in industrial and software models of safety-critical systems. These assumptions span a wide spectrum of economic tradeoffs with respect to resources expended to make a system safe. The missing component in these models that is necessary for capturing the effect of economic tradeoffs is risk. A qualitative risk-based software safety model is proposed that combines features of industrial and software systems safety models. The risk-based model provides decision makers with a basis for performing cost-benefit analyses of software safety-related activities.
  • The NEPTUNE project constitutes the thermal-hydraulics part of a long-term joint development program for the next generation of nuclear reactor simulation tools. This project is being carried through by EDF (Electricite de France) and CEA (Commissariat a l'Energie Atomique), with the co-sponsorship of IRSN (Institut de Radioprotection et de Surete Nucleaire) and AREVA NP. NEPTUNE is a multi-phase flow software platform that includes advanced physical models and numerical methods for each simulation scale (CFD, component, system). NEPTUNE also provides new multi-scale and multi-disciplinary coupling functionalities. This new generation of two-phase flow simulation tools aims at meeting major industrial needs. DNB (Departure from Nucleate Boiling) prediction in PWRs is one of the high priority needs, and this paper focuses on its anticipated improvement by means of a so-called 'Local Predictive Approach' using the NEPTUNE CFD code. We firstly present the ambitious 'Local Predictive Approach' anticipated for a better prediction of DNB, i.e. an approach that intends to result in CHF correlations based on relevant local parameters as provided by the CFD modeling. The associated requirements for the two-phase flow modeling are underlined as well as those for the good level of performance of the NEPTUNE CFD code; hence, the code validation strategy based on different experimental data base types (including separated effect and integral-type tests data) is depicted. Secondly, we present comparisons between low pressure adiabatic bubbly flow experimental data obtained on the DEDALE experiment and the associated numerical simulation results. This study anew shows the high potential of NEPTUNE CFD code, even if, with respect to the aforementioned DNB-related aim, there is still a need for some modeling improvements involving new validation data obtained in thermal-hydraulics conditions representative of PWR ones. Finally, we deal with one of these new experimental data needs and present a scaling method for the design of the associated experimentation devoted to the analysis of the dynamics-related modeling of a bubbly flow in PWR representative conditions. (authors)