OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Mitigate: An Adaptive Network Data Anonymization Tool Using Condensation-Based Differential Privacy

Technical Report · DOI: https://doi.org/10.2172/1854575 · OSTI ID: 1854575
 [1];  [2];  [3]
  1. Anonitech LLC; Univ. of Maryland Baltimore County (UMBC), Baltimore, MD (United States)
  2. Univ. of Maryland Baltimore County (UMBC), Baltimore, MD (United States)
  3. Augusta University

Modern network devices collect large amounts of data that can be analyzed to identify bottlenecks, anomalies, cyber-attacks, and similar events. External experts and the research community therefore often need to analyze such collections of network data. However, these collections contain sensitive, proprietary information, so network data must be anonymized before it can be shared. The overall objective of this project is to develop an innovative privacy management tool that anonymizes network data while achieving sufficient privacy, acceptable data utility, and efficient data analysis at the same time. No existing anonymization method achieves all three. The core of this technology is a differentially private clustering algorithm that provides strong privacy protection, preserves data properties important for subsequent analysis, and allows the party receiving the anonymized data to analyze it directly, without decryption or any extra processing. The research designed, implemented, and verified a solution to this problem through four tasks: 1) developing the core technology; 2) developing a context-based method that automatically recommends fields that must be anonymized; 3) conducting experiments that show superior results for our approach compared to existing tools; and 4) developing an intuitive but basic user interface. The research generated novel algorithmic techniques that draw on state-of-the-art methods such as condensation, differential privacy preservation, clustering, automated tuning based on contextual awareness, and recommendation techniques that suggest to users which columns to anonymize, leading to optimal privacy while still allowing research analysis on the dataset.
Experiments evaluated the efficacy of these techniques by running analyses on the original, non-anonymized datasets, running the same analyses on the anonymized versions of those datasets, and comparing the results. Overall, the results on anonymized data were within 1% of the original results, verifying that the technology not only guarantees a high level of privacy but also enables research analysis as if it were conducted on the original dataset. Potential applications include anonymization of any type of structured network dataset that contains sensitive identifiers, such as IP addresses. Such anonymized data can be used, for example, to create an AI or machine learning model for cyber security (e.g., to detect attacks) or for performance analysis (e.g., to identify bottlenecks or predict performance). In addition, a market analysis of potential applications identified a broader range of uses beyond the network sector, including healthcare, banking, insurance, securities, finance (FISB), data brokering, cloud services, ad sales, and government.
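
The abstract names condensation and differential privacy but gives no algorithmic detail, so the following is only a rough sketch of how the two ideas can be combined on one numeric column; it is not the report's algorithm. It assumes records are grouped by sorted order into groups of at least k, that values are clipped to hypothetical bounds lo and hi, and that Laplace noise is added to each group mean. The function names, parameters, and grouping rule are all illustrative assumptions. Because the grouping step is data-dependent and not itself privatized, the sketch makes no claim of a formal end-to-end privacy guarantee.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def condense_with_noise(values, k, epsilon, lo, hi, seed=0):
    """Illustrative only: replace each value with a noisy mean of its group.

    values  : one numeric column (hypothetical example: bytes per flow)
    k       : target group size; the last group absorbs any remainder
    epsilon : privacy parameter used to scale the Laplace noise
    lo, hi  : assumed clipping bounds, so changing one record moves a
              group mean by at most (hi - lo) / group_size
    """
    rng = random.Random(seed)
    clipped = [min(max(v, lo), hi) for v in values]
    # Group by sorted order so that similar values are condensed together.
    order = sorted(range(len(clipped)), key=lambda i: clipped[i])
    n_groups = max(1, len(order) // k)
    out = [0.0] * len(clipped)
    for g in range(n_groups):
        start = g * k
        end = (g + 1) * k if g < n_groups - 1 else len(order)
        members = order[start:end]
        mean = statistics.fmean([clipped[i] for i in members])
        scale = (hi - lo) / (len(members) * epsilon)
        # Clip the noisy mean back into the assumed value range.
        noisy = min(max(mean + laplace_noise(scale, rng), lo), hi)
        for i in members:
            out[i] = noisy
    return out


def relative_error(original, anonymized):
    """Relative difference between the means of the two columns."""
    m = statistics.fmean(original)
    return abs(statistics.fmean(anonymized) - m) / abs(m)
```

A call such as `relative_error(values, condense_with_noise(values, k=10, epsilon=1.0, lo=0.0, hi=300.0))` mirrors, in miniature, the comparison of original and anonymized analysis results described above. The error it reports depends on the chosen k, epsilon, and bounds, and says nothing about the 1% figure, which comes from the report's own experiments.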

Research Organization:
Anonitech LLC
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
SC0021525
OSTI ID:
1854575
Type / Phase:
STTR (Phase I)
Report Number(s):
DOE-ANONITECH-SC0021525
Resource Relation:
Related Information: https://dl.acm.org/doi/10.1145/3425401
Country of Publication:
United States
Language:
English