OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Analysis of Community Detection Algorithms for Large Scale Cyber Networks

Abstract

This project applies existing community detection algorithms to an IP network dataset to create supernodes within the network, and compares the running time of the algorithms on that network. The paper begins with an introduction to clustering and community detection, followed by the research question the team aimed to address. It then describes the graph metrics considered in shortlisting algorithms and briefly explains each algorithm with respect to the graph metric on which it is based. The next section describes the methodology used to run the algorithms and to determine which is most efficient with respect to running time. The final section presents the results, a conclusion based on those results, and future work.

Authors:
Mane, Prachita; Shanbhag, Sunanda; Kamath, Tanmayee; Mackey, Patrick S.; Springer, John
Publication Date:
September 2016
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1339042
Report Number(s):
PNNL-SA-119853
DOE Contract Number:
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: Proceedings of the 2016 Information Security Research and Education (INSuRE) Conference (INSuRECon-16), September 30, 2016
Country of Publication:
United States
Language:
English
Subject:
Network traffic analysis; community detection; graph clustering; modularity; algorithms

Citation Formats

Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., and Springer, John. Analysis of Community Detection Algorithms for Large Scale Cyber Networks. United States: N. p., 2016. Web.
Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., & Springer, John. Analysis of Community Detection Algorithms for Large Scale Cyber Networks. United States.
Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., and Springer, John. 2016. "Analysis of Community Detection Algorithms for Large Scale Cyber Networks". United States.
@article{osti_1339042,
title = {Analysis of Community Detection Algorithms for Large Scale Cyber Networks},
author = {Mane, Prachita and Shanbhag, Sunanda and Kamath, Tanmayee and Mackey, Patrick S. and Springer, John},
place = {United States},
year = 2016,
month = 9
}

  • A class of algorithms, Hierarchical Aggregation Algorithms (HAA), is proposed for approximately solving shortest path problems in very large scale networks with reduced computational effort. Networks are first aggregated into a set of subnetworks. Higher level embedded macronetworks are then defined. The shortest paths are approximated by combining exact shortest paths in subnetworks and in higher level networks. We discuss a probabilistic error analysis and the simulation results in Manhattan-type networks. The algorithm is furthermore implemented on a real-world network, the southeastern Michigan network. The numerical results from variations of the algorithm are compared.
  • We present parallel genetic algorithms (GAs) for several classes of fixed-charge multicommodity flow problems arising from applications in parallel database design, domain decomposition, and telecommunications. These algorithms utilize a high-level approach based upon representing an individual (in the GA sense) in terms of selections from a library of pre-computed "building blocks" of sets of variables rather than as values of individual binary variables corresponding to single links. The fitness function for this form of representation is then evaluated by applying heuristics to the starting point represented by an individual, thereby allowing for modifications in the original "blueprint" represented by the individual. These heuristics lead to objective function improvements and are also used to force feasibility. With this type of fitness function, the amount of time spent on the other operations of the GA (selection, mutation, etc.) is relatively small, so that high efficiency may be achieved in parallel implementations of the algorithm. We present computational results on the CM-5 supercomputer, demonstrating the ability to solve to optimality certain fixed-charge problems with more than one million binary variables.
  • In a recent paper, Lowry (1981) described an architecture for a computer vision rectangular processor array that is suitable for VLSI implementation. In this paper the authors review that architecture, discuss extensions to it and present results of an array simulator applied to vision algorithms. They also present an algorithm for re-routing an array with bad processors into a working subset of the array, making it feasible to implement a large array on one wafer-sized chip. 7 references.
  • The authors present a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. They validate the model on three important supercomputer systems, on up to 500 processors. They use data from a deterministic particle transport application taken from the ASCI workload, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. They also use the validated model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be in existence within the next decade as part of the ASCI program and elsewhere. On such machines the analysis shows that, contrary to conventional wisdom, interprocessor communication performance is not the bottleneck. Single-node efficiency is the dominant factor.
  • The authors introduced a performance model for parallel, multidimensional, wavefront calculations with machine performance characterized using the LogGP framework. The model accounts for overlap in the communication and computation components. The agreement with experimental data is very good under a variety of model sizes, data partitionings, blocking strategies, and on three different parallel architectures. Using the model, the authors analyzed performance of a deterministic transport code on a hypothetical 100 Tflops future parallel system of interest to ASCI.