Analysis of Community Detection Algorithms for Large Scale Cyber Networks
Abstract
The aim of this project is to apply existing community detection algorithms to an IP network dataset in order to create supernodes within the network. The study compares the algorithms' performance on that network in terms of running time. The paper begins with an introduction to clustering and community detection, followed by the research question the team set out to address. It then describes the graph metrics considered when shortlisting algorithms and briefly explains each algorithm with respect to the metric on which it is based. The next section describes the methodology used to run the algorithms and to determine which is most efficient with respect to running time. The final section presents the results, the conclusions drawn from them, and future work.
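The workflow the abstract outlines (detect communities, collapse each one into a supernode, and record the running time) can be sketched as follows. This is a minimal illustration only, not the authors' code: the abstract does not name the algorithms that were compared, so a simple label propagation variant stands in for them, and the function names, the toy IP addresses, and the deterministic tie-breaking rule are all assumptions made for the example.

```python
import time
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (source_ip, dest_ip) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def label_propagation(edges, max_rounds=100):
    """Stand-in community detector; returns {node: community_label}."""
    adj = build_adjacency(edges)
    labels = {node: node for node in adj}  # every node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            counts = Counter(labels[nbr] for nbr in adj[node])
            best = max(counts.values())
            # Assumption: ties go to the largest label so runs are
            # reproducible; common variants break ties at random.
            new_label = max(lbl for lbl, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def collapse_to_supernodes(edges, labels):
    """Merge each community into a supernode; count edges between them."""
    members = defaultdict(list)
    for node, label in sorted(labels.items()):
        members[label].append(node)
    weights = Counter()
    for u, v in edges:
        if u == v:
            continue
        cu, cv = labels[u], labels[v]
        if cu != cv:
            weights[tuple(sorted((cu, cv)))] += 1
    return dict(members), dict(weights)


def timed(algorithm, edges):
    """Run one algorithm and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = algorithm(edges)
    return result, time.perf_counter() - start


# Toy data (hypothetical): two triangles of hosts joined by one link.
edges = [
    ("10.0.0.1", "10.0.0.2"), ("10.0.0.1", "10.0.0.3"), ("10.0.0.2", "10.0.0.3"),
    ("10.0.1.1", "10.0.1.2"), ("10.0.1.1", "10.0.1.3"), ("10.0.1.2", "10.0.1.3"),
    ("10.0.0.3", "10.0.1.1"),
]
labels, seconds = timed(label_propagation, edges)
supernodes, super_edges = collapse_to_supernodes(edges, labels)
print(supernodes)
print(super_edges)
print(f"running time: {seconds:.6f} s")
```

To compare several algorithms on running time, the same `timed` helper could be called once per candidate function on the same edge list. On a large network, repeated runs would likely be needed before the timings are meaningful.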
 Authors:
 Mane, Prachita; Shanbhag, Sunanda; Kamath, Tanmayee; Mackey, Patrick S.; Springer, John
 Publication Date:
 September 2016
 Research Org.:
 Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
 Sponsoring Org.:
 USDOE
 OSTI Identifier:
 1339042
 Report Number(s):
 PNNL-SA-119853
 DOE Contract Number:
 AC05-76RL01830
 Resource Type:
 Conference
 Resource Relation:
 Conference: Proceedings of the 2016 Information Security Research and Education (INSuRE) Conference (INSuRECon16), September 30, 2016
 Country of Publication:
 United States
 Language:
 English
 Subject:
 Network traffic analysis; community detection; graph clustering; modularity; algorithms
Citation Formats
Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., and Springer, John. Analysis of Community Detection Algorithms for Large Scale Cyber Networks. United States: N. p., 2016. Web.
Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., & Springer, John. Analysis of Community Detection Algorithms for Large Scale Cyber Networks. United States.
Mane, Prachita, Shanbhag, Sunanda, Kamath, Tanmayee, Mackey, Patrick S., and Springer, John. 2016. "Analysis of Community Detection Algorithms for Large Scale Cyber Networks". United States.
@article{osti_1339042,
title = {Analysis of Community Detection Algorithms for Large Scale Cyber Networks},
author = {Mane, Prachita and Shanbhag, Sunanda and Kamath, Tanmayee and Mackey, Patrick S. and Springer, John},
place = {United States},
year = 2016,
month = 9
}

A class of algorithms, Hierarchical Aggregation Algorithms (HAA), is proposed for approximately solving shortest-path problems in very large scale networks, with the aim of reducing computational effort. Networks are first aggregated into a set of subnetworks. Higher-level embedded macronetworks are then defined. The shortest paths are approximated by combining exact shortest paths in subnetworks and in higher-level networks. We discuss a probabilistic error analysis and simulation results in Manhattan-type networks. The algorithm is furthermore implemented on a real-world network, the southeastern Michigan network. Numerical results from variations of the algorithm are compared.

Parallel genetic algorithms for large-scale fixed charge networks
We present parallel genetic algorithms (GAs) for several classes of fixed-charge multicommodity flow problems arising from applications in parallel database design, domain decomposition, and telecommunications. These algorithms use a high-level approach that represents individuals (in the GA sense) as selections from a library of precomputed "building blocks" of sets of variables, rather than as values of individual binary variables corresponding to single links. The fitness function for this form of representation is then evaluated by applying heuristics to the starting point represented by an individual, thereby allowing for modifications in the original "blueprint" represented by the individual.
Analysis of low-level computer vision algorithms for implementation on a very large scale integrated (VLSI) processor array
In a recent paper, Lowry (1981) described an architecture for a computer vision rectangular processor array that is suitable for VLSI implementation. In this paper the authors review that architecture, discuss extensions to it, and present results of an array simulator applied to vision algorithms. They also present an algorithm for rerouting an array with bad processors into a working subset of the array, making it feasible to implement a large array on one wafer-sized chip. 7 references.
Performance analysis of wavefront algorithms on very-large scale distributed systems
The authors present a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. They validate the model on three important supercomputer systems, on up to 500 processors. They use data from a deterministic particle transport application taken from the ASCI workload, although the model is general to any wavefront algorithm implemented on a 2D processor domain. They also use the validated model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be …
Performance analysis of large-scale applications based on wavefront algorithms
The authors introduced a performance model for parallel, multidimensional, wavefront calculations with machine performance characterized using the LogGP framework. The model accounts for overlap in the communication and computation components. The agreement with experimental data is very good under a variety of model sizes, data partitionings, blocking strategies, and on three different parallel architectures. Using the model, the authors analyzed performance of a deterministic transport code on a hypothetical 100 Tflops future parallel system of interest to ASCI.