
Title: Deep Packet/Flow Analysis using GPUs for High-Bandwidth Networks


Deep packet inspection (DPI) is widely used in content-aware network applications, such as surveillance, statistics gathering, and traffic control.

Gong, Qian [1]; Wu, W. [1]; Zhang, L. [1]; Sasidharan, S. [1]; DeMar, P. [1]
  1. Fermilab
Publication Date:
2017-04-20
Research Org.:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
Resource Relation:
Journal Name: The Greater Chicago Area Systems Research Workshop; Conference: 6th Greater Chicago Area Systems Research Workshop (GCASR), Illinois Institute of Technology, McCormick Tribune Campus Center, 04/24/2017
Country of Publication:
United States

Citation Formats

Gong, Qian, Wu, W., Zhang, L., Sasidharan, S., and DeMar, P. Deep Packet/Flow Analysis using GPUs for High-Bandwidth Networks. United States: N. p., 2017. Web.
Gong, Qian, Wu, W., Zhang, L., Sasidharan, S., & DeMar, P. (2017). Deep Packet/Flow Analysis using GPUs for High-Bandwidth Networks. United States.
Gong, Qian, Wu, W., Zhang, L., Sasidharan, S., and DeMar, P. 2017. "Deep Packet/Flow Analysis using GPUs for High-Bandwidth Networks". United States.
title = {Deep Packet/Flow Analysis using GPUs for High-Bandwidth Networks},
author = {Gong, Qian and Wu, W. and Zhang, L. and Sasidharan, S. and DeMar, P.},
abstractNote = {Deep packet inspection (DPI) is widely used in content-aware network applications, such as surveillance, statistics gathering, and traffic control.},
journal = {The Greater Chicago Area Systems Research Workshop},
place = {United States},
year = {2017},
month = {4}


  • Deep packet inspection (DPI) faces severe performance challenges in high-speed networks (40/100 GE) because it requires a large amount of raw computing power and high I/O throughput. Recently, researchers have begun using GPUs to address these issues and boost DPI performance. DPI applications typically involve highly complex operations at both the per-packet and per-flow levels, often in real time. The parallel architecture of GPUs fits per-packet network traffic processing exceptionally well. However, for stateful network protocols such as TCP, the data stream needs to be reconstructed at the per-flow level to deliver a consistent content analysis. Since flow-centric operations are naturally anti-parallel and often require large memory space for buffering out-of-sequence packets, they can be problematic for GPUs, whose memory is normally limited to several gigabytes. In this work, we present a highly efficient GPU-based deep packet/flow analysis framework. The proposed design includes flow tracking and TCP stream reassembly implemented purely on the GPU. Instead of buffering TCP packets and waiting for them to arrive in sequence, our framework processes packets in batches and uses a deterministic finite automaton (DFA) with a prefix-/suffix-tree method to detect patterns across out-of-sequence packets that happen to be located in different batches. Evaluation shows that our code can reassemble and forward tens of millions of packets per second and conduct stateful signature-based deep packet inspection at 55 Gbit/s using an NVIDIA K40 GPU.
  • With the increasing number of geographically distributed scientific collaborations and the growth in data size, it has become more challenging for users to achieve the best possible network performance on a shared network. We have developed a forecast model to predict expected bandwidth utilization for high-bandwidth wide area networks. The forecast model can improve the efficiency of resource utilization and of scheduling data movements on high-bandwidth networks to accommodate the ever-increasing data volume of large-scale scientific data applications. A univariate model is developed with STL and ARIMA on SNMP path utilization data. Compared with a traditional approach such as the Box-Jenkins methodology, our forecast model reduces computation time by 83.2%. It also shows resilience against abrupt changes in network usage. The accuracy of the forecast model is within the standard deviation of the monitored measurements.
  • This paper presents a methodology for analyzing and designing a computer network for application to complex control systems. The focus is on the analysis and design of a local area network (LAN) for realizing the high-level control network that interconnects input-output controllers with devices for monitoring and analysis and with high-level controllers such as supervisory PLCs. Part of the development given in this paper can also be applied to the device-level network (fieldbus) that interconnects input-output controllers with sensors, actuators, and other devices in the system being controlled. The high-level network and the device-level network form a two-layer architecture that is typical in control applications. A procedure is given for generating a network design with a hierarchical hub topology having full redundancy. Then, in terms of a graph model of the network, procedures are given for studying network availability and analyzing the information flow rates through the links and internal nodes of the network.
  • This paper presents the design and implementation of an underlay communication channel (UCC) for 5G cognitive mesh networks. The UCC builds its waveform based on filter bank multicarrier spread spectrum (FB-MCSS) signaling. The use of this novel spread spectrum signaling allows device-to-device (D2D) user equipments (UEs) to communicate at a level well below the noise temperature and hence minimize taxation on macro-cell/small-cell base stations and their UEs in 5G wireless systems. Moreover, the use of filter banks allows us to avoid those portions of the spectrum that are in use by macro-cell and small-cell users. Hence, both D2D-to-cellular and cellular-to-D2D interference will be very close to none. We propose a specific packet for the UCC and develop algorithms for packet detection, timing acquisition and tracking, as well as channel estimation and equalization. We also present the details of an implementation of the proposed transceiver on a software radio platform and compare our experimental results with those from a theoretical analysis of our packet detection algorithm.
  • The paper describes a multicast switch architecture for multi-service networks that supports multi-destination packet delivery at high data transfer rates (approximately 150 mb/sec for full-motion video) and allows large aggregate data carrying capacity (approximately 1000 mb/sec). The switch architecture is made extensible by adopting a network-oriented design in which the switch functions are cast in terms of the requirements of a canonical network model for packet multicasting. The requirements are routing and priority-based scheduling of packets from the input to the output link(s) of each multicast channel segment supported by a switch. Packet routing is efficiently implementable in hardware by maintaining the information about all channel segments supported by the switch in a fast associative store. Our architecture yields high switching efficiency by using high-speed link processors, a distributed associative store, and parallel execution of routing and scheduling activities. The paper describes various functional elements of the switch architecture, and identifies the performance boundaries of switch realization on high-speed processor and communication components.
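
The first abstract in the list above mentions processing TCP packets in batches and using a DFA to detect patterns that span packets located in different batches. The following is a minimal CPU-side sketch in Python of the simplest part of that idea: keeping one DFA state per flow so that a signature split across two packets in two batches is still found. It is an illustration under stated assumptions, not the authors' GPU implementation. It assumes packets of a flow arrive in order, so it does not reproduce the prefix-/suffix-tree method for out-of-sequence packets, and the names build_dfa, scan, and process_batches are hypothetical.

```python
def build_dfa(pattern: bytes):
    """Build a KMP-style DFA for one non-empty pattern.

    dfa[state] maps a byte value to the next state; any byte that is
    missing from the map leads back to state 0. State len(pattern)
    means the pattern has just been matched.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    alphabet = set(pattern)
    dfa = [dict() for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state reached after reading pattern[1:j]
    for j in range(1, m + 1):
        for c in alphabet:
            # On a mismatch, behave as the restart state would.
            dfa[j][c] = dfa[restart].get(c, 0)
        if j < m:
            dfa[j][pattern[j]] = j + 1  # transition on a matching byte
            restart = dfa[restart].get(pattern[j], 0)
    return dfa


def scan(dfa, accept, data: bytes, state: int = 0, base: int = 0):
    """Run the DFA over one payload, starting from a saved state.

    Returns the final state and the stream offsets (just past the last
    matched byte) at which the pattern was found.
    """
    ends = []
    for i, b in enumerate(data):
        state = dfa[state].get(b, 0)
        if state == accept:
            ends.append(base + i + 1)
    return state, ends


def process_batches(pattern: bytes, batches):
    """Scan batches of (flow_key, payload) pairs with per-flow DFA state.

    Assumption: payloads of each flow arrive in sequence order.
    """
    dfa = build_dfa(pattern)
    flows = {}  # flow_key -> (saved DFA state, bytes seen so far)
    hits = []
    for batch in batches:
        for key, payload in batch:
            state, seen = flows.get(key, (0, 0))
            state, ends = scan(dfa, len(pattern), payload, state, seen)
            flows[key] = (state, seen + len(payload))
            hits.extend((key, end) for end in ends)
    return hits


# Usage: the signature "attack" is split across two batches in flow "A".
batches = [
    [("A", b"xxatt"), ("B", b"ack")],
    [("A", b"ackyy"), ("B", b"att")],
]
hits = process_batches(b"attack", batches)
print(hits)
```

Because only a small integer state and a byte counter are saved per flow, nothing has to be buffered between batches in this in-order case. Handling packets that arrive out of sequence needs extra bookkeeping, which is the part the abstract attributes to the prefix-/suffix-tree method.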