OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: High performance data analytics on a commodity private cloud

Authors:
 Beggio, Christopher [1]
  1. Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Publication Date:
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1222658
Report Number(s):
SAND-2015-8351R
607082
DOE Contract Number:
AC04-94AL85000
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English

Citation Formats

Beggio, Christopher. High performance data analytics on a commodity private cloud. United States: N. p., 2015. Web. doi:10.2172/1222658.
Beggio, Christopher. High performance data analytics on a commodity private cloud. United States. doi:10.2172/1222658.
Beggio, Christopher. 2015. "High performance data analytics on a commodity private cloud". United States. doi:10.2172/1222658. https://www.osti.gov/servlets/purl/1222658.
@article{osti_1222658,
title = {High performance data analytics on a commodity private cloud},
author = {Beggio, Christopher},
abstractNote = {},
doi = {10.2172/1222658},
journal = {},
number = {},
volume = {},
place = {United States},
year = 2015,
month = 9
}

  • For a more detailed look, Netpipe was used to produce signature graphs of the NDIS services over the Giganet VIA and Packet Engines Gigabit Ethernet hardware. Two Hamachi cards, second-generation Gigabit Ethernet NICs, were installed in place of the first-generation cards. Figure 23 shows that Gigabit Ethernet (GigE) delivered significantly less bandwidth, although the theoretical line speeds were equal. This means that buffering and device tuning would be necessary for such a gateway to function effectively. To examine the TCP/IP transport stack delay, the Giganet VIA latency and the Netpipe TCP stream tests were used. The following assumptions were made during the analysis: (1) The latency introduced by address translation between VIA and TCP/IP is assumed to be negligible compared to the TCP/IP processing. For a real gateway, the lookup would be done via a hash table and set up a priori. (2) Transferring the data from the VIA to the TCP/IP stack would be done through a DMA copy from the VIA NIC to user memory, followed by a memory copy into the TCP/IP stack. The DMA transfer time is assumed to be negligible compared to the memory copy. The latency of the memory copy is included as part of the TCP/IP processing time. The VIA latency test provides a baseline for the inbound VIA. Both the Giganet and the Netpipe applications used the same hardware setup, so the time difference between the two provides insight into the latency added by the TCP/IP processing. The graph shows the time difference between polling and non-polling VIA latency versus the TCP/IP processing.
  • A variety of VIA network models was developed in this project. For model validation, small network models were constructed using the network model editor GUI. These small models allowed node model functions to be tested and debugged. As the node models neared completion, small networks of actual VIA hardware were constructed in the lab, and the same networks were constructed in OPNET. Simulation results for these networks were compared against the performance metrics measured in the laboratory to determine how closely the simulation model agreed with the actual network results. Once validation was completed, the EMA feature was used to construct a large VIA network. This network represented a cluster of 1024 nodes. In this cluster, 4 nodes were connected to each switch, and the switches were arranged in a 16x16, 2-D torus configuration. Therefore, each switch had one connection to each of its four neighbors. Furthermore, the routing tables of each switch were populated so that each of its nodes communicated with its corresponding node four hops away.
  • Query-driven visualization and analytics is a unique approach for high-performance visualization that offers new capabilities for knowledge discovery and hypothesis testing. The new capabilities akin to finding needles in haystacks are the result of combining technologies from the fields of scientific visualization and scientific data management. This approach is crucial for rapid data analysis and visualization in the petascale regime. This article describes how query-driven visualization is applied to a hero-sized network traffic analysis problem.
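
The 1024-node cluster layout in the VIA network model summary above (4 nodes per switch, switches in a 16x16 2-D torus) can be sanity-checked with a few lines of Python. This is only a minimal sketch under stated assumptions: the function names and the (row, column) switch coordinates are hypothetical and are not taken from the OPNET model, and the hop count is the plain minimum torus distance rather than the routing-table logic of the original work.

```python
# Hypothetical sketch of a 16x16 2-D torus of switches with 4 nodes each.
# Names and the (row, col) coordinate scheme are illustrative assumptions.

SIDE = 16              # switches per torus dimension
NODES_PER_SWITCH = 4   # end nodes attached to each switch


def total_nodes(side=SIDE, nodes_per_switch=NODES_PER_SWITCH):
    """Number of end nodes in the whole cluster."""
    return side * side * nodes_per_switch


def switch_neighbors(row, col, side=SIDE):
    """Four neighbors of switch (row, col), wrapping around at the edges."""
    return [
        ((row - 1) % side, col),  # up
        ((row + 1) % side, col),  # down
        (row, (col - 1) % side),  # left
        (row, (col + 1) % side),  # right
    ]


def torus_hops(a, b, side=SIDE):
    """Minimum number of switch-to-switch hops between switches a and b."""
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    # In each dimension, take the shorter way around the ring.
    return min(d_row, side - d_row) + min(d_col, side - d_col)


if __name__ == "__main__":
    print(total_nodes())                # 16 * 16 * 4
    print(switch_neighbors(0, 0))       # wraparound neighbors of a corner switch
    print(torus_hops((0, 0), (0, 4)))   # a switch four hops away
```

The wraparound in `switch_neighbors` is what gives every switch exactly four neighbors, including those on the edge of the grid, which matches the statement that each switch had one connection to each of its four neighbors.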