OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: SDN-NGenIA, a Software Defined Next Generation Integrated Architecture for HEP and Data Intensive Science

Technical Report
DOI: https://doi.org/10.2172/1593835 · OSTI ID: 1593835
 [1];  [1];  [1];  [1]
  1. California Institute of Technology (CalTech), Pasadena, CA (United States)

This project aims to enable the LHC and other leading programs in high energy physics and other global science domains funded by the DOE to operate with a new level of efficiency and control, through the development of a Next Generation Integrated Architecture (NGenIA) based on intelligent software defined network (SDN)-driven network systems coupled to high throughput applications. While the initial focus is on the challenging LHC use case, the systems and products being developed are general, and apply to many fields of data intensive science ranging from astrophysical sky surveys to bioinformatics and earth observation, as well as to other organizations facing the challenges of extracting knowledge from distributed multi-petabyte data stores. A central concept in this development program is a new paradigm of ‘consistent network operations’ among widely distributed computing and storage facilities, in which stable high throughput flows, at set rates, cross load-balanced network paths, up to flexible high water marks that are adjusted in real time to accommodate other network traffic. The large smooth flows are launched and managed by SDN services that act in concert with the experiments’ site-resident data distribution and management systems, to meet the expanding needs of the science programs. The technical goals include the construction of autonomous, intelligent site-resident services that interact dynamically with network-resident services, and with the science programs’ principal data distribution and management tools, to request or command network resources in support of high throughput petascale to exascale workflows.

Specific work items include:
1. Developing compact Data Transfer Nodes (DTNs) with auto-tuning functions that support data transfer rates in the 100 Gbps to 1 Tbps range when used with high throughput data transfer applications
2. Deep site orchestration among virtualized clusters, storage subsystems and subnets to successfully coschedule CPU, storage and network resources
3. Science-program designed site architectures, operational modes, and policy and resource usage priorities, adjudicated across multiple network domains and virtual organizations, using the orchestration functions and methods
4. Seamlessly extending end-to-end operation across both extra-site and intra-site boundaries through the use of Open vSwitch (OVS), FireQoS and next generation Science DMZs
5. Novel methods of system integration that enable granular control of extreme scale long distance transfers through flow matching of scattered source-destination address pairs to multi-domain dynamic circuits
6. Funneling massive sets of streams to DTNs at the site edge hosting petascale buffer pools configured for flows of 100 Gbps and up, exploiting state of the art data transfers
7. Adaptive scheduling based on pervasive end-to-end monitoring, including DTN or compute-node resident agents providing comprehensive end-system profiling
8. Developing unsupervised and supervised machine learning and modeling methods to optimize the workflow involving terabyte to multi-petabyte datasets
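
As an illustration of the ‘consistent network operations’ concept described in the abstract, the following minimal Python sketch shows one way set rates could be assigned to managed flows beneath a high water mark that is recomputed as other traffic changes. It is not taken from the report: the names `Flow`, `high_water_mark` and `allocate_rates`, the 10% headroom, and the choice of max-min fair sharing are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """A managed science flow (hypothetical type, for illustration only)."""
    name: str
    requested_gbps: float


def high_water_mark(link_capacity_gbps, other_traffic_gbps, headroom_fraction=0.1):
    """Capacity left for managed flows after other traffic and a safety headroom.

    The headroom fraction is an assumed policy knob, not a value from the report.
    """
    reserved = other_traffic_gbps + headroom_fraction * link_capacity_gbps
    return max(0.0, link_capacity_gbps - reserved)


def allocate_rates(flows, mark_gbps):
    """Max-min fair allocation: no flow exceeds its request, total stays under the mark."""
    allocation = {}
    remaining = mark_gbps
    # Serve the smallest requests first so unused share is passed to larger flows.
    pending = sorted(flows, key=lambda f: f.requested_gbps)
    while pending:
        fair_share = remaining / len(pending)
        flow = pending.pop(0)
        rate = min(flow.requested_gbps, fair_share)
        allocation[flow.name] = rate
        remaining -= rate
    return allocation


if __name__ == "__main__":
    flows = [Flow("a", 10.0), Flow("b", 40.0), Flow("c", 60.0)]
    # Re-evaluate the mark as the measured level of other traffic changes.
    for other_traffic in (20.0, 60.0):
        mark = high_water_mark(100.0, other_traffic)
        print(other_traffic, mark, allocate_rates(flows, mark))
```

In this sketch the mark drops when other traffic rises, and the per-flow set rates shrink with it; a real deployment would presumably derive the mark from monitoring data and enforce the rates in the network and on the end hosts.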

Research Organization:
California Institute of Technology (CalTech), Pasadena, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Contributing Organization:
ESnet; CENIC; Pacific Wave; AmLight; SCinet; Ciena; USC; StarLight; iCAIR; USC/ISI; Internet2; Northeastern; Yale Univ.; Colorado State Univ.; Fermi National Accelerator Lab.; CERN; TIFR (Mumbai); Univ. of California, Los Angeles; KISTI; Lawrence Berkeley National Lab.; Michigan State Univ.; RNP; UNESP; REUNA; SURFnet
DOE Contract Number:
SC0015527
OSTI ID:
1593835
Report Number(s):
FINAL-report-DOE-Caltech-11111-4; TRN: US2102574
Country of Publication:
United States
Language:
English