DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure

Abstract

The ever-growing volume of big data has led scientists to share and transfer simulation and analytical data across geo-distributed research and computing facilities. However, existing data transfer frameworks used for data sharing lack the capability to adapt to the attributes of the underlying parallel file systems (PFS). LADS (Layout-Aware Data Scheduling) is an end-to-end data transfer tool optimized for terabit networks that uses layout-aware data scheduling via the PFS. However, it does not consider the NUMA (Non-Uniform Memory Access) architecture. In this paper, we propose NUMA-aware thread and resource scheduling for optimized data transfer in terabit networks. First, we propose distributed RMA buffers to reduce memory controller contention in CPU sockets; we then schedule threads based on the CPU socket and the NUMA nodes inside each CPU socket to reduce memory access latency. We design and implement the proposed resource and thread scheduling in the existing LADS framework. Experimental results show a 21.7% to 44% improvement with memory-level optimizations in the LADS framework compared to the baseline without any optimization.

Authors:
Kim, Taeuk [1]; Khan, Awais [1]; Kim, Youngjae [1]; Kasu, Preethika [2]; Atchley, Scott [3]
  1. Sogang Univ., Seoul (Korea)
  2. Ajou Univ., Suwon (Korea)
  3. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
2018-05-07
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1565699
Resource Type:
Accepted Manuscript
Journal Name:
Scientific Programming
Additional Journal Information:
Journal Volume: 2018; Journal ID: ISSN 1058-9244
Publisher:
Hindawi
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Computer Science

Citation Formats

Kim, Taeuk, Khan, Awais, Kim, Youngjae, Kasu, Preethika, and Atchley, Scott. NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure. United States: N. p., 2018. Web. doi:10.1155/2018/4120561.
Kim, Taeuk, Khan, Awais, Kim, Youngjae, Kasu, Preethika, & Atchley, Scott. NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure. United States. https://doi.org/10.1155/2018/4120561
Kim, Taeuk, Khan, Awais, Kim, Youngjae, Kasu, Preethika, and Atchley, Scott. 2018. "NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure". United States. https://doi.org/10.1155/2018/4120561. https://www.osti.gov/servlets/purl/1565699.
@article{osti_1565699,
title = {NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure},
author = {Kim, Taeuk and Khan, Awais and Kim, Youngjae and Kasu, Preethika and Atchley, Scott},
doi = {10.1155/2018/4120561},
journal = {Scientific Programming},
volume = {2018},
place = {United States},
year = {2018},
month = {5}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science

Works referenced in this record:

Analysis of NUMA effects in modern multicore systems for the design of high-performance data transfer applications
journal, September 2017


RAMSYS: Resource-Aware Asynchronous Data Transfer with Multicore SYStems
journal, May 2017

  • Li, Tan; Ren, Yufei; Yu, Dantong
  • IEEE Transactions on Parallel and Distributed Systems, Vol. 28, Issue 5
  • DOI: 10.1109/TPDS.2016.2619344

Globus Toolkit Version 4: Software for Service-Oriented Systems
journal, July 2006