OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Sorting large files on a backend multiprocessor

Journal Article · IEEE Trans. Comput.; (United States)
DOI: https://doi.org/10.1109/12.2222 · OSTI ID: 7177748

A fundamental measure of processing power in a database management system is the performance of the sort utility it provides. When sorting a large data file on a serial computer, performance is limited by factors involving processor speed, memory capacity, and I/O bandwidth. In this paper, the authors investigate the feasibility and efficiency of a parallel sort-merge algorithm through implementation on the JASMIN prototype, a backend multiprocessor built around a fast packet bus. The authors describe the design and implementation of a parallel sort utility. They then present and analyze the results of measurements corresponding to a range of file sizes and processor configurations. Their results show that using current, off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective sort of large files. The three-processor configuration sorts a 100 Mbyte file in 1 h, which compares well to commercial sort packages available on high-performance mainframes. In additional experiments, the authors investigate a model to tune their sort software and scale their results to higher processor and network capabilities.
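
The abstract names a parallel sort-merge algorithm without giving its details. As a rough illustration of the general technique only, and not the JASMIN implementation, the sketch below splits the input into one run per worker, sorts the runs concurrently, and then merges the sorted runs. The function names, the default of three workers, and the use of threads as a stand-in for separate processors are all assumptions made for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_runs(records, n_runs):
    """Split records into n_runs contiguous slices of nearly equal length."""
    size, extra = divmod(len(records), n_runs)
    runs, start = [], 0
    for i in range(n_runs):
        # The first `extra` runs take one additional record each.
        end = start + size + (1 if i < extra else 0)
        runs.append(records[start:end])
        start = end
    return runs


def parallel_sort_merge(records, n_workers=3):
    """Sort phase: each worker sorts one run. Merge phase: k-way merge.

    Hypothetical helper for illustration; threads stand in for the
    separate processors of a backend multiprocessor.
    """
    runs = split_into_runs(list(records), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_runs))


if __name__ == "__main__":
    print(parallel_sort_merge([5, 3, 9, 1, 7, 2, 8]))
```

A file too large for memory would need the runs written to disk and merged as streams; this sketch keeps everything in memory to show only the sort and merge phases.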

Research Organization:
Dept. of Computer Science, Cornell Univ., Ithaca, NY (US)
OSTI ID:
7177748
Journal Information:
IEEE Trans. Comput.; (United States), Vol. 37:7
Country of Publication:
United States
Language:
English