Sorting large files on a backend multiprocessor
A fundamental measure of processing power in a database management system is the performance of the sort utility it provides. When sorting a large data file on a serial computer, performance is limited by processor speed, memory capacity, and I/O bandwidth. In this paper, the authors investigate the feasibility and efficiency of a parallel sort-merge algorithm by implementing it on the JASMIN prototype, a backend multiprocessor built around a fast packet bus. The authors describe the design and implementation of a parallel sort utility, then present and analyze measurements for a range of file sizes and processor configurations. Their results show that, using current off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective way to sort large files. The three-processor configuration sorts a 100 Mbyte file in 1 h, which compares well to commercial sort packages available on high-performance mainframes. In additional experiments, the authors investigate a model to tune their sort software and to scale their results to higher processor and network capabilities.
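The general shape of a parallel sort-merge can be sketched as follows. This is a minimal illustration only, not the JASMIN implementation: the function names `split_into_runs` and `parallel_sort_merge` are hypothetical, worker threads stand in for the separate processors, and the data is held in memory rather than read from disk.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_runs(records, num_workers):
    """Partition the input into at most num_workers contiguous chunks."""
    chunk = -(-len(records) // num_workers)  # ceiling division
    return [records[i:i + chunk] for i in range(0, len(records), chunk)]


def parallel_sort_merge(records, num_workers=3):
    """Sort each chunk concurrently, then k-way merge the sorted runs.

    Assumption: num_workers >= 1. Threads are used only as a stand-in
    for the independent processors of a backend multiprocessor.
    """
    if not records:
        return []
    runs = split_into_runs(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_runs))
```

In a real system each run would be sorted on its own processor and the merge phase would stream runs over the interconnect, so the balance between sort time, merge time, and transfer cost is what a tuning model would need to capture.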
- Research Organization: Dept. of Computer Science, Cornell Univ., Ithaca, NY (US)
- OSTI ID: 7177748
- Journal Information: IEEE Trans. Comput.; (United States), Vol. 37:7
- Country of Publication: United States
- Language: English
Similar Records
Beyond striping; The Bridge multiprocessor file system
Hector; A hierarchically structured shared-memory multiprocessor