Summary: On Sorting Strings in External Memory
(Extended Abstract)
Lars Arge, Paolo Ferragina, Roberto Grossi, Jeffrey Scott Vitter
Abstract. In this paper we address for the first time the I/O
complexity of the problem of sorting strings in external memory,
which is a fundamental component of many large-scale
text applications. In the standard unit-cost RAM comparison
model, the complexity of sorting K strings of total length N
is $\Theta(K \log_2 K + N)$. By analogy, in the external memory (or
I/O) model, where the internal memory has size M and the
block transfer size is B, it would be natural to guess that the
I/O complexity of sorting strings is $\Theta\left(\frac{K}{B} \log_{M/B} \frac{K}{B} + \frac{N}{B}\right)$,
but the known algorithms do not come even close to achieving
this bound. Our results show, somewhat counterintuitively,
that the I/O complexity of string sorting depends upon
the length of the strings relative to the block size. We first