 
Summary: PREPRINT. Algorithmica 23, 1999.
Suffix Trees on Words
Arne Andersson, N. Jesper Larsson, Kurt Swanson
Dept. of Computer Science, Lund University,
Box 118, S221 00 LUND, Sweden
arne,jesper,kurt@dna.lth.se
Abstract
We discuss an intrinsic generalization of the suffix tree, designed to
index a string of length n which has a natural partitioning into m
multi-character substrings or words. This word suffix tree represents only
the m suffixes that start at word boundaries. These boundaries are determined
by delimiters, whose definition depends on the application.
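As an illustration of which suffixes a word suffix tree represents, the following minimal Python sketch lists the m word-boundary suffixes of a string. It is not the construction algorithm of this paper: it builds no tree and materializes every suffix, so it does not respect the O(m) space bound. The function names and the boundary rule (a word begins at a non-delimiter character that is at index 0 or follows a delimiter) are assumptions made for the example.

```python
def word_starts(text, delimiters=" "):
    """Return the indices at which a word begins.

    Assumption (not from the paper): a word begins at a non-delimiter
    character that is either at index 0 or preceded by a delimiter.
    """
    starts = []
    prev_is_delim = True  # treat the start of the text as a boundary
    for i, ch in enumerate(text):
        is_delim = ch in delimiters
        if prev_is_delim and not is_delim:
            starts.append(i)
        prev_is_delim = is_delim
    return starts


def word_suffixes(text, delimiters=" "):
    """Return the m suffixes of text that start at word boundaries."""
    return [text[i:] for i in word_starts(text, delimiters)]


if __name__ == "__main__":
    text = "to be or not"
    print(len(text), word_starts(text))  # n and the m word-start positions
    for suffix in word_suffixes(text):
        print(repr(suffix))
```

For the string "to be or not" with a space as the only delimiter, n = 12 and m = 4; an ordinary suffix tree would index all 12 suffixes, whereas the word suffix tree indexes only the 4 listed here.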
Since traditional suffix tree construction algorithms rely heavily on the
fact that all suffixes are inserted, construction of a word suffix tree is
nontrivial, in particular when only O(m) construction space is allowed. We
solve this problem, presenting an algorithm with O(n) expected running
time. In general, construction cost is Ω(n) due to the need to scan
the entire input. In applications that require strict node ordering, an
additional cost of sorting O(m') characters arises, where m' is the number
