Summary: PREPRINT. Software: Practice and Experience, 25(2):129-141, 1995.
Efficient Implementation of Suffix Trees
Arne Andersson Stefan Nilsson
Department of Computer Science, Lund University,
Box 118, S-221 00 Lund, Sweden
We study the problem of string searching using the traditional approach of
storing all unique substrings of the text in a suffix tree. The methods of path
compression, level compression, and data compression are combined to build
a simple, compact, and efficient implementation of a suffix tree. Based on
a comparative discussion and extensive experiments, we argue that our new
data structure is superior to previous methods in many practical situations.
Keywords: LC-trie, path compression, level compression, data compression,
suffix tree, suffix array.
Locating substrings in text documents is a commonly occurring task. For large
documents, such as books, dictionaries, encyclopedias, and DNA sequences, the
choice of data structure and algorithms will have a noticeable effect on computer
capacity requirements and response times. If the document is small or if speed
is of no concern, we can answer a query by scanning the entire text. However, if