Summary: Reference-Based Alignment in Large Sequence Databases
Panagiotis Papapetrou (1), Vassilis Athitsos (2), George Kollios (1), and Dimitrios Gunopulos (3,4)
(1) Computer Science Department, Boston University
(2) Computer Science and Engineering Department, University of Texas at Arlington
(3) Department of Informatics and Telecommunications, University of Athens
(4) Computer Science and Engineering Department, UC Riverside
ABSTRACT
This paper introduces a novel method, called Reference-Based String
Alignment (RBSA), that speeds up retrieval of optimal subsequence
matches in large databases of sequences under the edit distance and
the Smith-Waterman similarity measure. RBSA operates under the
assumption that the optimal match deviates by a relatively small
amount from the query, an amount that does not exceed a prespec-