Summary: Alignment of Low Information Sequences
David R. Powell, Lloyd Allison, Trevor I. Dix, David L. Dowe
Department of Computer Science, Monash University, Clayton,
Vic. 3168, Australia
e-mail: fpowell, lloyd, trevor, firstname.lastname@example.org
Abstract. Alignment of two random sequences over a fixed alphabet
can be shown to be optimally done by a Dynamic Programming Algorithm
(DPA). It is normally assumed that the sequences are random
and incompressible and that one sequence is a mutation of the other.
However, DNA and many other sequences are not always random and
unstructured, and the question arises of how best to align compressible
sequences.
Assuming our sequences to be non-random and to emanate from mutations
of a first order Markov model, we note that alignment of high
information regions is more important than alignment of low information
regions and arrive at a new alignment method for low information
sequences which performs better than the standard DPA for data generated
from mutations of a first order Markov model.
Keywords: Sequence Alignment, DNA, Biology, Information Theory.
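
The notion of high and low information regions used in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not the alignment method of the paper: it fits a first order Markov model to a sequence and reports the information content of each character as -log2 P(x_i | x_(i-1)). The function name markov_information, the additive smoothing (pseudocount), the uniform cost for the first character, and the toy sequence are all assumptions made for this example.

```python
import math
from collections import defaultdict


def markov_information(seq, alphabet="ACGT", pseudocount=1.0):
    """Per-character information (in bits) of seq under a first order
    Markov model fitted to seq itself.

    Assumptions for this sketch: additive smoothing with `pseudocount`,
    and a uniform distribution for the first character.
    """
    # Count transitions prev -> cur over the whole sequence.
    counts = defaultdict(lambda: defaultdict(float))
    for prev, cur in zip(seq, seq[1:]):
        counts[prev][cur] += 1.0

    k = len(alphabet)
    info = [math.log2(k)]  # first character: log2(|alphabet|) bits
    for prev, cur in zip(seq, seq[1:]):
        total = sum(counts[prev].values()) + pseudocount * k
        p = (counts[prev][cur] + pseudocount) / total
        info.append(-math.log2(p))
    return info


def mean(values):
    return sum(values) / len(values)


# Toy sequence: a repetitive (compressible) region followed by a varied one.
seq = "AT" * 20 + "GCCAGTTCGACGGTCAAGCT"
info = markov_information(seq)

low = mean(info[1:40])   # characters inside the AT repeat
high = mean(info[40:])   # characters in the varied tail

print(f"mean bits per character, repetitive region: {low:.2f}")
print(f"mean bits per character, varied region:     {high:.2f}")
```

Under these assumptions the repetitive region costs fewer bits per character than the varied region, which is the sense in which it is a low information region. A model fitted to the sequence it then scores is used here only for brevity; other estimation schemes are equally possible.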