Summary: Alignment Statistics for Long-Range Correlated Genomic Sequences
Philipp W. Messer¹, Ralf Bundschuh², Martin Vingron¹, and Peter F. Arndt¹
¹ Max Planck Institute for Molecular Genetics, Ihnestr. 73, 14195 Berlin, Germany
² Department of Physics, Ohio State University, 191 W Woodruff Av., Columbus OH 43210-1117, USA
Abstract. It is well known that the base composition along eukaryotic
genomes is long-range correlated. Here, we investigate the effect of such
long-range correlations on alignment score statistics. We model the
correlated score-landscape by means of a Gaussian approximation. In this
framework, we can calculate the corrections to the scale parameter of
the extreme value distribution of alignment scores. To evaluate our
approximate analytic results, we perform a detailed numerical study based
on a simple algorithm to efficiently generate long-range correlated
random sequences. We find that the mean and the exponential tail of the