Summary: Colored de Bruijn Graphs and
the Genome Halving Problem
Max A. Alekseyev and Pavel A. Pevzner
Abstract--Breakpoint graph analysis is a key algorithmic technique in studies of genome rearrangements. However, breakpoint
graphs are defined only for genomes without duplicated genes, thus limiting their applications in rearrangement analysis. We discuss a
connection between the breakpoint graphs and de Bruijn graphs that leads to a generalization of the notion of breakpoint graph for
genomes with duplicated genes. We further use the generalized breakpoint graphs to study the Genome Halving Problem (first
introduced and solved by Nadia El-Mabrouk and David Sankoff). The El-Mabrouk-Sankoff algorithm is rather complex, and, in this
paper, we present an alternative approach that is based on generalized breakpoint graphs. The generalized breakpoint graphs make
the El-Mabrouk-Sankoff result more transparent and promise to be useful in future studies of genome rearrangements.
Index Terms--Genome duplication, genome halving, genome rearrangement, reversal, breakpoint graph, de Bruijn graph.
1 INTRODUCTION
The Genome Halving Problem is motivated by the whole
genome duplication events in molecular evolution [18],
[25], [20], [17], [9]. These dramatic evolutionary events
double the gene content of a genome R and result in a
perfect duplicated genome R ⊕ R that contains two identical
copies of each chromosome. The genome then becomes
subject to rearrangements that shuffle the genes in R ⊕ R