Automated whole-genome multiple alignment of rat, mouse, and human
We have built a whole-genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline that combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1 or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human, and 97% of all alignments with human sequence >100 kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level, <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment, and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.
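As an illustration of the nucleotide-level specificity figure, the following is a minimal sketch of how a "mapped to multiple places" fraction might be computed from alignment intervals. It is not the authors' method: the function name `multiply_mapped_fraction`, the `(contig, start, end)` tuple layout, and the assumption that every interval corresponds to a distinct human location are all hypothetical choices made for this example.

```python
from collections import defaultdict


def multiply_mapped_fraction(blocks):
    """Fraction of aligned nucleotides covered by two or more alignment blocks.

    blocks: iterable of (contig, start, end) half-open intervals on the rat
    sequence, one per alignment to human (hypothetical input layout).
    Assumes each block maps to a distinct human location.
    """
    # Collect +1/-1 coverage events per contig for a sweep-line pass.
    events = defaultdict(list)
    for contig, start, end in blocks:
        if end <= start:
            continue  # skip empty or inverted intervals
        events[contig].append((start, 1))
        events[contig].append((end, -1))

    aligned = 0   # positions covered by at least one block
    multiple = 0  # positions covered by at least two blocks
    for contig_events in events.values():
        contig_events.sort()
        depth = 0
        prev = None
        for pos, delta in contig_events:
            if prev is not None and pos > prev:
                span = pos - prev
                if depth >= 1:
                    aligned += span
                if depth >= 2:
                    multiple += span
            depth += delta
            prev = pos

    return multiple / aligned if aligned else 0.0


# Toy data: two blocks overlap by 10 bases on contig "c1".
toy_blocks = [("c1", 0, 100), ("c1", 90, 150), ("c2", 0, 50)]
fraction = multiply_mapped_fraction(toy_blocks)
```

On the toy data, 200 positions are aligned and 10 of them are covered twice. A real analysis would also need to check that overlapping blocks truly point to different human loci, which this sketch takes on trust.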
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Director. Office of Science; National Heart Lung and Blood Institute. Genomic Applications Grant, National Science Foundation Fellowship, National Institutes of Health Grant U1HL66729A (US)
- DOE Contract Number:
- AC03-76SF00098
- OSTI ID:
- 840036
- Report Number(s):
- LBNL-54561; R&D Project: GHPGA2; TRN: US200509%%768
- Journal Information:
- Genome Research, Vol. 14; Other Information: Submitted to Genome Research: Volume 14; Journal Publication Date: April 2004; PBD: 4 Jul 2004
- Country of Publication:
- United States
- Language:
- English
Similar Records
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments