
Title: Spliced alignment: A new approach to gene recognition

Abstract

Gene structure prediction is one of the most important problems in computational molecular biology. Previous attempts to solve this problem were based on statistics and artificial intelligence and, surprisingly enough, applications of theoretical computer science methods for gene recognition were almost unexplored. Recent advances in large-scale cDNA sequencing open a way towards a new combinatorial approach to gene recognition. This paper describes a spliced alignment algorithm and a software tool which explores all possible exon assemblies in polynomial time and finds the multi-exon structure with the best fit to a related protein. Unlike other existing methods, the algorithm successfully recognizes genes even in the case of short exons or exons with unusual codon usage; the authors also report correct assemblies for genes with more than 10 exons. On a test sample of human genes with known mammalian relatives the average correlation between the predicted and the actual genes was 99%, which is a very high accuracy as compared with other existing methods. The algorithm correctly reconstructed 87% of genes and the rare discrepancies between the predicted and real exon-intron structures were caused by either (i) extremely short (less than 5 amino acids) initial or terminal exons, or (ii) alternative splicing, or (iii) errors in database feature tables. 38 refs., 3 tabs.

Authors:
 Gelfand, M S [1]; Mironov, A A [2]; Pevzner, P A [3]
  1. Inst. of Protein Research, Puschino (Russian Federation)
  2. NIIGENETIKA, Moscow (Russian Federation)
  3. Univ. of California, Los Angeles, CA (United States)
Publication Date:
OSTI Identifier:
495270
Report Number(s):
CONF-960679-
TRN: 97:000617-0002
Resource Type:
Conference
Resource Relation:
Conference: 7. symposium on combinatorial pattern matching, Laguna Beach, CA (United States), 10-12 Jun 1996; Other Information: PBD: 1996; Related Information: Is Part Of Combinatorial pattern matching; Hirschberg, D.; Myers, G. [eds.]; PB: 393 p.
Country of Publication:
United States
Language:
English
Subject:
55 BIOLOGY AND MEDICINE, BASIC STUDIES; HUMAN CHROMOSOMES; PATTERN RECOGNITION; GENETIC MAPPING; COMPUTER CALCULATIONS; CODONS; EXONS

Citation Formats

Gelfand, M S, Mironov, A A, and Pevzner, P A. Spliced alignment: A new approach to gene recognition. United States: N. p., 1996. Web.
Gelfand, M S, Mironov, A A, & Pevzner, P A. Spliced alignment: A new approach to gene recognition. United States.
Gelfand, M S, Mironov, A A, and Pevzner, P A. Tue . "Spliced alignment: A new approach to gene recognition". United States.
@article{osti_495270,
title = {Spliced alignment: A new approach to gene recognition},
author = {Gelfand, M S and Mironov, A A and Pevzner, P A},
abstractNote = {Gene structure prediction is one of the most important problems in computational molecular biology. Previous attempts to solve this problem were based on statistics and artificial intelligence and, surprisingly enough, applications of theoretical computer science methods for gene recognition were almost unexplored. Recent advances in large-scale cDNA sequencing open a way towards a new combinatorial approach to gene recognition. This paper describes a spliced alignment algorithm and a software tool which explores all possible exon assemblies in polynomial time and finds the multi-exon structure with the best fit to a related protein. Unlike other existing methods, the algorithm successfully recognizes genes even in the case of short exons or exons with unusual codon usage; the authors also report correct assemblies for genes with more than 10 exons. On a test sample of human genes with known mammalian relatives the average correlation between the predicted and the actual genes was 99%, which is a very high accuracy as compared with other existing methods. The algorithm correctly reconstructed 87% of genes and the rare discrepancies between the predicted and real exon-intron structures were caused by either (i) extremely short (less than 5 amino acids) initial or terminal exons, or (ii) alternative splicing, or (iii) errors in database feature tables. 38 refs., 3 tabs.},
doi = {},
url = {https://www.osti.gov/biblio/495270},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1996},
month = {12}
}
