U.S. Department of Energy
Office of Scientific and Technical Information

Combinatorial methods for gene recognition

Technical Report ·
DOI:https://doi.org/10.2172/764709· OSTI ID:764709
 [1]
  1. Department of Computer Science, The Pennsylvania State University, University Park, PA 16802, USA
The major result of the project is the development of a new approach to gene recognition called the spliced alignment algorithm. The authors developed the algorithm and implemented a software tool (for both IBM PC and UNIX platforms) that explores all possible exon assemblies in polynomial time and finds the multi-exon structure with the best fit to a related protein. Unlike other existing methods, the algorithm successfully assembles exons even when the exons are short or have unusual codon usage; correct assemblies are also reported for genes with more than 10 exons, provided a homologous protein is already known. On a test sample of human genes with known mammalian relatives, the average overlap between the predicted and the actual genes was 99%, which compares remarkably well with other existing methods. Moreover, the algorithm reconstructed 87% of the genes exactly. The rare discrepancies between the predicted and real exon-intron structures were restricted either to extremely short initial or terminal exons or proved to be results of alternative splicing. The algorithm also performs reasonably well with non-vertebrate and even prokaryote targets. The spliced alignment software PROCRUSTES has been in extensive use by the academic community since its announcement in August 1996 via the WWW server (www-hto.usc.edu/software/procrustes) and by biotech companies via the in-house UNIX version.
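The idea of exploring all exon assemblies in polynomial time can be illustrated with a small dynamic-programming sketch. This is not the PROCRUSTES implementation. It assumes, for simplicity, that the target is a nucleotide string rather than a protein, that candidate exons ("blocks") are given as half-open intervals, and that scoring is a plain match/mismatch/gap scheme. The function name `spliced_alignment` and all parameters are hypothetical.

```python
def spliced_alignment(genomic, target, blocks, match=1, mismatch=-1, gap=-1):
    """Return (score, chain) for the best chain of non-overlapping blocks.

    Simplifying assumptions (not taken from the report): the target is a
    nucleotide string, blocks are non-empty half-open (start, end) intervals
    into `genomic`, and each chosen block is used in full.
    """
    n = len(target)
    # Process blocks by end position so every possible predecessor is finished.
    order = sorted(range(len(blocks)), key=lambda b: blocks[b][1])
    final = {}              # block index -> last DP row of (score, chain) cells
    best = (n * gap, ())    # empty chain: the whole target aligned to gaps

    for b in order:
        start, end = blocks[b]
        # Row before any base of block b is consumed: either nothing precedes
        # it, or some finished block ending at or before `start` does.
        row = [(j * gap, ()) for j in range(n + 1)]
        for p, prev_row in final.items():
            if blocks[p][1] <= start:
                row = [max(r, q) for r, q in zip(row, prev_row)]
        row = [(score, chain + (b,)) for score, chain in row]

        # Standard global-alignment recurrence over the bases of block b.
        for base in genomic[start:end]:
            new = [(row[0][0] + gap, row[0][1])]
            for j in range(1, n + 1):
                sub = match if base == target[j - 1] else mismatch
                new.append(max(
                    (row[j - 1][0] + sub, row[j - 1][1]),  # align base to target[j-1]
                    (row[j][0] + gap, row[j][1]),          # base aligned to a gap
                    (new[j - 1][0] + gap, new[j - 1][1]),  # target[j-1] aligned to a gap
                ))
            row = new

        final[b] = row
        best = max(best, row[n])

    return best


# Toy example: two exons separated by an intron, plus two decoy blocks.
genomic = "ATGGCA" + "GTAAGTCCAG" + "TTCGAA"
target = "ATGGCATTCGAA"
blocks = [(0, 6), (3, 9), (16, 22), (10, 22)]
score, chain = spliced_alignment(genomic, target, blocks)
```

Each block contributes one alignment table against the target, and each block's starting row is taken from the best compatible predecessor, so the work grows roughly with (total block length + number of blocks squared) times the target length rather than with the number of possible chains. Here `chain` holds the indices of the selected blocks in order.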
Research Organization:
Department of Computer Science, The Pennsylvania State Univ., University Park, PA 16802 (US)
Sponsoring Organization:
USDOE Office of Energy Research (ER) (US)
DOE Contract Number:
FG02-94ER61919
OSTI ID:
764709
Country of Publication:
United States
Language:
English

Similar Records

Spliced alignment: A new approach to gene recognition
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID:495270

Las Vegas algorithms for gene recognition: Suboptimal and error-tolerant spliced alignment
Conference · Sun Nov 30 23:00:00 EST 1997 · OSTI ID:549029

nGASP - the nematode genome annotation assessment project
Journal Article · Thu Dec 18 23:00:00 EST 2008 · vol. 9, December 1, 2008, pp. 549 · OSTI ID:950645