OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Gene recognition and assembly in the GRAIL system: Progress and challenges

Abstract

GRAIL is a comprehensive system being constructed to analyze and characterize the genetic structure of DNA sequences. A number of program modules supply information to the system, including the Coding Recognition Module (CRM), which forms the basis of the current e-mail GRAIL server system. Additional modules determine the positions and scores of possible splice junctions, the positions of potential translation-initiation sites, the coding strand for each gene, and the probable-translation-frame function over the sequence. The Gene Assembly Program module (GAP) attempts to predict the sequence of the spliced mRNA for a gene from the genomic DNA sequence. It constructs and scores gene models, given a DNA sequence and the outputs of the other GRAIL modules for the sequence. GAP tests combinations of those splice junctions that are within an acceptable distance from the initial predicted edges of the coding regions. Every complete gene model, comprising a translation-initiation site, splice junctions, and a stop codon, that agrees with GAP's set of rules is scored, and the ten highest-scoring models are saved. Each gene model's score depends on the input scores of the splice junctions used in the model, their positions relative to the initial predicted edges of the included coding regions, and the degree of agreement of the entire model with the probable-translation-frame function. If error conditions are detected, the present version of GAP attempts to correct them by the insertion and/or deletion of one or more coding regions. These insertions and deletions have resulted in a net improvement of gene models, and a particularly large improvement in the recognition and characterization of very short coding regions.
The results of GRAIL, including the GAP module, for 26 sequences from GenBank, each with an experimentally characterized gene, are quite promising and demonstrate the feasibility of constructing largely accurate gene models strictly on the basis of DNA sequence data.
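
The enumerate-score-keep-the-best loop described in the abstract can be illustrated with a small Python sketch. This is not the GAP implementation. The names (`Site`, `nearby`, `assemble`), the distance window, and the linear distance penalty are all assumptions made for illustration. The first region start and last region end are held fixed in place of a real translation-initiation site and stop codon, and a plain "coding length divisible by 3" check stands in for the probable-translation-frame function.

```python
import heapq
from itertools import product
from typing import NamedTuple


class Site(NamedTuple):
    pos: int      # position in the genomic sequence
    score: float  # score from a (hypothetical) splice-junction module


def nearby(sites, edge, max_dist):
    """Return the sites lying within max_dist of a predicted region edge."""
    return [s for s in sites if abs(s.pos - edge) <= max_dist]


def assemble(regions, donors, acceptors, max_dist=10, keep=10, dist_penalty=0.05):
    """Enumerate, score, and rank toy gene models.

    regions: [(start, end), ...] predicted coding-region edges, in order,
    treated as half-open intervals. All parameters are illustrative.
    """
    # For each intron, list the (donor, acceptor) pairs close to the
    # predicted end of one region and the predicted start of the next.
    choices = []
    for (_, end), (next_start, _) in zip(regions, regions[1:]):
        choices.append([
            (d, a, end, next_start)
            for d in nearby(donors, end, max_dist)
            for a in nearby(acceptors, next_start, max_dist)
            if d.pos < a.pos
        ])

    models = []
    for combo in product(*choices):
        exons, score, start = [], 0.0, regions[0][0]
        for d, a, end, next_start in combo:
            exons.append((start, d.pos))
            start = a.pos
            # Reward junction scores, penalize drift from predicted edges.
            score += d.score + a.score
            score -= dist_penalty * (abs(d.pos - end) + abs(a.pos - next_start))
        exons.append((start, regions[-1][1]))
        if any(s >= e for s, e in exons):
            continue  # reject empty or inverted exons
        if sum(e - s for s, e in exons) % 3 != 0:
            continue  # crude stand-in for frame agreement
        models.append((score, exons))

    # Keep only the highest-scoring models (ten by default).
    return heapq.nlargest(keep, models, key=lambda m: m[0])


# Example with made-up inputs: two predicted regions, one intron.
regions = [(0, 30), (50, 80)]
donors = [Site(30, 0.9), Site(33, 0.5), Site(31, 0.8)]
acceptors = [Site(50, 0.7), Site(53, 0.6)]
for score, exons in assemble(regions, donors, acceptors):
    print(round(score, 2), exons)
```

Exhaustive enumeration is only practical here because each intron has a handful of candidates. The abstract does not say how GAP bounds the search beyond the distance window, so the sketch makes no claim about that.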

Authors:
Uberbacher, E C; Einstein, J R; Guan, X; Mural, R J
Publication Date:
Research Org.:
Oak Ridge National Lab., TN (United States)
Sponsoring Org.:
USDOE; USDOE, Washington, DC (United States)
OSTI Identifier:
7263173
Report Number(s):
CONF-9206273-1
ON: DE92040709
DOE Contract Number:  
AC05-84OR21400
Resource Type:
Conference
Resource Relation:
Conference: 2. international conference on bioinformatics, supercomputing, and complex genome analysis, St. Petersburg, FL (United States), 4-7 Jun 1992
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; DNA SEQUENCING; PATTERN RECOGNITION; GENES; MATHEMATICAL MODELS; CODONS; GENE OPERONS; RNA PROCESSING; STRUCTURAL CHEMICAL ANALYSIS; 550400* - Genetics; 550200 - Biochemistry

Citation Formats

Uberbacher, E C, Einstein, J R, Guan, X, and Mural, R J. Gene recognition and assembly in the GRAIL system: Progress and challenges. United States: N. p., 1992. Web.
Uberbacher, E C, Einstein, J R, Guan, X, & Mural, R J. Gene recognition and assembly in the GRAIL system: Progress and challenges. United States.
Uberbacher, E C, Einstein, J R, Guan, X, and Mural, R J. 1992. "Gene recognition and assembly in the GRAIL system: Progress and challenges". United States.
@article{osti_7263173,
title = {Gene recognition and assembly in the GRAIL system: Progress and challenges},
author = {Uberbacher, E C and Einstein, J R and Guan, X and Mural, R J},
abstractNote = {GRAIL is a comprehensive system being constructed to analyze and characterize the genetic structure of DNA sequences. A number of program modules supply information to the system, including the Coding Recognition Module (CRM), which forms the basis of the current e-mail GRAIL server system. Additional modules determine the positions and scores of possible splice junctions, the positions of potential translation-initiation sites, the coding strand for each gene, and the probable-translation-frame function over the sequence. The Gene Assembly Program module (GAP) attempts to predict the sequence of the spliced mRNA for a gene from the genomic DNA sequence. It constructs and scores gene models, given a DNA sequence and the outputs of the other GRAIL modules for the sequence. GAP tests combinations of those splice junctions that are within an acceptable distance from the initial predicted edges of the coding regions. Every complete gene model, comprising a translation-initiation site, splice junctions, and a stop codon, that agrees with GAP's set of rules is scored, and the ten highest-scoring models are saved. Each gene model's score depends on the input scores of the splice junctions used in the model, their positions relative to the initial predicted edges of the included coding regions, and the degree of agreement of the entire model with the probable-translation-frame function. If error conditions are detected, the present version of GAP attempts to correct them by the insertion and/or deletion of one or more coding regions. These insertions and deletions have resulted in a net improvement of gene models, and a particularly large improvement in the recognition and characterization of very short coding regions.
The results of GRAIL, including the GAP module, for 26 sequences from GenBank, each with an experimentally characterized gene, are quite promising and demonstrate the feasibility of constructing largely accurate gene models strictly on the basis of DNA sequence data.},
url = {https://www.osti.gov/biblio/7263173},
place = {United States},
year = {1992},
month = {1}
}

