Title: Computer-based construction of gene models using the GRAIL Gene Assembly Program

Technical Report · DOI: https://doi.org/10.2172/7160076 · OSTI ID: 7160076

The Gene Assembly Program (GAP), a module of GRAIL, assembles and scores gene models, given a DNA sequence and the outputs of other GRAIL modules for that sequence. Those modules determine the positions of coding regions, the positions and scores of possible splice junctions, the positions of possible translation-initiation sites, the coding strand for the gene, and the probable-translation-frame function over the sequence. GAP tests combinations of the splice junctions that lie within acceptable distances of the initial estimated edges of the coding regions. Every complete gene model (a translation-initiation site, splice junctions, and a stop codon) that satisfies GAP's set of rules is scored, and the ten highest-scoring models are saved. Each gene-model score depends on the input scores of the splice junctions used in the model, their positions relative to the initial estimated edges of the included coding regions, and the degree of agreement of the entire model with the probable-translation-frame function. If error conditions are detected, the present version of GAP attempts to correct them by inserting and/or deleting one or more coding regions. These insertions and deletions have resulted in a net improvement of gene models, and a particularly large improvement in the recognition and characterization of very short coding regions. The results of GRAIL, including the GAP module, for 26 sequences from GenBank, each containing a single biochemically characterized gene, are quite promising and demonstrate the feasibility of constructing largely accurate gene models strictly on the basis of sequence data.
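The enumerate-score-and-keep-the-best idea described in the abstract can be illustrated with a minimal Python sketch. This is not GAP's actual algorithm or scoring function, neither of which is given here. All names (`Junction`, `CodingRegion`, `best_models`), the linear distance penalty, and the default parameter values are assumptions made for illustration. The sketch also omits the translation-initiation site, the stop codon, the rule checks, and agreement with the probable-translation-frame function. It treats `left_sites` and `right_sites` as generic stand-ins for the candidate boundaries on either side of each coding region.

```python
from dataclasses import dataclass
from heapq import nlargest
from itertools import product


@dataclass(frozen=True)
class Junction:
    position: int  # coordinate in the DNA sequence
    score: float   # score supplied by an upstream module (hypothetical scale)


@dataclass(frozen=True)
class CodingRegion:
    start: int  # initial estimated left edge
    end: int    # initial estimated right edge


def candidates(junctions, edge, max_distance):
    """Return the junctions lying within max_distance of an estimated edge."""
    return [j for j in junctions if abs(j.position - edge) <= max_distance]


def score_model(regions, boundaries, distance_penalty):
    """Toy score: sum of junction scores minus a linear penalty on the
    distance of each junction from the estimated region edge (an assumption)."""
    total = 0.0
    for region, (left, right) in zip(regions, boundaries):
        total += left.score + right.score
        shift = abs(left.position - region.start) + abs(right.position - region.end)
        total -= distance_penalty * shift
    return total


def best_models(regions, left_sites, right_sites,
                max_distance=10, distance_penalty=0.01, keep=10):
    """Enumerate every combination of admissible boundaries and return the
    `keep` highest-scoring models as (score, boundaries) pairs."""
    per_region = []
    for region in regions:
        lefts = candidates(left_sites, region.start, max_distance)
        rights = candidates(right_sites, region.end, max_distance)
        # A region needs its left boundary strictly before its right boundary.
        per_region.append([(l, r) for l in lefts for r in rights
                           if l.position < r.position])
    scored = ((score_model(regions, model, distance_penalty), model)
              for model in product(*per_region))
    return nlargest(keep, scored, key=lambda item: item[0])
```

Calling `best_models` with a list of `CodingRegion` objects and two lists of `Junction` candidates returns at most `keep` models, best first. If any region has no admissible boundary pair, the result is empty. That is the kind of error condition the abstract says GAP tries to repair by inserting or deleting coding regions, a step this sketch does not attempt.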

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE, Washington, DC (United States)
DOE Contract Number:
AC05-84OR21400
OSTI ID:
7160076
Report Number(s):
ORNL/TM-12174; ON: DE93000563
Country of Publication:
United States
Language:
English