Gene recognition and assembly in the GRAIL system: Progress and challenges

Uberbacher, E C; Einstein, J R; Guan, X; Mural, R J

Conference · Thu Oct 01 04:00:00 EDT 1992

OSTI ID:10177796

GRAIL is a comprehensive system being constructed to analyze and characterize the genetic structure of DNA sequences. A number of program modules supply information to the system, including the Coding Recognition Module (CRM), which forms the basis of the current e-mail GRAIL server system. Additional modules determine the positions and scores of possible splice junctions, the positions of potential translation-initiation sites, the coding strand for each gene, and the probable-translation-frame function over the sequence. The Gene Assembly Program module (GAP) attempts to predict the sequence of the spliced mRNA for a gene from the genomic DNA sequence. It constructs and scores gene models, given a DNA sequence and the outputs of the other GRAIL modules for that sequence. GAP tests combinations of those splice junctions that lie within an acceptable distance of the initial predicted edges of the coding regions. Every complete gene model, comprising a translation-initiation site, splice junctions, and a stop codon, that agrees with GAP's set of rules is scored, and the ten highest-scoring models are saved. Each gene model's score depends on the input scores of the splice junctions used in the model, their positions relative to the initial predicted edges of the included coding regions, and the degree of agreement of the entire model with the probable-translation-frame function. If error conditions are detected, the present version of GAP attempts to correct them by inserting and/or deleting one or more coding regions. These insertions and deletions have resulted in a net improvement of gene models, and a particularly large improvement in the recognition and characterization of very short coding regions. The results of GRAIL, including the GAP module, for 26 sequences from GenBank, each with an experimentally characterized gene, are quite promising and demonstrate the feasibility of constructing largely accurate gene models strictly on the basis of DNA sequence data.
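
The enumerate-and-score step described in the abstract can be illustrated with a minimal Python sketch. This is not the GAP implementation: the names `Site`, `nearby`, and `assemble`, the linear displacement penalty, and the default parameter values are all assumptions made for illustration. The sketch treats every region boundary as a generic scored site (folding initiation sites and stop codons into the site lists) and omits the probable-translation-frame term, the rule set, and the insertion and deletion of coding regions.

```python
from dataclasses import dataclass
from heapq import nlargest
from itertools import product


@dataclass(frozen=True)
class Site:
    """A candidate boundary site with a score from an upstream module."""
    position: int
    score: float


def nearby(sites, edge, max_dist):
    """Return the sites within max_dist bases of a predicted region edge."""
    return [s for s in sites if abs(s.position - edge) <= max_dist]


def assemble(regions, acceptors, donors, max_dist=30, penalty=0.01, keep=10):
    """Enumerate simplified gene models and keep the best-scoring ones.

    regions: ordered list of (start, end) predicted coding-region edges.
    Each model picks one acceptor near every region start and one donor
    near every region end. The score is hypothetical: the sum of site
    scores minus a penalty per base of displacement from the predicted edge.
    """
    per_region = []
    for start, end in regions:
        choices = []
        for acc, don in product(nearby(acceptors, start, max_dist),
                                nearby(donors, end, max_dist)):
            if acc.position < don.position:
                shift = abs(acc.position - start) + abs(don.position - end)
                score = acc.score + don.score - penalty * shift
                choices.append((score, (acc.position, don.position)))
        per_region.append(choices)

    models = []
    for combo in product(*per_region):
        exons = [exon for _, exon in combo]
        # Reject models whose consecutive exons overlap or are out of order.
        if all(a[1] < b[0] for a, b in zip(exons, exons[1:])):
            models.append((sum(score for score, _ in combo), exons))
    return nlargest(keep, models, key=lambda m: m[0])
```

Calling `assemble([(100, 200), (300, 400)], acceptors, donors)` with lists of `Site` objects returns up to ten `(score, exons)` pairs, best first. Exhaustive enumeration grows multiplicatively with the number of regions, which is one reason to restrict candidates to a distance window around each predicted edge, as the abstract describes.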

Research Organization: Oak Ridge National Lab., TN (United States)

Sponsoring Organization: USDOE, Washington, DC (United States)

DOE Contract Number: AC05-84OR21400

Report Number(s): CONF-9206273--1; ON: DE92040709

Country of Publication: United States

Language: English

Similar Records

Gene recognition and assembly in the GRAIL system: Progress and challenges

Conference · Tue Dec 31 23:00:00 EST 1991 · OSTI ID:7263173

Gene recognition and assembly in the GRAIL system: Progress and challenges

Conference · Thu Dec 30 23:00:00 EST 1993 · OSTI ID:37553

Computer-based construction of gene models using the GRAIL Gene Assembly Program

Technical Report · Tue Sep 01 00:00:00 EDT 1992 · OSTI ID:10176476

Related Subjects

550200
550400
59 BASIC BIOLOGICAL SCIENCES
BIOCHEMISTRY
CODONS
DNA SEQUENCING
GENE OPERONS
GENES
GENETICS
MATHEMATICAL MODELS
PATTERN RECOGNITION
RNA PROCESSING
