Title: The gene identification problem: An overview for developers

Abstract

The gene identification problem is the problem of interpreting nucleotide sequences by computer, in order to provide tentative annotation on the location, structure, and functional class of protein-coding genes. This problem is of self-evident importance, and is far from being fully solved, particularly for higher eukaryotes. Thus it is not surprising that the number of algorithm and software developers working in this area is rapidly increasing. The present paper is an overview of the field, with an emphasis on eukaryotes, for such developers.

Authors:
Fickett, J W
Publication Date:
27 Mar 1995
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
National Institutes of Health, Bethesda, MD (United States)
OSTI Identifier:
64182
Report Number(s):
LA-UR-95-1163; CONF-9407180-1
Journal ID: ISSN 0097-8485; ON: DE95010887; TRN: 95:004367
DOE Contract Number:  
W-7405-ENG-36
Resource Type:
Conference
Resource Relation:
Journal Volume: 20; Journal Issue: 1; Conference: 4th International Workshop on Open Problems in Computational Biology, Telluride, CO (United States), 10-17 Jul 1994; Other Information: PBD: 27 Mar 1995
Country of Publication:
United States
Language:
English
Subject:
55 BIOLOGY AND MEDICINE, BASIC STUDIES; NUCLEOTIDES; DNA SEQUENCING; COMPUTER CALCULATIONS; GENES

Citation Formats

Fickett, J W. The gene identification problem: An overview for developers. United States: N. p., 1995. Web. doi:10.1016/S0097-8485(96)80012-X.
Fickett, J W. The gene identification problem: An overview for developers. United States. https://doi.org/10.1016/S0097-8485(96)80012-X
Fickett, J W. 1995. "The gene identification problem: An overview for developers". United States. https://doi.org/10.1016/S0097-8485(96)80012-X. https://www.osti.gov/servlets/purl/64182.
@article{osti_64182,
title = {The gene identification problem: An overview for developers},
author = {Fickett, J W},
abstractNote = {The gene identification problem is the problem of interpreting nucleotide sequences by computer, in order to provide tentative annotation on the location, structure, and functional class of protein-coding genes. This problem is of self-evident importance, and is far from being fully solved, particularly for higher eukaryotes. Thus it is not surprising that the number of algorithm and software developers working in this area is rapidly increasing. The present paper is an overview of the field, with an emphasis on eukaryotes, for such developers.},
doi = {10.1016/S0097-8485(96)80012-X},
url = {https://www.osti.gov/biblio/64182},
journal = {},
issn = {0097-8485},
number = 1,
volume = 20,
place = {United States},
year = {1995},
month = {3}
}
