OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Pattern recognition in DNA sequences: The intron-exon junction problem

Conference
OSTI ID:7064434
;  [1];  [2]
  1. Oak Ridge National Lab., TN (USA); Tennessee Univ., Oak Ridge, TN (USA). Graduate School of Biomedical Sciences
  2. Oak Ridge National Lab., TN (USA)

One of the fundamental problems facing the field of genomic sequence analysis is the difficulty of locating relatively small coding regions of DNA within the much larger non-coding regions. Neural networks, linguistic analysis and several types of expert systems have been applied to this problem with varying degrees of success. We have developed several methods for recognizing the presence of splice junctions and coding DNA, based on artificial intelligence, linguistic and statistical approaches. The triplet vocabulary in and around splice junctions has been analyzed for primates, and the occurrence of preferred triplets in potential junctions seems to be a very selective method for distinguishing true junctions from otherwise similar sequences. Given a 50% mix of true and false junctions, this method scores 93% to 95% correct. Several approaches have been used to identify exons. These include a frame bias matrix algorithm and an algorithm which estimates the fractal dimension of dinucleotide usage. Attempts are underway to combine the outputs of the various methods using a rule-based approach to improve the overall performance of these predictors. 13 refs., 4 figs.
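
The abstract does not say how triplet preferences are turned into a score, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes a log-odds score over overlapping triplets, and the function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import log


def triplets(seq):
    """Return all overlapping 3-mers in seq."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def train_log_odds(true_windows, false_windows, pseudocount=1.0):
    """Log-odds of each triplet occurring in true vs. false junction windows.

    Assumption: add-one style smoothing; the paper's weighting is not given.
    """
    true_counts = Counter(t for w in true_windows for t in triplets(w))
    false_counts = Counter(t for w in false_windows for t in triplets(w))
    vocab = set(true_counts) | set(false_counts)
    true_total = sum(true_counts.values()) + pseudocount * len(vocab)
    false_total = sum(false_counts.values()) + pseudocount * len(vocab)
    return {
        t: log((true_counts[t] + pseudocount) / true_total)
        - log((false_counts[t] + pseudocount) / false_total)
        for t in vocab
    }


def score(window, log_odds):
    """Sum the log-odds of the window's triplets; unseen triplets add 0."""
    return sum(log_odds.get(t, 0.0) for t in triplets(window))


# Toy windows, made up for illustration only (not data from the paper).
true_windows = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA"]
false_windows = ["TCCGTCCTA", "ATTGTTTCC", "GCCGTCGAT"]
log_odds = train_log_odds(true_windows, false_windows)

# A positive score favors "true junction", a negative score "false junction".
candidate_score = score("AAGGTAAGA", log_odds)
```

A candidate window would then be called a junction when its score exceeds a threshold chosen on held-out data; the 93% to 95% figure reported above refers to the authors' own method and data, not to this sketch.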

Research Organization:
Oak Ridge National Lab., TN (USA)
Sponsoring Organization:
DOE/ER
DOE Contract Number:
AC05-84OR21400
Report Number(s):
CONF-9004221-4; ON: DE90015182
Resource Relation:
Conference: 1st International Conference on Electrophoresis, Supercomputing and the Human Genome, Tallahassee, FL (USA), 10-13 Apr 1990
Country of Publication:
United States
Language:
English