Pattern recognition in DNA sequences: The intron-exon junction problem
- Oak Ridge National Lab., TN (USA) Tennessee Univ., Oak Ridge, TN (USA). Graduate School of Biomedical Sciences
- Oak Ridge National Lab., TN (USA)
One of the fundamental problems in genomic sequence analysis is locating relatively small coding regions of DNA within much larger non-coding regions. Neural networks, linguistic analysis, and various types of expert systems have been applied to this problem with varying degrees of success. We have developed several methods for recognizing splice junctions and coding DNA, based on artificial intelligence, linguistic, and statistical approaches. The triplet vocabulary in and around splice junctions has been analyzed for primates, and the occurrence of preferred triplets in potential junctions appears to be a very selective criterion for distinguishing true junctions from otherwise similar sequences. Given a 50% mix of true and false junctions, this method scores 93% to 95% correct. Several approaches have been used to identify exons, including a frame bias matrix algorithm and an algorithm that estimates the fractal dimension of dinucleotide usage. Attempts are underway to combine the outputs of the various methods using a rule-based approach to improve the overall performance of these predictors. 13 refs., 4 figs.
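The abstract does not spell out how preferred triplets are scored, so the following is only a minimal sketch of one plausible reading: count overlapping triplets in known true and false junction windows, then score a candidate window by summed log-odds. The function names, the pseudocount, the zero decision threshold, and the short example sequences are all assumptions made for illustration; they are not taken from the report, and the example is not expected to reproduce the accuracy quoted above.

```python
from collections import Counter
from math import log


def triplets(seq):
    """Return all overlapping triplets (3-mers) in seq."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def train_log_odds(true_windows, false_windows, pseudocount=1.0):
    """Build a triplet -> log-odds table from labelled junction windows.

    Positive values mark triplets seen more often in true junctions.
    The pseudocount (an assumption) avoids taking the log of zero.
    """
    true_counts = Counter(t for w in true_windows for t in triplets(w))
    false_counts = Counter(t for w in false_windows for t in triplets(w))
    vocab = set(true_counts) | set(false_counts)
    true_total = sum(true_counts.values()) + pseudocount * len(vocab)
    false_total = sum(false_counts.values()) + pseudocount * len(vocab)
    return {
        t: log((true_counts[t] + pseudocount) / true_total)
        - log((false_counts[t] + pseudocount) / false_total)
        for t in vocab
    }


def score(window, log_odds):
    """Sum the log-odds of every triplet in window; unseen triplets add 0."""
    return sum(log_odds.get(t, 0.0) for t in triplets(window))


# Made-up toy windows, for illustration only (not data from the report).
true_windows = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA"]
false_windows = ["CAGGTCCTT", "ATGGTTCCA", "TTGGTACCC"]

table = train_log_odds(true_windows, false_windows)
candidate_score = score("CAGGTGAGT", table)
is_junction = candidate_score > 0  # hypothetical threshold
```

In practice the windows would be aligned on the candidate junction and the threshold tuned on held-out sequences; a position-specific table is another option the abstract neither confirms nor rules out.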
- Research Organization:
- Oak Ridge National Lab., TN (USA)
- Sponsoring Organization:
- DOE/ER
- DOE Contract Number:
- AC05-84OR21400
- OSTI ID:
- 7064434
- Report Number(s):
- CONF-9004221-4; ON: DE90015182
- Resource Relation:
- Conference: 1st international conference on electrophoresis, supercomputing and the human genome, Tallahassee, FL (USA), 10-13 Apr 1990
- Country of Publication:
- United States
- Language:
- English
Similar Records
Intron-exon organization of the active human protein S gene PSα and its pseudogene PSβ: Duplication and silencing during primate evolution
The exon-intron organization of the human erythroid β-spectrin gene
Related Subjects
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
DNA SEQUENCING
PATTERN RECOGNITION
EXPERT SYSTEMS
ALGORITHMS
CODONS
MAN
NEURAL NETWORKS
PRIMATES
PROMOTERS
RNA
ANIMALS
MAMMALS
MATHEMATICAL LOGIC
NUCLEIC ACIDS
ORGANIC COMPOUNDS
STRUCTURAL CHEMICAL ANALYSIS
VERTEBRATES
550200* - Biochemistry
990200 - Mathematics & Computers