Gene structure prediction by linguistic methods
- Univ. of Pennsylvania School of Medicine, Philadelphia, PA (United States)
The higher-order structure of genes and other features of biological sequences can be described by means of formal grammars. These grammars can then be used by general-purpose parsers to detect and to assemble such structures by means of syntactic pattern recognition. We describe a grammar and parser for eukaryotic protein-encoding genes, which by some measures is as effective as current connectionist and combinatorial algorithms in predicting gene structures for sequence database entries. Parameters of the grammar rules are optimized for several different species, and mixing experiments are performed to determine the degree of species specificity and the relative importance of compositional, signal-based, and syntactic components in gene prediction. 24 refs., 5 figs., 3 tabs.
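As a rough illustration of the idea in the abstract, the sketch below writes a toy gene grammar and enumerates the parses it admits for a short DNA string. It is not the grammar or parser described in the record. The rules (gene -> exon (intron exon)*, introns delimited by the common GT...AG splice-site consensus, spliced exons forming an open reading frame), the names `parses`, `is_orf`, `spliced`, and the `MIN_INTRON` value are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
MIN_INTRON = 4  # arbitrary toy minimum, not a biological value

Exon = Tuple[int, int]  # half-open interval [start, end) on the sequence


def spliced(seq: str, exons: List[Exon]) -> str:
    """Concatenate the exon substrings, i.e. remove the introns."""
    return "".join(seq[a:b] for a, b in exons)


def is_orf(cds: str) -> bool:
    """True if cds is ATG ... stop with no earlier in-frame stop codon."""
    if len(cds) < 6 or len(cds) % 3 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOPS and not any(c in STOPS for c in codons[:-1])


def parses(seq: str, max_introns: int = 1) -> Iterator[List[Exon]]:
    """Enumerate exon lists satisfying the toy grammar
         gene   -> exon (intron exon)*
         intron -> 'GT' ... 'AG'
    subject to the spliced exons forming an open reading frame."""

    def extend(exons: List[Exon], start: int, introns_left: int):
        for end in range(start + 1, len(seq) + 1):
            candidate = exons + [(start, end)]
            # Rule: the gene may end here if the spliced exons form an ORF.
            if is_orf(spliced(seq, candidate)):
                yield candidate
            # Rule: an intron may begin here if a donor signal (GT) follows.
            if introns_left and seq[end:end + 2] == "GT":
                for acc in range(end + MIN_INTRON, len(seq) + 1):
                    # The intron must close with an acceptor signal (AG).
                    if seq[acc - 2:acc] == "AG":
                        yield from extend(candidate, acc, introns_left - 1)

    # A gene can start at any ATG in the sequence.
    for i in range(len(seq) - 2):
        if seq[i:i + 3] == "ATG":
            yield from extend([], i, max_introns)


# Toy input: flank + exon + intron + exon + flank (invented sequence).
toy = "CC" + "ATGGCG" + "GTAAGTTTCAG" + "TGCTAA" + "CC"
found = list(parses(toy))
for exons in found:
    print(exons, spliced(toy, exons))
```

Even this tiny input admits more than one parse, because an internal AG can also serve as an acceptor. Choosing among such candidates is where scored rules, such as the compositional and signal-based components the abstract mentions, would come in.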
- DOE Contract Number:
- FG02-92ER61371
- OSTI ID:
- 183690
- Journal Information:
- Journal Name: Genomics; Journal Volume: 23; Journal Issue: 3; ISSN 0888-7543; CODEN GNMCEP
- Country of Publication:
- United States
- Language:
- English
Similar Records
A syntactic pattern recognition system for DNA sequences
Conference · Thu Dec 30 23:00:00 EST 1993 · OSTI ID:37525
A linguistic integration of a biological database
Conference · Thu Dec 30 23:00:00 EST 1993 · OSTI ID:37555
An application of syntactic pattern recognition to seismic discrimination
Technical Report · Sat Aug 01 00:00:00 EDT 1981 · OSTI ID:5591126