Application of neural networks and information theory to the identification of E. coli transcriptional promoters
- Du Pont Merck Pharmaceutical Co., Wilmington, DE (USA). Experimental Station
- National Center for Biotechnology Information, Bethesda, MD (USA)
- Los Alamos National Lab., NM (USA)
The Human Genome Project has as its eventual goal the determination of the entire human DNA sequence, which comprises approximately 3 billion base pairs. An important aspect of this project will be the analysis of the sequence to locate regions of biological importance. New computer methods will be needed to automate and facilitate this task. In this paper, we investigated the use of neural networks for the recognition of functional patterns in biological sequences. The prediction of Escherichia coli transcriptional promoters was chosen as a model system for these studies. Two approaches were employed. In the first method, a mutual information analysis of promoter and non-promoter sequences was carried out to identify the informative base positions that help to distinguish promoter sequences from non-promoter sequences. These base positions were then used to train a Perceptron to predict new promoter sequences. In the second method, experimental knowledge of promoters was used to indicate the important base positions in the sequence. These base positions were used to train a back propagation network whose hidden units represented regions of sequence conservation found in promoters. With both types of networks, prediction accuracy for new promoter sequences was greater than 96.9%. 12 refs., 1 fig., 4 tabs.
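The first method in the abstract (rank base positions by mutual information with the promoter/non-promoter label, then train a Perceptron on the top-ranked positions) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the toy sequences, the one-hot encoding, the learning rule details, and all function names are assumptions made for the example.

```python
import math
from collections import Counter

BASES = "ACGT"

def mutual_information(seqs, labels, pos):
    """Mutual information (in bits) between the base at `pos` and the class label."""
    n = len(seqs)
    joint = Counter((s[pos], y) for s, y in zip(seqs, labels))
    base_counts = Counter(s[pos] for s in seqs)
    label_counts = Counter(labels)
    mi = 0.0
    for (base, label), count in joint.items():
        p_joint = count / n
        p_indep = (base_counts[base] / n) * (label_counts[label] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

def encode(seq, positions):
    """One-hot encode the bases found at the selected positions (assumed encoding)."""
    vec = []
    for p in positions:
        vec.extend(1.0 if seq[p] == b else 0.0 for b in BASES)
    return vec

def predict(w, b, x):
    """Threshold unit: 1 = promoter, 0 = non-promoter."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = target - predict(w, b, x)
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

# Hypothetical toy data: 1 = promoter, 0 = non-promoter (not from the paper).
seqs = ["TAGA", "TACC", "TAGG", "TACT", "GCGA", "GGCC", "GCAG", "GGTT"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Rank positions by mutual information and keep the two most informative.
ranked = sorted(range(4), key=lambda p: mutual_information(seqs, labels, p), reverse=True)
top_positions = ranked[:2]

X = [encode(s, top_positions) for s in seqs]
w, b = train_perceptron(X, labels)
```

In this toy set, positions 0 and 1 separate the two classes perfectly, so they rank highest and the Perceptron converges within two passes. Real promoter data would require held-out sequences to estimate prediction accuracy.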
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- DOE/AD
- DOE Contract Number:
- W-7405-ENG-36
- OSTI ID:
- 6036585
- Report Number(s):
- LA-UR-91-729; CONF-910295-1; ON: DE91008608
- Resource Relation:
- Conference: 8th International Conference on Mathematical and Computer Modelling, College Park, MD (USA), 14 Feb 1991
- Country of Publication:
- United States
- Language:
- English
Similar Records
Sort-Seq Approach to Engineering a Formaldehyde-Inducible Promoter for Dynamically Regulated Escherichia coli Growth on Methanol
Kinetics of the stages of transcription initiation at the Escherichia coli lac UV5 promoter
Related Subjects
59 BASIC BIOLOGICAL SCIENCES
DNA SEQUENCING
INFORMATION SYSTEMS
NEURAL NETWORKS
GENE REPRESSORS
ESCHERICHIA COLI
PROGRAMMING
PROMOTERS
TRANSCRIPTION
USES
BACTERIA
MICROORGANISMS
NUCLEOPROTEINS
ORGANIC COMPOUNDS
PROTEINS
STRUCTURAL CHEMICAL ANALYSIS
990200* - Mathematics & Computers
990300 - Information Handling
550400 - Genetics