Maximum entropy weighting of aligned sequences of proteins or DNA
Technical Report
OSTI ID:401848
- Nordita, Copenhagen (Denmark)
- Laboratory of Molecular Biology, Cambridge (United Kingdom)
In a family of proteins or other biological sequences such as DNA, the various subfamilies are often very unevenly represented. For this reason, a scheme for assigning a weight to each sequence can greatly improve performance at tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. It is derived from the maximum entropy principle within a statistical description of the searching problem. It can be proved that, in a certain sense, it corrects for uneven representation. It is shown that finding the maximum entropy weights is an easy optimization problem for which standard techniques are applicable.
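The abstract does not state the objective function, so the following is only a minimal sketch under a stated assumption: that the weights are chosen to maximize the sum, over alignment columns, of the entropy of the weighted symbol frequencies, subject to the weights being non-negative and summing to 1. The function names, the multiplicative (exponentiated-gradient) update, and the `steps` and `rate` parameters are illustrative choices, not taken from the report; gap characters are treated as ordinary symbols.

```python
import math


def weighted_column_freqs(seqs, weights, col):
    """Weighted frequency of each symbol in one alignment column."""
    freqs = {}
    for seq, w in zip(seqs, weights):
        freqs[seq[col]] = freqs.get(seq[col], 0.0) + w
    return freqs


def column_entropy_sum(seqs, weights):
    """Assumed objective: sum over columns of the entropy of weighted frequencies."""
    total = 0.0
    for col in range(len(seqs[0])):
        freqs = weighted_column_freqs(seqs, weights, col)
        total -= sum(p * math.log(p) for p in freqs.values() if p > 0.0)
    return total


def max_entropy_weights(seqs, steps=200, rate=0.5):
    """Maximize column_entropy_sum over non-negative weights that sum to 1.

    Hypothetical helper: uses a multiplicative update, which keeps every
    weight positive and renormalizes so the weights stay on the simplex.
    """
    n_seqs, n_cols = len(seqs), len(seqs[0])
    weights = [1.0 / n_seqs] * n_seqs      # start from uniform weights
    step = rate / n_cols                   # scale the step by alignment length
    for _ in range(steps):
        # Gradient of the objective with respect to weight k:
        # sum over columns of -(log p_col(symbol of sequence k) + 1).
        grad = [0.0] * n_seqs
        for col in range(n_cols):
            freqs = weighted_column_freqs(seqs, weights, col)
            for k, seq in enumerate(seqs):
                grad[k] -= math.log(freqs[seq[col]]) + 1.0
        top = max(grad)                    # shift for numerical stability
        weights = [w * math.exp(step * (g - top)) for w, g in zip(weights, grad)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return weights
```

Under this assumed objective, calling `max_entropy_weights(["AAAA", "AAAA", "AAAA", "CCCC"])` gives the lone `CCCC` sequence about half of the total weight, with the three identical sequences sharing the other half. This is one way an over-represented subfamily can be down-weighted. The objective is concave in the weights, so a general-purpose constrained optimizer could be used in place of the simple update shown here.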
- Research Organization:
- Stanford Univ., CA (United States)
- Report Number(s):
- CONF-9507246--
- Country of Publication:
- United States
- Language:
- English