U.S. Department of Energy
Office of Scientific and Technical Information

Maximum entropy weighting of aligned sequences of proteins or DNA

Technical Report
OSTI ID:401848
 [1];  [2]
  1. Nordita, Copenhagen (Denmark)
  2. Laboratory of Molecular Biology, Cambridge (United Kingdom)
In a family of proteins or other biological sequences such as DNA, the various subfamilies are often very unevenly represented. For this reason, a scheme that assigns a weight to each sequence can greatly improve performance on tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. It is derived from the maximum entropy principle within a statistical description of the searching problem. It can be proved that, in a certain sense, the scheme corrects for uneven representation. It is also shown that finding the maximum entropy weights is an easy optimization problem to which standard techniques apply.
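The abstract does not state the objective function, so the following is only an illustrative sketch under stated assumptions, not the report's own formulation or algorithm. It assumes the weights are nonnegative and sum to one, and that the quantity maximized is the sum over alignment columns of the Shannon entropy of the weighted residue distribution. The optimizer (exponentiated gradient ascent on the simplex) and the names `column_entropy_sum` and `max_entropy_weights` are hypothetical choices made for this example.

```python
import math


def column_entropy_sum(seqs, w):
    """Sum over columns of the entropy of the weighted residue distribution.

    Assumed objective (not taken from the record): in each column, the
    probability of a residue is the total weight of the sequences showing it.
    """
    total = 0.0
    for col in range(len(seqs[0])):
        p = {}
        for s, wk in zip(seqs, w):
            p[s[col]] = p.get(s[col], 0.0) + wk
        total -= sum(q * math.log(q) for q in p.values() if q > 0)
    return total


def max_entropy_weights(seqs, steps=500, rate=0.5):
    """Maximize the assumed objective by exponentiated gradient ascent.

    seqs: equal-length aligned strings. Returns weights that sum to one.
    """
    n, n_cols = len(seqs), len(seqs[0])
    eta = rate / n_cols          # scale the step size by alignment length
    w = [1.0 / n] * n            # start from uniform weights
    for _ in range(steps):
        grad = [0.0] * n
        for col in range(n_cols):
            p = {}
            for s, wk in zip(seqs, w):
                p[s[col]] = p.get(s[col], 0.0) + wk
            # d/dw_k of -sum_a p(a) log p(a) is -(log p(residue of k) + 1)
            for k, s in enumerate(seqs):
                grad[k] -= math.log(p[s[col]]) + 1.0
        # Multiplicative update, then renormalize; weights stay positive.
        m = max(grad)
        w = [wk * math.exp(eta * (g - m)) for wk, g in zip(w, grad)]
        z = sum(w)
        w = [wk / z for wk in w]
    return w


if __name__ == "__main__":
    # Toy alignment: one subfamily is represented three times, the other once.
    toy = ["AAAA", "AAAA", "AAAA", "CCCC"]
    print([round(x, 3) for x in max_entropy_weights(toy)])
```

On this toy alignment the three identical sequences end up sharing half of the total weight and the lone sequence receives the other half, which illustrates the kind of correction for uneven representation that the abstract describes. Because the assumed objective is concave in the weights, a general-purpose constrained optimizer could be used in place of the simple update shown here.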
Research Organization:
Stanford Univ., CA (United States)
Report Number(s):
CONF-9507246--
Country of Publication:
United States
Language:
English