Computers and Chemistry 24 (2000) 43–55
Sequence complexity for biological sequence analysis
 

L. Allison a,*, L. Stern b, T. Edgoose a, T.I. Dix a
a School of Computer Science and Software Engineering, Monash University, Melbourne, 3168 Australia
b Department of Computer Science and Software Engineering, The University of Melbourne, Melbourne, 3052 Australia
Received 7 August 1998; accepted 18 February 1999
Abstract
A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions
that are approximate repeats of other subsequences, i.e. instances of repeats do not need to match each other exactly.
Both forward- and reverse-complementary repeats are allowed. The model has a small number of parameters which
are fitted to the data. In general there are many explanations for a given sequence, and it is shown how to compute
the total probability of the data given the model. Computer algorithms are described for these tasks. The model can
be used to compute the information content of a sequence, either in total or base by base. This amounts to looking
at sequences from a data-compression point of view and it is argued that this is a good way to tackle intelligent
sequence analysis in general. © 2000 Elsevier Science Ltd. All rights reserved.
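
As a rough illustration of the data-compression view described in the abstract, the sketch below computes the information content of a DNA sequence base by base and in total. It is not the approximate-repeats model of the paper: it swaps in a much simpler adaptive order-k Markov model, chosen only because it fits in a few lines. The names `per_base_information`, `reverse_complement`, and the `order` parameter are assumptions made for this example.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def per_base_information(seq, order=2):
    """Return -log2 P(base | preceding `order` bases) for each base, in bits.

    Assumption: an adaptive Markov model with add-one smoothing stands in
    for the paper's repeat model. Counts are updated only after a base has
    been "encoded", so a decoder could reproduce the same probabilities.
    """
    # Every context starts with a pseudo-count of 1 per base (uniform prior).
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 1))
    bits = []
    for i, base in enumerate(seq):
        context = seq[max(0, i - order):i]
        ctx_counts = counts[context]
        total = sum(ctx_counts.values())
        bits.append(-math.log2(ctx_counts[base] / total))
        ctx_counts[base] += 1  # learn from the base just seen
    return bits


def reverse_complement(seq):
    """Reverse-complement a DNA string, the strand relation the abstract
    refers to as a reverse-complementary repeat."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


if __name__ == "__main__":
    seq = "ACGT" * 20
    bits = per_base_information(seq)
    print(f"total: {sum(bits):.1f} bits, "
          f"mean: {sum(bits) / len(bits):.2f} bits/base")
```

A sequence with no detectable structure costs about 2 bits per base under any model. A model that captures real structure brings the total below that baseline, and a drop in the per-base cost points to where the structure lies.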

  

Source: Allison, Lloyd - Caulfield School of Information Technology, Monash University

 

Collections: Computer Technologies and Information Sciences