
CSE 241 Algorithms and Data Structures Assigned: 02/16/2011 Due Date: 03/09/2011

You must read the course collaboration policy before doing this lab. By doing the lab,
you are agreeing to abide by the policy.
1 Overview
The goal of this lab is to implement hashing as part of a tool for comparing genomic DNA sequences.
The approach to biosequence comparison that we'll use here is an important part of such well-known
tools as FASTA (Pearson & Lipman 1988) and BLAST (Altschul et al. 1990, 1997; Altschul & Gish
A DNA sequence is a string of characters, called bases, from the alphabet {a, c, g, t}. Genomic
DNA encodes a large collection of features, including:
genes: the instructions for building proteins;
regulatory sites: sequence markers recognized by cellular machinery that can increase or
decrease the rate at which a given gene is used to make protein;
repeats: junk left behind by transposable elements, pieces of DNA that can autonomously
copy themselves and move around in the genome. Transposable elements proliferate, then
die, leaving behind many inactive copies of themselves in the genome as repeats.
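To make the hashing idea from the overview concrete, here is a minimal sketch of one common way a hash table can help compare two DNA strings: index every length-k substring of one sequence, then look up the substrings of the other. This excerpt does not show the lab's actual method, so the function names (index_kmers, shared_kmers) and the parameter k are hypothetical, and the sketch relies on Python's built-in dict rather than a hand-written hash table.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map each length-k substring of seq to the list of its start positions."""
    table = defaultdict(list)  # Python's dict is itself a hash table
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def shared_kmers(seq_a, seq_b, k):
    """Return (pos_in_a, pos_in_b) pairs where both sequences contain the same k-mer."""
    table = index_kmers(seq_a, k)
    matches = []
    for j in range(len(seq_b) - k + 1):
        # .get avoids inserting empty entries for substrings absent from seq_a
        for i in table.get(seq_b[j:j + k], []):
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    # Illustrative input only, not data from the lab.
    print(shared_kmers("acgtacgt", "ttacgg", 3))
```

Each lookup takes expected constant time, so the scan over the second sequence costs time roughly proportional to its length plus the number of matches reported, instead of comparing every pair of positions directly.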
All DNA is subject to mutations that alter its sequence over time. However, functional sequence
features like genes and regulatory sites are more resistant to mutation than DNA that doesn't code


Source: Agrawal, Kunal - Department of Computer Science and Engineering, Washington University in St. Louis


Collections: Computer Technologies and Information Sciences