U.S. Department of Energy
Office of Scientific and Technical Information

An iterative algorithm for correcting sequencing errors in DNA coding regions

Conference
OSTI ID:205860

Insertion and deletion (indel) sequencing errors in DNA coding regions disrupt DNA-to-protein translation frames, and hence make most frame-sensitive coding recognition approaches fail. This paper extends the authors' previous work on indel detection and "correction" algorithms, and presents a more effective algorithm for localizing indels that appear in DNA coding regions and "correcting" the located indels by inserting or deleting DNA bases. The algorithm localizes indels by discovering changes of the preferred translation frames within presumed coding regions, and then "corrects" the indel errors to restore a consistent translation frame within each coding region. An iterative strategy is exploited to repeatedly localize and "correct" indel errors until no more indels can be found. Test results have shown that the algorithm can accurately locate the positions of indels. The technology presented here has proved to be very useful for single pass EST/cDNA or genomic sequences, and is also often beneficial for higher quality sequences from large genomic clones.
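The localize-and-correct loop described in the abstract can be illustrated with a toy Python sketch. It is not the authors' implementation: the abstract does not say how the preferred frame is scored, so this sketch assumes the simplest possible stand-in, namely that the preferred frame in a window is the one with the fewest stop codons. All function names, the window size, and the use of `N` as a placeholder for a missing base are hypothetical choices made for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}


def stop_count(seq, start, end, frame):
    """Count stop codons inside seq[start:end], reading codons that begin
    at positions congruent to `frame` modulo 3."""
    first = start + (frame - start) % 3
    return sum(seq[i:i + 3] in STOPS for i in range(first, end - 2, 3))


def preferred_frame(seq, start, end):
    """Frame with uniquely fewest stop codons in the window, else None.
    (Assumed scoring; a real system would use a coding-preference measure.)"""
    counts = [stop_count(seq, start, end, f) for f in range(3)]
    best = min(counts)
    return counts.index(best) if counts.count(best) == 1 else None


def find_frame_change(seq, window=27, step=3):
    """Slide a window along seq and return (lo, hi, old, new) for the first
    place where the preferred frame switches, or None if it never does."""
    prev = None
    for start in range(0, len(seq) - window + 1, step):
        frame = preferred_frame(seq, start, start + window)
        if frame is None:
            continue  # ambiguous window, typically one straddling the indel
        if prev is not None and frame != prev[1]:
            return prev[0], start + window, prev[1], frame
        prev = (start, frame)
    return None


def apply_fix(seq, pos, old, new):
    """Undo a one-base frame shift at pos. A shift of +1 means an extra base
    (delete it); otherwise a base is missing (insert the placeholder 'N')."""
    if (new - old) % 3 == 1:
        return seq[:pos] + seq[pos + 1:]
    return seq[:pos] + "N" + seq[pos:]


def correct_indels(seq, window=27, step=3, max_rounds=20):
    """Repeatedly localize and repair frame changes until none remain."""
    fixes = []
    for _ in range(max_rounds):
        change = find_frame_change(seq, window, step)
        if change is None:
            break
        lo, hi, old, new = change
        # Choose the repair position that leaves the fewest stop codons
        # in the original frame within the suspicious interval.
        pos = min(range(lo, hi),
                  key=lambda p: stop_count(apply_fix(seq, p, old, new), lo, hi, old))
        seq = apply_fix(seq, pos, old, new)
        fixes.append((pos, "delete" if (new - old) % 3 == 1 else "insert"))
    return seq, fixes


# Synthetic coding region: frame 0 is open, frames 1 and 2 contain stops.
clean = "CTGACTAAC" * 10
damaged = clean[:40] + "G" + clean[40:]  # simulate a one-base insertion
corrected, fixes = correct_indels(damaged)
print(fixes, len(corrected) == len(clean))
```

In this sketch the repair restores a consistent reading frame but not necessarily the original bases: the edit may land a few positions away from the true error, so the codons between the two can differ from the original sequence.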

Research Organization:
Argonne National Lab., IL (United States)
Sponsoring Organization:
USDOE, Washington, DC (United States)
DOE Contract Number:
AC05-84OR21400
OSTI ID:
205860
Report Number(s):
CONF-9510318--2; ON: DE96005359
Country of Publication:
United States
Language:
English

Similar Records

Correcting sequencing errors in DNA coding regions using a dynamic programming approach
Technical Report · Wed Nov 30 23:00:00 EST 1994 · OSTI ID:10105444

Alignment of DNA and protein sequences containing frameshift errors
Technical Report · Fri Mar 31 23:00:00 EST 1995 · OSTI ID:71519

Sensitive and error-tolerant annotation of protein-coding DNA with BATH
Journal Article · Fri Jun 14 00:00:00 EDT 2024 · Bioinformatics Advances · OSTI ID:2510958