Title: An editing environment for DNA sequence analysis and annotation

Technical Report
DOI: https://doi.org/10.2172/563243 · OSTI ID: 563243

This paper presents a computer system for analyzing and annotating large-scale genomic sequences. The core of the system is a multiple-gene structure identification program, which predicts the most probable gene structures from the available evidence, including pattern recognition, EST, and protein homology information. A graphical user interface lets the user interactively control which evidence is used in the gene identification process. To overcome the computational bottleneck of the database similarity search used in gene identification, the authors have developed an effective way to partition a database into a set of sub-databases of related sequences, reducing the search problem on a large database to a signature identification problem plus a search problem on a much smaller sub-database. This reduces the number of sequences to be searched from N to O(√N) on average, where N is the number of sequences in the original database, and hence greatly reduces the search time. The system enables the user to conduct and modify the analysis and modeling in real time.
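
The O(√N) figure can be motivated with a short calculation, assuming balanced sub-databases: if N sequences are split into k sub-databases of about N/k sequences each, a query costs roughly k signature comparisons plus N/k sequence comparisons, and k + N/k is smallest at k = √N, giving about 2√N comparisons. The Python sketch below illustrates that two-stage idea only. The abstract does not say how the sub-databases or their signatures are built, so the k-mer profile, the greedy grouping, and the names `profile`, `score`, `build_subdatabases`, and `search` are all hypothetical choices made for illustration, not the authors' method.

```python
import math
from collections import Counter

K = 3  # k-mer length for the toy similarity score (an assumption, not from the report)


def profile(seq):
    """Multiset of overlapping k-mers in a sequence."""
    return Counter(seq[i:i + K] for i in range(len(seq) - K + 1))


def score(p, q):
    """Number of k-mers shared by two profiles (multiset intersection size)."""
    return sum((p & q).values())


def build_subdatabases(seqs):
    """Greedily group sequences into roughly sqrt(N) sub-databases.

    Each sub-database is a (signature, members) pair. The signature here is
    simply the k-mer profile of the first member (a hypothetical scheme).
    """
    size = max(1, math.isqrt(len(seqs)))  # target sub-database size
    remaining = list(seqs)
    subdbs = []
    while remaining:
        seed = remaining.pop(0)
        sig = profile(seed)
        # Put the sequences most similar to the seed first, then take a slice.
        remaining.sort(key=lambda s: score(sig, profile(s)), reverse=True)
        members = [seed] + remaining[:size - 1]
        remaining = remaining[size - 1:]
        subdbs.append((sig, members))
    return subdbs


def search(query, subdbs):
    """Two-stage search: pick the best signature, then scan only that sub-database.

    Returns (best_hit, comparisons_made).
    """
    q = profile(query)
    _, members = max(subdbs, key=lambda d: score(q, d[0]))  # stage 1: signatures
    best = max(members, key=lambda s: score(q, profile(s)))  # stage 2: one sub-database
    return best, len(subdbs) + len(members)


if __name__ == "__main__":
    db = [
        "AAAAAAAACA", "AAAAAAACAA", "AAAAACAAAA",
        "GGGGGGGGTG", "GGGGGGGTGG", "GGGGGTGGGG",
        "ACGTACGTAC", "ACGTACGTAA", "ACGTACGTTT",
    ]
    hit, comparisons = search("GGGGGGTGGG", build_subdatabases(db))
    print(hit, comparisons, "comparisons instead of", len(db))
```

With this toy database of nine sequences, the query is compared against three signatures and three members, six comparisons rather than nine. The saving grows with N, but the shortcut is heuristic: a best match that sits in a different sub-database from the best-matching signature would be missed, which is why the quality of the partition and of the signatures matters.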

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Energy Research, Washington, DC (United States)
DOE Contract Number:
AC05-96OR22464
OSTI ID:
563243
Report Number(s):
ORNL/CP-94756; CONF-980118-; ON: DE98000574; BR: KP1103010; TRN: AHC29803%%80
Resource Relation:
Conference: 3rd Pacific Symposium on Biocomputing, Kapalua, HI (United States), 5 Jan 1998; Other Information: PBD: [1998]
Country of Publication:
United States
Language:
English