OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application

Abstract

A computer tool has been developed for revealing sets of oligonucleotides invariant for isofunctional families of DNA (RNA) and for using these in functional identification of nucleotide sequences. The tool allows one to: build up vocabularies of invariant oligonucleotides for families of isofunctional nucleotide sequences; assess the significance of the vocabularies; identify nucleotide sequences with the vocabularies of invariant oligonucleotides; determine the most effective identification parameters to minimize first and second type errors; assess the efficiency of identification of individual isofunctional families with the oligonucleotide vocabularies; and determine the evolutionary characteristics of the families of isofunctional sequences on which vocabulary volume depends. Using this system, we have analyzed a total of 322 protein-encoding gene families and have built up sets of invariant oligonucleotides (oligonucleotide vocabularies) that are characteristic of gene families and subfamilies. Identification of nucleotide sequences belonging to these families with the revealed sets of invariant oligonucleotides has been demonstrated. Under the most effective identification parameters, the first type error (false negative) on control (independent) data was 10-15%, and the second type error (false positive) was just 1-2 redundant sequences per sequence being examined. The volume of a vocabulary of invariant oligonucleotides was shown to depend on the percentage of variable positions in the multiple alignment within a family.
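
The abstract does not describe the tool's algorithm in detail, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes that an "invariant oligonucleotide" is a fixed-length substring (k-mer) present in every member of a family, and that a query is assigned to a family when it contains at least a chosen fraction of that family's vocabulary. The function names, the parameter `k`, and the `threshold` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (oligonucleotides) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_vocabulary(family: List[str], k: int) -> Set[str]:
    """Assumed definition: k-mers that occur in every sequence of the family."""
    if not family:
        return set()
    vocab = kmers(family[0], k)
    for seq in family[1:]:
        vocab &= kmers(seq, k)  # keep only k-mers shared by all members so far
    return vocab


def score(query: str, vocab: Set[str], k: int) -> float:
    """Fraction of the vocabulary found in the query (0.0 if vocabulary is empty)."""
    if not vocab:
        return 0.0
    return len(vocab & kmers(query, k)) / len(vocab)


def identify(query: str, vocabularies: Dict[str, Set[str]],
             k: int, threshold: float) -> List[str]:
    """Names of all families whose vocabulary coverage reaches the threshold."""
    return sorted(name for name, vocab in vocabularies.items()
                  if score(query, vocab, k) >= threshold)


def false_negative_rate(members: List[str], vocab: Set[str],
                        k: int, threshold: float) -> float:
    """First type error: share of true family members that are not recognized."""
    if not members:
        return 0.0
    missed = sum(1 for s in members if score(s, vocab, k) < threshold)
    return missed / len(members)


# Toy data, invented for illustration only.
family = ["ACGTACGT", "TTACGTAA", "GGACGTCC"]
vocab = build_vocabulary(family, k=4)
hits = identify("AAACGTTT", {"toy_family": vocab}, k=4, threshold=0.5)
```

Under these assumptions, sweeping `k` and `threshold` over held-out sequences would be one way to look for the "most effective identification parameters" the abstract mentions, trading the false negative rate against the number of extra families returned per query.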

Authors:
Kolchanov, N A; Vishnevsky, O V; Babenko, V N; Shindyalov, K A.E. [1]
  1. Inst. of Cytology and Genetics, Novosibirsk (Russian Federation)
Publication Date:
Research Org.:
Stanford Univ., CA (United States)
OSTI Identifier:
401847
Report Number(s):
CONF-9507246-
TRN: 96:005602-0025
Resource Type:
Technical Report
Resource Relation:
Conference: Intelligent Systems for Molecular Biology (ISMB) conference, Cambridge (United Kingdom), 16-19 Jul 1995; Other Information: PBD: 1995; Related Information: Is Part Of ISMB-95 -- Third international conference on intelligent systems for molecular biology: Proceedings; Rawlings, C.; Clark, D.; Altman, R.; Hunter, L.; Lengauer, T.; Wodak, S. [eds.]; PB: 427 p.
Country of Publication:
United States
Language:
English
Subject:
55 BIOLOGY AND MEDICINE, BASIC STUDIES; 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; DNA SEQUENCING; ACCURACY; ERRORS; OLIGONUCLEOTIDES; PROBABILITY; GENES

Citation Formats

Kolchanov, N A, Vishnevsky, O V, Babenko, V N, and Shindyalov, K A.E. Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application. United States: N. p., 1995. Web.
Kolchanov, N A, Vishnevsky, O V, Babenko, V N, & Shindyalov, K A.E. Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application. United States.
Kolchanov, N A, Vishnevsky, O V, Babenko, V N, and Shindyalov, K A.E. 1995. "Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application". United States.
@article{osti_401847,
title = {Identification of cDNA sequences by specific oligonucleotide sets. Computer tool and application},
author = {Kolchanov, N A and Vishnevsky, O V and Babenko, V N and Shindyalov, K A.E.},
abstractNote = {A computer tool has been developed for revealing sets of oligonucleotides invariant for isofunctional families of DNA (RNA) and for using these in functional identification of nucleotide sequences. The tool allows one to: build up vocabularies of invariant oligonucleotides for the families of isofunctional nucleotide sequences; assess significance of the vocabularies; identify nucleotide sequences with the vocabularies of invariant oligonucleotides; determine the most effective identification parameters to minimize first and second type errors; assess the efficiency of identification of individual isofunctional families with the oligonucleotide vocabularies; determine the evolutionary characteristics of the families of isofunctional sequences on which vocabulary volume depends. Based on the system mentioned, we have analyzed a total of 322 protein-encoding gene families and have built up sets of invariant oligonucleotides, or again, oligonucleotide vocabularies that are characteristic of gene families and subfamilies. Identification of nucleotide sequences belonging to these families with the sets of invariant oligonucleotides revealed has been shown. Under the most effective identification parameters, the first type error (false negative) on control (independent) data was 10-15%, the second type error (false positive) was just 1-2 redundant sequences per sequence being examined. As has been shown, the volume of a vocabulary of invariant oligonucleotides depends on the percentage of variable positions in the multiple alignment within a family.},
doi = {},
url = {https://www.osti.gov/biblio/401847},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1995},
month = {12}
}
