Title: SORFIND: A computer program that predicts exons in vertebrate genomic DNA

Abstract

Several computer programs now available will predict exons based upon naive genomic sequence data, but they generally require access to a UNIX workstation or e-mail access to the Internet. The authors have developed a program, called SORFIND, which predicts vertebrate internal exons at 5 different confidence levels and runs on an IBM-PC computer. The program reads sequence data in several formats, identifies "spliceable open reading frames" (SORFs) possessing high consensus scores with known acceptor and donor splice junctions, and analyzes codon usage. Potential exons are filtered through successive stages; in a data set of 130 human genes, this results in the identification of 89.6% of the internal exons greater than 60 base pairs in length (62.5% predicted with exact splice junctions and reading frame, and a further 27.1% predicted with at least one exact splice junction and an average 77.3% overlap with true internal exons). Specificity (the percentage of SORFs that either completely or partially match a true exon) is 91.8%, 90%, 75.5%, 53.2% and 38.4% for the combined confidence levels 1, 1 and 2, 1 to 3, 1 to 4 and 1 to 5, respectively. The program's output displays nucleotide position, confidence level, reading frame phase at the 5' and 3' ends, acceptor and donor sequences, and scoring statistics. It also generates an amino acid translation which can be used in protein database homology searches. The program compares favorably with the CRM module of GRAIL and with the GeneID program on an analysis of a 105 kilobase contig from human chromosome 4. It also successfully predicts exons from other vertebrates.
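
The idea of a "spliceable open reading frame" can be illustrated with a minimal Python sketch. This is not SORFIND's implementation: it assumes only the canonical AG acceptor and GT donor dinucleotides, and it omits the consensus scoring, codon usage analysis and confidence levels described in the abstract. The function names and the `min_len` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_frames(segment):
    """Return the frame offsets (0, 1, 2) in which `segment` has no in-frame stop codon."""
    frames = []
    for offset in range(3):
        codons = (segment[i:i + 3] for i in range(offset, len(segment) - 2, 3))
        if all(codon not in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames

def candidate_sorfs(seq, min_len=60):
    """Yield (start, end, frames) for segments flanked by AG ... GT that are open in some frame.

    Assumption: only canonical AG/GT dinucleotides mark splice sites; no scoring is done.
    Coordinates are 0-based and end-exclusive, and cover the candidate exon only.
    """
    seq = seq.upper()
    # A candidate exon starts right after an acceptor "AG" and ends right before a donor "GT".
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    for start in acceptors:
        for end in donors:
            if end - start < min_len:  # hypothetical length cutoff; also skips end <= start
                continue
            frames = open_frames(seq[start:end])
            if frames:
                yield start, end, frames
```

Calling `list(candidate_sorfs(sequence))` on a genomic sequence string returns every unscored candidate. A real predictor would then rank these, which is the role the abstract assigns to splice junction consensus scores and codon usage.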

Authors:
Hutchinson, G B; Hayden, M R [1]
  1. Univ. of British Columbia, Vancouver (Canada). Dept. of Medical Genetics
Publication Date:
OSTI Identifier:
37557
Report Number(s):
CONF-9206273-
ISBN 981-02-1157-0; TRN: IM9519%%484
Resource Type:
Conference
Resource Relation:
Conference: 2. international conference on bioinformatics, supercomputing, and complex genome analysis, St. Petersburg, FL (United States), 4-7 Jun 1992; Other Information: PBD: 1993; Related Information: Is Part Of The second international conference on bioinformatics, supercomputing and complex genome analysis; Lim, H.A. [ed.] [Florida State Univ., Tallahassee, FL (United States). Supercomputer Computations Research Inst.]; Fickett, J.W. [ed.] [Los Alamos National Lab., Los Alamos, NM (United States). Center for Human Genome Studies]; Cantor, C.R. [ed.] [Boston Univ., MA (United States). Center for Advanced Research in Biotechnology]; Robbins, R.J. [ed.] [Johns Hopkins Univ., Baltimore, MD (United States). Applied Research Lab.]; PB: 672 p.
Country of Publication:
United States
Language:
English
Subject:
55 BIOLOGY AND MEDICINE, BASIC STUDIES; 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; DNA; MOLECULAR STRUCTURE; GENES; S CODES; MESSENGER-RNA; FORECASTING

Citation Formats

Hutchinson, G B, and Hayden, M R. SORFIND: A computer program that predicts exons in vertebrate genomic DNA. United States: N. p., 1993. Web.
Hutchinson, G B, & Hayden, M R. SORFIND: A computer program that predicts exons in vertebrate genomic DNA. United States.
Hutchinson, G B, and Hayden, M R. "SORFIND: A computer program that predicts exons in vertebrate genomic DNA". United States.
@article{osti_37557,
title = {SORFIND: A computer program that predicts exons in vertebrate genomic DNA},
author = {Hutchinson, G B and Hayden, M R},
abstractNote = {Several computer programs now available will predict exons based upon naive genomic sequence data, but they generally require access to a UNIX workstation or e-mail access to the Internet. The authors have developed a program, called SORFIND, which predicts vertebrate internal exons at 5 different confidence levels and runs on an IBM-PC computer. The program reads sequence data in several formats, identifies "spliceable open reading frames" (SORFs) possessing high consensus scores with known acceptor and donor splice junctions, and analyzes codon usage. Potential exons are filtered through successive stages; in a data set of 130 human genes, this results in the identification of 89.6% of the internal exons greater than 60 base pairs in length (62.5% predicted with exact splice junctions and reading frame, and a further 27.1% predicted with at least one exact splice junction and an average 77.3% overlap with true internal exons). Specificity (the percentage of SORFs that either completely or partially match a true exon) is 91.8%, 90%, 75.5%, 53.2% and 38.4% for the combined confidence levels 1, 1 and 2, 1 to 3, 1 to 4 and 1 to 5, respectively. The program's output displays nucleotide position, confidence level, reading frame phase at the 5' and 3' ends, acceptor and donor sequences, and scoring statistics. It also generates an amino acid translation which can be used in protein database homology searches. The program compares favorably with the CRM module of GRAIL and with the GeneID program on an analysis of a 105 kilobase contig from human chromosome 4. It also successfully predicts exons from other vertebrates.},
doi = {},
url = {https://www.osti.gov/biblio/37557},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1993},
month = {12}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.