Title: An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

Conference
OSTI ID: 15006274

The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources, such as BLAST sequence homology search interfaces. These sources are growing faster than they can be manually classified, so the available data is not being used to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how a service class description relates to the Web sources it describes. We then show how a service class description can be used to classify an arbitrary Web source, that is, to determine whether the source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that correctly classifies approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.
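
The classification idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual service class description format, so everything below is an assumption made for illustration: the `ServiceClassDescription` fields, the control names, the response patterns, and the `classify` function are all hypothetical and are not the authors' format or prototype.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ServiceClassDescription:
    """Hypothetical, simplified stand-in for a service class description."""
    name: str
    # Kinds of form controls a member source must offer (assumed vocabulary).
    required_controls: set = field(default_factory=set)
    # Regular expressions that a response from a member source should match.
    response_patterns: list = field(default_factory=list)


def classify(description, form_controls, sample_response):
    """Return True if the source looks like an instance of the described class."""
    # The source's input form must offer every required control.
    has_controls = description.required_controls <= set(form_controls)
    # A sample response must match every expected output pattern.
    matches_output = all(
        re.search(pattern, sample_response)
        for pattern in description.response_patterns
    )
    return has_controls and matches_output


# Illustrative description of a BLAST-like class; the patterns are guesses,
# not taken from the paper.
blast = ServiceClassDescription(
    name="BLAST-like search",
    required_controls={"textarea", "submit"},
    response_patterns=[r"(?i)score", r"(?i)e[- ]?value"],
)

candidate_form = ["textarea", "select", "submit"]
candidate_response = "Sequences producing significant alignments: Score (bits) E value"
unrelated_form = ["text", "submit"]

is_blast = classify(blast, candidate_form, candidate_response)
is_unrelated = classify(blast, unrelated_form, "Search results for 'genome'")
```

The point of the sketch is only that the description names features of a whole class of sources, while the candidate source is supplied separately at classification time.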

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
US Department of Energy (US)
DOE Contract Number:
W-7405-ENG-48
Report Number(s):
UCRL-JC-152980; TRN: US200407%%171
Resource Relation:
Conference: The Fourth Georgia Tech and University of Georgia International Conference on Bioinformatics, Atlanta, GA (US), 11/13/2003--11/16/2003; Other Information: PBD: 1 May 2003
Country of Publication:
United States
Language:
English

Similar Records

Discovery and Classification of Bioinformatics Web Services
Conference · September 2, 2002 · OSTI ID:15006274

Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces
Journal Article · December 22, 2003 · World Wide Web · OSTI ID:15006274

Automatic Generation of Data Types for Classification of Deep Web Sources
Conference · February 14, 2005 · OSTI ID:15006274