Summary: A Tool for Supporting Integration Across Multiple Flat-File Datasets
Xuan Zhang Gagan Agrawal
Department of Computer Science and Engineering
Ohio State University
Columbus, OH, 43220
{zhangx,agrawal}@cse.ohio-state.edu
ABSTRACT
Traditionally, biologists focused on a single research subject. New
high-throughput experimental and analytical technologies, such as
microarrays and BLAST programs, have changed this. An important
capability now required is processing queries over multiple data
entries with little user intervention. This paper presents the design,
implementation, and evaluation of a data integration tool that supports
database-like query operations across flat-file biological datasets.
Compared with existing solutions, our system has several advantages:
no database management system is required, users can still use
declarative languages to communicate with the system, and no data
parsing, loading, or indexing utility programs need to be written.
We have used the system on three biological queries, each of which