Future Generation Computer Systems 23 (2007) 485–496, www.elsevier.com/locate/fgcs

From bioinformatic web portals to semantically integrated Data Grid
Adriana Budura, Philippe Cudré-Mauroux, Karl Aberer
School of Computer and Communication Sciences, EPFL, Switzerland
Received 31 January 2006; accepted 26 March 2006
Available online 4 May 2006
We propose a semi-automated method for redeploying bioinformatic databases indexed in a Web portal as a decentralized, semantically
integrated, and service-oriented Data Grid. We generate peer-to-peer schema mappings by leveraging cross-referenced instances and
instance-based schema matching algorithms. Analyzing real-world data extracted from an existing portal, we show that a rather trivial
combination of lexicographical measures and set distance measures yields surprisingly good results in practice. Finally, we propose data
models for redeploying all instances, schemas, and schema mappings in the Data Grid, relying on standard Semantic Web technologies.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Data sharing; Data mapping; Distributed databases; Semantics; Biology
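
To make the idea of combining a lexicographical measure with a set distance measure concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that the lexicographical measure is a string-similarity ratio on attribute names (here Python's difflib), that the set measure is Jaccard similarity over instance values, and that the two are blended with a weight. The function names, the weight alpha, the threshold, and the toy schemas are all hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Lexicographical similarity of two attribute names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(xs: set, ys: set) -> float:
    """Set similarity of two collections of instance values, in [0, 1]."""
    if not xs and not ys:
        return 0.0
    return len(xs & ys) / len(xs | ys)


def match_score(name_a, values_a, name_b, values_b, alpha=0.5):
    """Weighted blend of both measures; alpha is an assumed tuning weight."""
    lexical = name_similarity(name_a, name_b)
    overlap = jaccard(set(values_a), set(values_b))
    return alpha * lexical + (1 - alpha) * overlap


def best_matches(schema_a: dict, schema_b: dict, threshold=0.3):
    """Map each attribute of schema_a to its best-scoring attribute in schema_b.

    Each schema is a dict from attribute name to a list of instance values.
    Candidates scoring below the (assumed) threshold are dropped.
    """
    mappings = {}
    for name_a, values_a in schema_a.items():
        scored = [
            (match_score(name_a, values_a, name_b, values_b), name_b)
            for name_b, values_b in schema_b.items()
        ]
        if not scored:
            continue
        score, name_b = max(scored)
        if score >= threshold:
            mappings[name_a] = (name_b, score)
    return mappings


# Toy data, invented for illustration only.
schema_a = {
    "organism": ["Homo sapiens", "Mus musculus"],
    "accession": ["P12345", "Q67890"],
}
schema_b = {
    "species": ["Homo sapiens", "Mus musculus", "Danio rerio"],
    "acc_number": ["P12345", "Q67890"],
}

if __name__ == "__main__":
    for source, (target, score) in best_matches(schema_a, schema_b).items():
        print(f"{source} -> {target} ({score:.2f})")
```

In this sketch the value overlap carries pairs whose names differ (organism and species), while the name similarity helps when two attributes share few instances. How the two signals should actually be weighted is left open here.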
1. Introduction
In the past, biologists used to collect and analyze data
in isolation, creating proprietary schemas to annotate and
store their information in various ways. Today, with the


Source: Aberer, Karl - Faculté Informatique et Communications, Ecole Polytechnique Fédérale de Lausanne
Massachusetts Institute of Technology (MIT), Department of Electrical Engineering and Computer Science, Database Group


Collections: Computer Technologies and Information Sciences