Jedi: Extracting and Synthesizing Information from the Web
Gerald Huck, Peter Fankhauser, Karl Aberer, Erich Neuhold
 

GMD - German National Research Center for Information Technology
Integrated Publication and Information Systems Institute IPSI
Dolivostr. 15, 64293 Darmstadt, Germany
{huck, fankhaus, aberer, neuhold}@darmstadt.gmd.de
Abstract
Jedi (Java based Extraction and Dissemination of Information)
is a lightweight tool for the creation of wrappers and
mediators to extract, combine, and reconcile information
from several independent information sources. For wrappers
it uses attributed grammars, which are evaluated with
a fault-tolerant parsing strategy to cope with ambiguous
grammars and irregular sources. For mediation it uses a
simple generic object-model that can be extended with
Java-libraries for specific models such as HTML, XML or
the relational model. This paper describes the architecture
of Jedi, and then focuses on Jedi's wrapper generator.
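The wrapper and mediator roles named in the abstract can be illustrated with a small sketch. It is written in Python for brevity and is not Jedi's grammar syntax or API: the rule list, the `extract` and `mediate` functions, and the sample input are all hypothetical. Plain regular expressions stand in for attributed grammars, alternative rules are tried in order, input that no rule accepts is skipped rather than aborting the run (a very loose analogue of fault-tolerant parsing), and extracted values are stored in generic dictionaries.

```python
import re

# Hypothetical extraction rules, tried in order. Each named group plays the
# role of an attribute attached to the matched text. Not Jedi syntax.
RULES = [
    ("price", re.compile(r"^(?P<item>[\w ]+?):\s*\$(?P<amount>\d+(?:\.\d+)?)$")),
    ("price", re.compile(r"^(?P<item>[\w ]+?)\s+costs\s+(?P<amount>\d+(?:\.\d+)?)\s+dollars$")),
]


def extract(lines):
    """Wrapper step: turn irregular text lines into generic records.

    Lines that match no rule are collected in `skipped` instead of
    raising an error, so one malformed line does not stop extraction.
    """
    records, skipped = [], []
    for line in lines:
        text = line.strip()
        for kind, pattern in RULES:
            match = pattern.match(text)
            if match:
                records.append({"kind": kind, **match.groupdict()})
                break
        else:
            skipped.append(text)
    return records, skipped


def mediate(*sources):
    """Mediator step: combine records from several sources.

    Records are keyed by lower-cased item name; when two sources
    disagree, the earlier source wins (an arbitrary policy chosen
    for this sketch).
    """
    merged = {}
    for records in sources:
        for record in records:
            merged.setdefault(record["item"].lower(), record)
    return merged
```

For example, `extract(["Tea: $3.50", "<hr>", "Coffee costs 4 dollars"])` yields two price records and reports `"<hr>"` as skipped; passing that record list and a second source's records to `mediate` gives one reconciled entry per item.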
1. Introduction
The World Wide Web has evolved into a general pur-

  

Source: Aberer, Karl - Faculté Informatique et Communications, Ecole Polytechnique Fédérale de Lausanne

 

Collections: Computer Technologies and Information Sciences