Combining Strategies for Extracting Relations from Text Collections
Eugene Agichtein, Eleazar Eskin, Luis Gravano
 

Department of Computer Science
Columbia University
{eugene,eeskin,gravano}@cs.columbia.edu
Abstract
Text documents often contain valuable structured data
that is hidden in regular English sentences. This data is
best exploited if available as a relational table that we
could use for answering precise queries or for running
data mining tasks. Our Snowball system extracts these
relations from document collections starting with only
a handful of user-provided example tuples. Based on
these tuples, Snowball generates patterns that are used,
in turn, to find more tuples. In this paper we introduce a
new pattern and tuple generation scheme for Snowball,
with different strengths and weaknesses than those of
our original system. We also show preliminary results
on how we can combine the two versions of Snowball
to extract tuples more accurately.
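
The bootstrapping loop described in the abstract (seed tuples yield patterns, which in turn yield more tuples) can be illustrated with a toy sketch. This is not Snowball's actual pattern representation or scoring: the function names, the capitalized-phrase matcher standing in for a named-entity tagger, the use of the literal text between two values as a "pattern", and the example sentences are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a named-entity tagger: one or more capitalized words.
ENTITY = r"([A-Z][a-z]+(?: [A-Z][a-z]+)*)"

def learn_patterns(sentences, tuples):
    """Collect the literal text found between the two values of a known tuple."""
    patterns = set()
    for sentence in sentences:
        for left, right in tuples:
            m = re.search(re.escape(left) + r"(.+?)" + re.escape(right), sentence)
            if m:
                patterns.add(m.group(1))
    return patterns

def apply_patterns(sentences, patterns):
    """Return every (entity, entity) pair joined by one of the learned contexts."""
    found = set()
    for sentence in sentences:
        for middle in patterns:
            regex = ENTITY + re.escape(middle) + ENTITY
            for m in re.finditer(regex, sentence):
                found.add((m.group(1), m.group(2)))
    return found

def bootstrap(sentences, seeds, rounds=2):
    """Alternate pattern learning and tuple extraction for a fixed number of rounds."""
    tuples = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, tuples)
        new_tuples = apply_patterns(sentences, patterns) - tuples
        if not new_tuples:  # nothing new was found, so further rounds cannot help
            break
        tuples |= new_tuples
    return tuples

# Invented example sentences, used only to exercise the loop.
SENTENCES = [
    "Microsoft is headquartered in Redmond.",
    "Exxon is headquartered in Irving.",
    "Exxon, based in Irving, reported earnings.",
    "Boeing, based in Seattle, makes aircraft.",
]

if __name__ == "__main__":
    print(sorted(bootstrap(SENTENCES, {("Microsoft", "Redmond")})))
```

In this toy run the single seed tuple teaches one context, which finds a second tuple; that tuple then teaches a second context, which finds a third. A real system would also need to score patterns and tuples so that a noisy context does not flood the table with wrong pairs, which this sketch omits entirely.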

  

Source: Agichtein, Eugene - Department of Mathematics and Computer Science, Emory University
Gravano, Luis - Department of Computer Science, Columbia University

 

Collections: Computer Technologies and Information Sciences