Snowball: A Prototype System for Extracting Relations from Large Text Collections

Summary:
Eugene Agichtein, Luis Gravano, Jeff Pavel, Viktoriya Sokolova, Aleksandr Voskoboynik
Computer Science Department
Columbia University
Text documents often hide valuable structured data. For example, a
collection of newspaper articles might contain information on the
location of the headquarters of a number of organizations. If we need
to find the location of the headquarters of, say, Microsoft, we could
try to use traditional information-retrieval techniques for finding
documents that contain the answer to our query. Alternatively, we could
answer such a query more precisely if we somehow had available a table
listing all the organization-location pairs that are mentioned in our
document collection. One could view the extraction process as
automatically building a materialized view over the unstructured text
data. In this demo we present an interactive prototype of our Snowball
system for extracting relations from collections of plain-text documents
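
The idea of a table of organization-location pairs built from text can be illustrated with a minimal sketch. This is not the Snowball system or its algorithm; it is a toy example that assumes two hand-written regular-expression patterns and two made-up sentences, and the names `PATTERNS`, `build_table`, and `headquarters_of` are hypothetical.

```python
import re

# Hypothetical hand-written patterns, for illustration only.
# They are NOT how Snowball finds relations; they only show the target table.
PATTERNS = [
    re.compile(r"(?P<org>[A-Z][A-Za-z]+)'s headquarters in (?P<loc>[A-Z][A-Za-z]+)"),
    re.compile(r"(?P<org>[A-Z][A-Za-z]+), based in (?P<loc>[A-Z][A-Za-z]+)"),
]


def build_table(documents):
    """Collect (organization, location) pairs matched in plain-text documents."""
    table = set()
    for text in documents:
        for pattern in PATTERNS:
            for match in pattern.finditer(text):
                table.add((match.group("org"), match.group("loc")))
    return table


def headquarters_of(table, org):
    """Answer a headquarters query from the extracted table, not from raw text."""
    return sorted(loc for o, loc in table if o == org)


# Made-up example sentences standing in for a document collection.
docs = [
    "Analysts visited Microsoft's headquarters in Redmond last week.",
    "Exxon, based in Irving, reported earnings.",
]
table = build_table(docs)
print(headquarters_of(table, "Microsoft"))
```

Once the table exists, a query about a given organization becomes a simple lookup, which is the sense in which the table acts like a materialized view over the text.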


Source: Agichtein, Eugene - Department of Mathematics and Computer Science, Emory University


Collections: Computer Technologies and Information Sciences