QXtract: A Building Block for Efficient Information Extraction from Text Databases
 

Summary: QXtract: A Building Block for
Efficient Information Extraction from Text Databases
Eugene Agichtein Luis Gravano
Columbia University
{eugene, gravano}@cs.columbia.edu
Background: A wealth of information is hidden within unstructured
text. This information is often best utilized in structured or
relational form, which is suited for sophisticated query processing,
for integration with relational databases, and for data mining. For
example, newspaper and e-mail archives contain information that
could be useful to analysts and government agencies. Information
extraction systems produce a structured representation of the
information that is "buried" in text documents. Unfortunately,
processing each document is computationally expensive and is not feasible
for large text databases or for the web. With many databases
exceeding millions of documents in size, processing time is becoming a
bottleneck for exploiting information extraction technology.
[Figure: diagram with components labeled "Text Database" and "Search Engine"]

  

Source: Agichtein, Eugene - Department of Mathematics and Computer Science, Emory University
Gravano, Luis - Department of Computer Science, Columbia University

 

Collections: Computer Technologies and Information Sciences