Towards a Query Optimizer for Text-Centric Tasks
 

Summary:
PANAGIOTIS G. IPEIROTIS
New York University
EUGENE AGICHTEIN
Emory University
and
PRANAY JAIN and LUIS GRAVANO
Columbia University
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for
a variety of tasks. As a notable example, information extraction applications derive structured
relations from unstructured text; as another example, focused crawlers explore the Web to locate
pages about specific topics. Execution plans for text-centric tasks follow two general paradigms for
processing a text database: either we can scan, or "crawl," the text database or, alternatively, we can
exploit search engine indexes and retrieve the documents of interest via carefully crafted queries
constructed in task-specific ways. The choice between crawl- and query-based execution plans can
have a substantial impact on both execution time and output "completeness" (e.g., in terms of
recall). Nevertheless, this choice is typically ad hoc and based on heuristics or plain intuition.
In this article, we present fundamental building blocks to make the choice of execution plans
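
As a rough illustration of the two paradigms described above, the following minimal sketch contrasts a crawl-based plan with a query-based plan over a tiny in-memory collection. Everything in it is an assumption made for illustration and is not taken from the article: the toy documents, the `is_useful` stand-in for a task-specific processor, the function names, and the use of documents processed and recall as simple proxies for execution time and completeness.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy text database: document id -> text.
DOCS: Dict[int, str] = {
    0: "ebola outbreak reported in zaire",
    1: "stock markets rallied on friday",
    2: "cholera outbreak hits coastal town",
    3: "local team wins championship",
    4: "officials confirm a measles epidemic in the region",
}


def is_useful(text: str) -> bool:
    """Stand-in for the task-specific processor (e.g., an extractor)."""
    return "outbreak" in text or "epidemic" in text


def crawl_plan(docs: Dict[int, str]) -> Tuple[Set[int], int]:
    """Scan every document; return (useful ids, documents processed)."""
    found = {doc_id for doc_id, text in docs.items() if is_useful(text)}
    return found, len(docs)


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Build a simple inverted index: word -> ids of documents containing it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for word in set(text.split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def query_plan(
    docs: Dict[int, str], index: Dict[str, Set[int]], queries: List[str]
) -> Tuple[Set[int], int]:
    """Process only documents matching the queries; return (useful ids, documents processed)."""
    retrieved: Set[int] = set()
    for query in queries:
        retrieved |= index.get(query, set())
    found = {doc_id for doc_id in retrieved if is_useful(docs[doc_id])}
    return found, len(retrieved)


if __name__ == "__main__":
    all_useful, crawl_cost = crawl_plan(DOCS)
    index = build_index(DOCS)
    query_useful, query_cost = query_plan(DOCS, index, ["outbreak"])
    recall = len(query_useful) / len(all_useful)
    print(f"crawl: processed {crawl_cost} documents, found {sorted(all_useful)}")
    print(f"query: processed {query_cost} documents, found {sorted(query_useful)}, recall {recall:.2f}")
```

In this toy setting the crawl plan processes all five documents and finds every useful one, while the single-query plan processes only two documents but misses the one that mentions "epidemic". That is the kind of trade-off between execution time and completeness that the abstract refers to.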

  

Source: Agichtein, Eugene - Department of Mathematics and Computer Science, Emory University
Gravano, Luis - Department of Computer Science, Columbia University
Ipeirotis, Panagiotis G. - Department of Information, Operations, and Management Sciences, Leonard N. Stern School of Business, New York University

 

Collections: Computer Technologies and Information Sciences