Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison
 

Summary:
Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu
Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates
Department of Computer Science and Engineering
University of Washington
Seattle, WA 98195-2350
etzioni@cs.washington.edu
Abstract
Our KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an autonomous, domain-independent, and scalable manner. In its first major run, KNOWITALL extracted over 50,000 facts with high precision, but suggested a challenge: How can we improve KNOWITALL's recall and extraction rate without sacrificing precision?
This paper presents three distinct ways to address this challenge and evaluates their performance. Rule Learning learns domain-specific extraction rules. Subclass

  

Source: Anderson, Richard - Department of Computer Science and Engineering, University of Washington at Seattle
Cafarella, Michael J. - Department of Electrical Engineering and Computer Science, University of Michigan
Weld, Daniel S. - Department of Computer Science and Engineering, University of Washington at Seattle
Yates, Alexander - Department of Computer and Information Sciences, Temple University

 

Collections: Computer Technologies and Information Sciences