Automatic Reduction of a Document-Derived Noun Vocabulary
 

Sven Anderson and S. Rebecca Thomas and Camden Segal
Bard College, Annandale-on-Hudson, NY 12504
{sanderso, thomas, cs471}@bard.edu
Yu Wu
Stanford University, Stanford, CA 94305
ywu2@stanford.edu
Abstract
We propose and evaluate five related algorithms that automatically
derive limited-size noun vocabularies from text documents of
2,000-30,000 words. The proposed algorithms combine Personalized
Page Rank and principles of information maximization, and are applied
to the WordNet graph for nouns. For the best-performing algorithm the
difference between automatically generated reduced noun lexicons and
those created by human writers is approximately 1-2 WordNet edges per
lexical item. Our results also indicate the importance of performing
word-sense disambiguation with sentence-level context information at
the earliest stage of analysis.
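
As a rough illustration of one ingredient named in the abstract, the sketch below runs personalized PageRank by power iteration over a tiny hand-made noun graph. It is not the authors' algorithm: the graph, the function names (build_graph, personalized_pagerank, top_k), the damping value, and the top-k selection rule are all assumptions made for illustration. The information-maximization component is omitted, and no real WordNet data is used.

```python
# Toy undirected hypernym graph (hypothetical; NOT real WordNet data).
EDGES = [
    ("dog", "canine"), ("wolf", "canine"), ("canine", "carnivore"),
    ("cat", "feline"), ("feline", "carnivore"), ("carnivore", "animal"),
    ("oak", "tree"), ("tree", "plant"),
]


def build_graph(edges):
    """Return an adjacency map {node: set(neighbours)} for undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power iteration whose restart distribution is concentrated on `seeds`.

    `seeds` maps nouns to non-negative weights (here: document counts).
    Assumes at least one seed is in the graph and every node has a neighbour.
    """
    total = sum(w for n, w in seeds.items() if n in graph)
    restart = {n: seeds.get(n, 0.0) / total for n in graph}
    rank = dict(restart)
    for _ in range(iterations):
        # Restart mass flows back to the seed nouns on every step.
        nxt = {n: (1.0 - damping) * restart[n] for n in graph}
        # The remaining mass is split evenly among each node's neighbours.
        for node, neighbours in graph.items():
            share = damping * rank[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        rank = nxt
    return rank


def top_k(rank, k):
    """Pick the k highest-ranked nodes (an assumed, simplistic selection rule)."""
    return sorted(rank, key=lambda n: (-rank[n], n))[:k]


graph = build_graph(EDGES)
doc_counts = {"dog": 3, "wolf": 1, "cat": 2}  # hypothetical noun counts
rank = personalized_pagerank(graph, doc_counts)
vocabulary = top_k(rank, 3)
```

Because the restart distribution is built from the document's own nouns, rank mass stays near those nouns and their shared ancestors. Nodes in a component that contains no seed (here "oak", "tree", "plant") receive no mass at all.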

  

Source: Anderson, Sven - Computer Science Program, Bard College

 

Collections: Computer Technologies and Information Sciences