Summary: Named Entity Recognition in Tweets:
An Experimental Study
Alan Ritter, Sam Clark, Mausam and Oren Etzioni
Computer Science and Engineering
University of Washington
Seattle, WA 98125, USA
{aritter,ssclark,mausam,etzioni}@cs.washington.edu
Abstract
People tweet more than 100 million times
daily, yielding a noisy, informal, but sometimes
informative corpus of 140-character
messages that mirrors the zeitgeist in an
unprecedented manner. The performance of
standard NLP tools is severely degraded on
tweets. This paper addresses this issue by
re-building the NLP pipeline beginning with
part-of-speech tagging, through chunking, to
named-entity recognition. Our novel T-NER
system doubles F1 score compared with the
Stanford NER system. T-NER leverages the