Title: Leveraging Paraphrase Labels to Extract Synonyms from Twitter

We present an approach for automatically learning synonyms from a paraphrase corpus of tweets. Substituting the extracted synonyms into the training set improves performance on the task of paraphrase detection. The synonyms are learned by using chunks from a shallow parse to create candidate synonyms and their context windows. They are then incorporated into a paraphrase detection system that uses machine translation metrics as features for a classifier. We demonstrate a 2.29% improvement in F1 when we train and test on the paraphrase training set, with better coverage than previous systems, which shows the potential power of synonyms that are representative of a specific topic.
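The abstract does not spell out the extraction procedure, so the following is only a rough sketch of the general idea of pairing candidate synonyms through shared context windows. It assumes whitespace tokens in place of shallow-parse chunks and a one-word window on each side. The names `candidate_synonyms` and `contexts` are hypothetical, not taken from the paper.

```python
from collections import Counter


def contexts(tokens, window):
    """Map each (left context, right context) window to the token it surrounds."""
    padded = ["<s>"] * window + tokens + ["</s>"] * window
    found = {}
    for i in range(window, len(padded) - window):
        ctx = (tuple(padded[i - window:i]), tuple(padded[i + 1:i + 1 + window]))
        found[ctx] = padded[i]  # a repeated context keeps its last token (sketch only)
    return found


def candidate_synonyms(pairs, window=1):
    """Count token pairs that differ but sit in identical context windows.

    `pairs` holds (sentence_a, sentence_b, is_paraphrase) triples. Only pairs
    labeled as paraphrases contribute candidates. Assumption: plain tokens
    stand in for the shallow-parse chunks mentioned in the abstract.
    """
    counts = Counter()
    for sent_a, sent_b, is_paraphrase in pairs:
        if not is_paraphrase:
            continue
        ctx_a = contexts(sent_a.lower().split(), window)
        ctx_b = contexts(sent_b.lower().split(), window)
        for ctx, word_a in ctx_a.items():
            word_b = ctx_b.get(ctx)
            if word_b is not None and word_b != word_a:
                counts[tuple(sorted((word_a, word_b)))] += 1
    return counts


pairs = [
    ("the movie was great", "the film was great", True),
    ("i love this movie", "i hate this film", False),
]
print(candidate_synonyms(pairs))
```

In this toy input only the pair labeled as a paraphrase is used, so "love" and "hate" are never proposed even though they share a context window. That is the role the paraphrase labels play in this sketch.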
Authors:
; ;
Publication Date:
OSTI Identifier:
1222097
Report Number(s):
PNNL-SA-106823
400470000
DOE Contract Number:
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: Proceedings of the 28th International Florida Artificial Intelligence Research Society Conference (FLAIRS-28), May 18-20, 2015, Hollywood, Florida, 3-7
Publisher:
AAAI Press, Palo Alto, CA, United States (US).
Research Org:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English