OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Leveraging Paraphrase Labels to Extract Synonyms from Twitter

Abstract

We present an approach for automatically learning synonyms from a paraphrase corpus of tweets. This work shows improvement on the task of paraphrase detection when we substitute our extracted synonyms into the training set. The synonyms are learned by using chunks from a shallow parse to create candidate synonyms and their context windows, and the synonyms are incorporated into a paraphrase detection system that uses machine translation metrics as features for a classifier. We demonstrate a 2.29% improvement in F1 when we train and test on the paraphrase training set, providing better coverage than previous systems, which shows the potential power of synonyms that are representative of a specific topic.
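The abstract does not spell out how candidate synonyms are paired, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each tweet has already been split into chunks by a shallow parser, and it assumes (hypothetically) that two differing chunks count as candidate synonyms when their context windows, here taken to be the immediately neighbouring chunks, are identical in both tweets of a paraphrase pair. The names `context` and `candidate_synonyms` and the example tweets are illustrative.

```python
from typing import List, Set, Tuple

Chunk = Tuple[str, ...]  # a chunk is a tuple of tokens (assumed pre-chunked input)


def context(chunks: List[Chunk], i: int) -> Tuple[Chunk, Chunk]:
    """Return the (left, right) neighbouring chunks of chunks[i].

    Assumption: the context window is one chunk on each side;
    an empty tuple stands in for the tweet boundary.
    """
    left = chunks[i - 1] if i > 0 else ()
    right = chunks[i + 1] if i + 1 < len(chunks) else ()
    return left, right


def candidate_synonyms(a: List[Chunk], b: List[Chunk]) -> Set[Tuple[str, str]]:
    """Pair differing chunks of two paraphrased tweets that share a context window."""
    pairs: Set[Tuple[str, str]] = set()
    for i, chunk_a in enumerate(a):
        for j, chunk_b in enumerate(b):
            if chunk_a != chunk_b and context(a, i) == context(b, j):
                pairs.add((" ".join(chunk_a), " ".join(chunk_b)))
    return pairs


# Hypothetical paraphrase pair, already chunked.
tweet_a = [("the", "game"), ("was",), ("awesome",), ("tonight",)]
tweet_b = [("the", "game"), ("was",), ("really", "great"), ("tonight",)]
pairs = candidate_synonyms(tweet_a, tweet_b)
print(pairs)
```

In this toy pair only "awesome" and "really great" sit between the same neighbours, so they are the single candidate returned. A real system would also need to filter and score such candidates, which this sketch leaves out.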

Authors:
Antoniak, Maria A.; Bell, Eric B.; Xia, Fei
Publication Date:
May 2015
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1222097
Report Number(s):
PNNL-SA-106823
400470000
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: Proceedings of the 28th International Florida Artificial Intelligence Research Society Conference (FLAIRS-28), May 18-20, 2015, Hollywood, Florida, pp. 3-7
Country of Publication:
United States
Language:
English

Citation Formats

Antoniak, Maria A., Bell, Eric B., and Xia, Fei. Leveraging Paraphrase Labels to Extract Synonyms from Twitter. United States: N. p., 2015. Web.
Antoniak, Maria A., Bell, Eric B., & Xia, Fei. Leveraging Paraphrase Labels to Extract Synonyms from Twitter. United States.
Antoniak, Maria A., Bell, Eric B., and Xia, Fei. 2015. "Leveraging Paraphrase Labels to Extract Synonyms from Twitter". United States.
@article{osti_1222097,
title = {Leveraging Paraphrase Labels to Extract Synonyms from Twitter},
author = {Antoniak, Maria A. and Bell, Eric B. and Xia, Fei},
abstractNote = {We present an approach for automatically learning synonyms from a paraphrase corpus of tweets. This work shows improvement on the task of paraphrase detection when we substitute our extracted synonyms into the training set. The synonyms are learned by using chunks from a shallow parse to create candidate synonyms and their context windows, and the synonyms are incorporated into a paraphrase detection system that uses machine translation metrics as features for a classifier. We demonstrate a 2.29% improvement in F1 when we train and test on the paraphrase training set, providing better coverage than previous systems, which shows the potential power of synonyms that are representative of a specific topic.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2015},
month = {5}
}

