RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian
Conference · OSTI ID: 1529951
- University of Massachusetts at Lowell
- BATTELLE (PACIFIC NW LAB)
- Dartmouth College
This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages. RuSentiment is currently the largest in its class for Russian, with 30,521 posts annotated with Fleiss' kappa of 0.58 (3 annotations per post). To diversify the dataset, 6,749 posts were pre-selected with an active learning-style strategy. We report baseline classification results and release the best-performing embeddings, trained on 3.2B tokens of Russian VKontakte posts.
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 1529951
- Report Number(s): PNNL-SA-134041
- Resource Relation: Conference: Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), August 2018, Santa Fe, NM
- Country of Publication: United States
- Language: English
Similar Records
Multilingual Connotation Frames: A Case Study on Social Media for Targeted Sentiment Analysis and Forecast
Conference · Sun Jul 30 2017 · OSTI ID:1529951
Using Social Media to Measure Student Wellbeing: A Large-Scale Study of Emotional Response in Academic Discourse
Conference · Tue Nov 15 2016 · OSTI ID:1529951
Studying Military Community Health, Well-being, and Discourse through the Social Media Lens
Book · Fri Sep 15 2017 · OSTI ID:1529951