Creating a Human Annotated Emotion Corpus for the Detection of Actor-related Emotions
DOI:
https://doi.org/10.31400/dh-hun.2022.6.4576
Keywords:
sentiment detection, emotion detection, text classification, BERT, supervised model, human annotation
Abstract
We present an ongoing research project whose goal is to create a language model capable of classifying sentiments and specific emotions related to actors (e.g., institutions, persons). The model's training data is a human-annotated text corpus of ten thousand articles from online newspapers, compiled using statistical sampling methods. The project employs a two-phase annotation design. First, we annotate named entities and common names that function as actors. Second, we annotate the sentiments and specific emotions found in the context of the previously marked actors. Such a database of annotated texts is well suited as input for supervised classification models. In this article, we describe the project's corpus, the characteristics of supervised and unsupervised text classification procedures, and possible methods for sentiment and emotion detection. We then present the two-phase annotation methodology used in our research, the problems and challenges that arose during its development, and the research decisions we made so that the resulting model can serve as a capable research tool in the social sciences.
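The two-phase annotation design described in the abstract could be represented roughly as follows. This is a minimal illustrative sketch, not the project's actual annotation schema: the class names (`ActorSpan`, `EmotionLabel`, `AnnotatedArticle`), the label values, and the example sentence are all assumptions made for illustration. The only property it tries to capture is that a phase-two sentiment or emotion label must attach to an actor that was already marked in phase one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActorSpan:
    """Phase 1: a marked actor mention (hypothetical structure)."""
    start: int  # character offset where the mention begins
    end: int    # character offset one past the last character
    text: str   # the mention as it appears in the article
    kind: str   # assumed label set: "named_entity" or "common_name"


@dataclass
class EmotionLabel:
    """Phase 2: a sentiment/emotion attached to a phase-1 actor."""
    actor: ActorSpan
    sentiment: str                 # assumed values: "positive", "negative", "neutral"
    emotion: Optional[str] = None  # assumed example value: "anger"


@dataclass
class AnnotatedArticle:
    text: str
    actors: List[ActorSpan] = field(default_factory=list)
    labels: List[EmotionLabel] = field(default_factory=list)

    def mark_actor(self, start: int, end: int, kind: str) -> ActorSpan:
        """Phase 1: record an actor mention by character offsets."""
        span = ActorSpan(start, end, self.text[start:end], kind)
        self.actors.append(span)
        return span

    def label_actor(self, actor: ActorSpan, sentiment: str,
                    emotion: Optional[str] = None) -> EmotionLabel:
        """Phase 2: label an actor; rejects actors not marked in phase 1."""
        if actor not in self.actors:
            raise ValueError("phase 2 can only label actors marked in phase 1")
        label = EmotionLabel(actor, sentiment, emotion)
        self.labels.append(label)
        return label


# Hypothetical usage with an invented sentence.
article = AnnotatedArticle("The ministry rejected the proposal.")
actor = article.mark_actor(4, 12, "common_name")
label = article.label_actor(actor, "negative", "anger")
```

Keeping the phase-two label tied to a phase-one span, rather than to the article as a whole, is what makes the resulting data actor-related: the same article can carry different emotions for different actors.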
License
Copyright (c) 2022 the author(s)
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.