Humán annotált emóciókorpusz létrehozása aktorokhoz köthető érzelmek detektálására

Árpád Knap; Tímea Emese Tóth; Zsófia Rakovics

doi:10.31400/dh-hun.2022.6.4576

Authors

Árpád Knap, ELTE TÁTK, https://orcid.org/0000-0002-4290-6025
Tímea Emese Tóth, ELTE TÁTK, https://orcid.org/0000-0002-3584-118X
Zsófia Rakovics, ELTE TÁTK, https://orcid.org/0000-0002-9903-9348

DOI:

https://doi.org/10.31400/dh-hun.2022.6.4576

Keywords:

sentiment detection, emotion detection, text classification, BERT, supervised model, human annotation

Abstract

This study presents an ongoing research project whose goal is to create a language model capable of classifying sentiments and specific emotions related to actors (e.g., institutions, persons). The model's training data is a human-annotated text corpus of ten thousand articles from online newspapers, compiled using statistical sampling methods. The project employs a two-phase annotation design. In the first phase, we annotate the named entities and common nouns that function as actors. In the second phase, we annotate the sentiments and specific emotions found in the context of the previously marked actors. Such a database of annotated texts can serve as input for training supervised classification models. In this article, we describe the project's corpus, the characteristics of supervised and unsupervised text classification procedures, and possible methods for sentiment and emotion detection. We then present the two-phase annotation methodology used in our research, the problems and challenges that arose during its development, and the research decisions we made in order to create a model that can serve as a capable research tool in the social sciences.

Creating a Human Annotated Emotion Corpus for the Detection of Actor-related Emotions