DepressionEmo: A novel dataset for multilabel classification of depression emotions

Emotions are integral to human social interactions, with diverse responses elicited by various situational contexts. Particularly, the prevalence of negative emotional states has been correlated with negative outcomes for mental health, necessitating a comprehensive analysis of their occurrence and...

Bibliographic Details
Main Authors: Rahman, Abu Bakar Siddiqur; Tạ, Hoàng Thắng; Najjar, Lotfollah; Azadmanesh, Azad; Gönul, Ali Saffet
Format: Journal article
Language: English
Published: 2024
Subjects: Dataset; Depression identification; Emotion analysis; Psycholinguistic analysis; Text classification
Online Access: https://scholar.dlu.edu.vn/handle/123456789/3552
Holding Library: Dalat University Library (Thư viện Trường Đại học Đà Lạt)
id oai:scholar.dlu.edu.vn:123456789-3552
record_format dspace
spelling oai:scholar.dlu.edu.vn:123456789-3552
2024-09-06T08:19:07Z
DepressionEmo: A novel dataset for multilabel classification of depression emotions
Rahman, Abu Bakar Siddiqur
Tạ, Hoàng Thắng
Najjar, Lotfollah
Azadmanesh, Azad
Gönul, Ali Saffet
Dataset; Depression identification; Emotion analysis; Psycholinguistic analysis; Text classification
Emotions are integral to human social interactions, with diverse responses elicited by various situational contexts. In particular, the prevalence of negative emotional states has been correlated with negative outcomes for mental health, necessitating a comprehensive analysis of their occurrence and impact on individuals. In this paper, we introduce a novel dataset named DepressionEmo, designed to detect 8 emotions associated with depression and comprising 6037 examples of long Reddit user posts. The dataset was created through a majority vote over zero-shot classifications from pre-trained models, and its quality was validated by annotators and ChatGPT, with an acceptable level of inter-rater reliability between annotators. We analyze the correlations between emotions and conduct a linguistic analysis of DepressionEmo. We also provide several text classification methods in two groups: machine learning methods such as SVM, XGBoost, and LightGBM; and deep learning methods such as BERT, BART, GAN-BERT, and T5. Although it achieves the same F1 Macro score of 0.76 as BART, the pretrained BERT model, bert-base-uncased, stands out as the most efficient model in our experiments because it has fewer parameters. Across all emotions, the highest F1 Macro value is achieved by suicide intent, which suggests that our dataset has value for identifying emotions in individuals with depression symptoms through text analysis. The curated dataset is publicly available at: https://github.com/abuBakarSiddiqurRahman/DepressionEmo.
Faculty of Information Technology (Khoa Công nghệ Thông tin)
5
Tạ Hoàng Thắng
JOURNAL OF AFFECTIVE DISORDERS, row 1057, page 30/57
2024-09-04T01:46:48Z
2024-09-04T01:46:48Z
2024-08-28
Journal article
Article published in an ISI-indexed journal, including book chapters
https://scholar.dlu.edu.vn/handle/123456789/3552
10.1016/j.jad.2024.08.013
en
Journal of affective disorders
institution Dalat University Library (Thư viện Trường Đại học Đà Lạt)
collection Digital Library
language English
topic Dataset; Depression identification; Emotion analysis; Psycholinguistic analysis; Text classification
format Journal article
author Rahman, Abu Bakar Siddiqur
Tạ, Hoàng Thắng
Najjar, Lotfollah
Azadmanesh, Azad
Gönul, Ali Saffet
author_sort Rahman, Abu Bakar Siddiqur
title DepressionEmo: A novel dataset for multilabel classification of depression emotions
publishDate 2024
url https://scholar.dlu.edu.vn/handle/123456789/3552
_version_ 1813142630541819904
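
The abstract in this record mentions two mechanisms that a short example may make more concrete: labels obtained by a majority vote over several classifiers' outputs, and evaluation with the F1 Macro score. The sketch below is illustrative only and is not the authors' code. The function names `majority_vote` and `macro_f1`, the strict-majority threshold, the convention that an undefined F1 counts as 0, and the label names other than "suicide intent" are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Keep each label chosen by a strict majority of the classifiers.

    `predictions` holds one collection of labels per classifier, all for
    the same post. The strict-majority threshold is an assumption.
    """
    threshold = len(predictions) // 2 + 1
    # set() ensures a classifier cannot vote twice for the same label.
    counts = Counter(label for labels in predictions for label in set(labels))
    return sorted(label for label, n in counts.items() if n >= threshold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 over multilabel examples.

    `gold` and `pred` are parallel lists of label sets. A label with no
    gold and no predicted occurrences scores 0 here (an assumption).
    """
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Three hypothetical zero-shot classifiers labeling one post.
votes = [
    ["sadness", "suicide intent"],
    ["sadness"],
    ["sadness", "hopelessness", "suicide intent"],
]
voted_labels = majority_vote(votes)

# Two hypothetical posts scored on two labels.
gold = [{"sadness"}, {"suicide intent"}]
pred = [{"sadness"}, {"sadness"}]
score = macro_f1(gold, pred, ["sadness", "suicide intent"])
```

Because F1 Macro weights every label equally, a rare emotion counts as much as a frequent one, which is one common reason to report it for imbalanced multilabel data.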