Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic

In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply Latent Dirichlet allocation (LDA), coherence measures, and clustering algorithms to unsupervisedly explore latent topics from the dataset of about 3400 quotations to se...

Bibliographic Details
Main author: Tạ, Hoàng Thắng
Format: Conference paper
Language: English
Published: 2023
Online access: http://scholar.dlu.edu.vn/handle/123456789/2006
Holding library: Thư viện Trường Đại học Đà Lạt (Da Lat University Library)
id oai:scholar.dlu.edu.vn:123456789-2006
record_format dspace
spelling oai:scholar.dlu.edu.vn:123456789-2006 2023-05-08T04:18:34Z Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic Tạ, Hoàng Thắng In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply Latent Dirichlet allocation (LDA), coherence measures, and clustering algorithms to unsupervisedly explore latent topics from the dataset of about 3400 quotations to see how coronavirus impacts human beings. By combining noun phrases as inputs before the training and Cv measure for coherence values, we obtain an average coherence value of 0.66 with a least average number of topics of 24.8. The result covers some of the top issues that our world has been facing against the COVID-19 pandemic. 2023-04-20T04:52:47Z 2023-04-20T04:52:47Z 2020-10 Conference paper Paper published in international conference proceedings (with ISBN) http://scholar.dlu.edu.vn/handle/123456789/2006 en
institution Thư viện Trường Đại học Đà Lạt
collection Digital Library
language English
description In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply latent Dirichlet allocation (LDA), coherence measures, and clustering algorithms to explore, without supervision, the latent topics in a dataset of about 3400 quotations and to see how the coronavirus affects people. By combining noun-phrase inputs with the Cv coherence measure, we obtain an average coherence value of 0.66 and a minimum average number of topics of 24.8. The results cover some of the top issues that the world has faced during the COVID-19 pandemic.
format Conference paper
author Tạ, Hoàng Thắng
title Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic
publishDate 2023
url http://scholar.dlu.edu.vn/handle/123456789/2006
_version_ 1768306336541442048