Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic
In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply Latent Dirichlet Allocation (LDA), coherence measures, and clustering algorithms to explore, without supervision, the latent topics in a dataset of about 3400 quotations to se...
Main author: | Tạ, Hoàng Thắng |
---|---|
Format: | Conference paper |
Language: | English |
Published: | 2023 |
Online access: | http://scholar.dlu.edu.vn/handle/123456789/2006 |
Holding library: | Dalat University Library |
---|
id |
oai:scholar.dlu.edu.vn:123456789-2006 |
---|---|
record_format |
dspace |
spelling |
oai:scholar.dlu.edu.vn:123456789-2006 2023-05-08T04:18:34Z Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic Tạ, Hoàng Thắng In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply Latent Dirichlet Allocation (LDA), coherence measures, and clustering algorithms to explore, without supervision, the latent topics in a dataset of about 3400 quotations and to see how the coronavirus affects people. Using noun phrases as training inputs and the Cv measure to score coherence, we obtain an average coherence value of 0.66 with the lowest average number of topics, 24.8. The results cover some of the major issues the world has faced during the COVID-19 pandemic. 2023-04-20T04:52:47Z 2023-04-20T04:52:47Z 2020-10 Conference paper Paper published in international conference proceedings (with ISBN) http://scholar.dlu.edu.vn/handle/123456789/2006 en |
institution |
Dalat University Library |
collection |
Digital Library |
language |
English |
description |
In this paper, we extract quotations from Al Jazeera’s news articles containing keywords related to the COVID-19 pandemic. We apply Latent Dirichlet Allocation (LDA), coherence measures, and clustering algorithms to explore, without supervision, the latent topics in a dataset of about 3400 quotations and to see how the coronavirus affects people. Using noun phrases as training inputs and the Cv measure to score coherence, we obtain an average coherence value of 0.66 with the lowest average number of topics, 24.8. The results cover some of the major issues the world has faced during the COVID-19 pandemic. |
format |
Conference paper |
author |
Tạ, Hoàng Thắng |
spellingShingle |
Tạ, Hoàng Thắng Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
author_facet |
Tạ, Hoàng Thắng |
author_sort |
Tạ, Hoàng Thắng |
title |
Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
title_short |
Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
title_full |
Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
title_fullStr |
Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
title_full_unstemmed |
Mining Hidden Topics from Newspaper Quotations: The COVID-19 Pandemic |
title_sort |
mining hidden topics from newspaper quotations: the covid-19 pandemic |
publishDate |
2023 |
url |
http://scholar.dlu.edu.vn/handle/123456789/2006 |
_version_ |
1768306336541442048 |
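
The abstract of this record describes a first step of extracting quotations from news articles that contain pandemic-related keywords. The sketch below illustrates one simple way such a step could look; it is not the author's method. The keyword list, the quote-matching regular expression, the minimum-length filter, and all function names are assumptions made for illustration, since the record does not state how the extraction was done.

```python
import re

# Hypothetical keyword list; the record does not say which keywords were used.
KEYWORDS = ("covid-19", "coronavirus", "pandemic")

# Matches text enclosed in straight or curly double quotes (an assumed heuristic).
QUOTE_RE = re.compile(r'["“]([^"“”]+)["”]')


def mentions_keyword(article, keywords=KEYWORDS):
    """Return True if the article text contains any keyword (case-insensitive)."""
    text = article.lower()
    return any(keyword in text for keyword in keywords)


def extract_quotations(article, min_words=3):
    """Return quoted spans with at least min_words words, dropping short scare quotes."""
    quotes = (match.strip() for match in QUOTE_RE.findall(article))
    return [quote for quote in quotes if len(quote.split()) >= min_words]


def collect_quotations(articles):
    """Gather quotations from the articles that mention at least one keyword."""
    collected = []
    for article in articles:
        if mentions_keyword(article):
            collected.extend(extract_quotations(article))
    return collected


if __name__ == "__main__":
    sample_articles = [
        'The minister said, "We will reopen schools next month," as coronavirus cases fell.',
        'The striker said, "It was a great match for us."',
    ]
    for quotation in collect_quotations(sample_articles):
        print(quotation)
```

The quotations collected this way would then serve as the documents for topic modeling. A substring keyword match and a regex over quote marks are deliberately crude; a real pipeline would likely need to handle nested quotes, single quotes, and reported speech.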