Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets
Closed high utility itemsets (CHUIs) and maximal high utility itemsets (MaxHUIs) are two important concise representations of HUIs. Discovering these itemsets is important because they are lossless and compact, i.e., they provide a concise summary of all HUIs that can be orders of magnitude smaller....
Đã lưu trong:
Những tác giả chính: | , , |
---|---|
Định dạng: | Journal article |
Ngôn ngữ: | Vietnamese |
Được phát hành: |
2024
|
Những chủ đề: | |
Truy cập trực tuyến: | https://doi.org/10.1016/j.knosys.2022.109921 https://www.sciencedirect.com/science/article/abs/pii/S0950705122010140 |
Các nhãn: |
Thêm thẻ
Không có thẻ, Là người đầu tiên thẻ bản ghi này!
|
Thư viện lưu trữ: | Thư viện Trường Đại học Đà Lạt |
---|
id |
oai:scholar.dlu.edu.vn:123456789-3524 |
---|---|
record_format |
dspace |
spelling |
oai:scholar.dlu.edu.vn:123456789-35242024-07-22T07:41:59Z Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets Dương, Văn Hải Hoàng, Minh Tiến Trần, Thống Closed high utility itemset,High utility itemset,Maximal high utility itemset,Pruning strategy,Upper bound,Utility mining,Weak upper bound Closed high utility itemsets (CHUIs) and maximal high utility itemsets (MaxHUIs) are two important concise representations of HUIs. Discovering these itemsets is important because they are lossless and compact, i.e., they provide a concise summary of all HUIs that can be orders of magnitude smaller. In addition, it can be more efficient to extract these representations than it would be to extract all HUIs. Mining the concise representations of HUIs is also an important step toward the generation of nonredundant high utility association rules that can reveal meaningful information to decision-makers. However, although several algorithms have been designed to mine these representations, such as EFIM-Closed, HMiner-Closed, and CHUI-Miner(Max), they have long runtimes, high memory usage, and scalability issues, especially for dense and large datasets. To address this issue, this paper proposes two efficient algorithms named C-HUIM and MaxC-HUIM for mining CHUIs, and simultaneously mining both CHUIs and MaxHUIs, respectively. These algorithms use a novel weak upper bound on the utility, which is strictly tighter than traditional upper bounds, and a corresponding pruning strategy called í µí²®í µí²«í µí²²í µí²°ℬ to quickly eliminate low utility itemsets. The algorithms also include two novel search space reduction strategies named í µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ and ℒí µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ. The í µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ strategy only requires checking the inclusion relationship among a small number of itemsets, while ℒí µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ does not perform any inclusion check. In addition, the algorithms adopt a structure named MPUN-list to efficiently store and calculate information about each itemset's utility and support. Experimental results show that the proposed algorithms can be more than 100 times faster, are more memory efficient, and have better scalability than the state-of-the-art algorithms. 2024-07-08T04:33:17Z 2024-07-08T04:33:17Z 2022 Journal article Bài báo đăng trên tạp chí thuộc SCOPUS, bao gồm book chapter https://doi.org/10.1016/j.knosys.2022.109921 https://www.sciencedirect.com/science/article/abs/pii/S0950705122010140 vi Knowledge-Based Systems |
institution |
Thư viện Trường Đại học Đà Lạt |
collection |
Thư viện số |
language |
Vietnamese |
topic |
Closed high utility itemset,High utility itemset,Maximal high utility itemset,Pruning strategy,Upper bound,Utility mining,Weak upper bound |
spellingShingle |
Closed high utility itemset,High utility itemset,Maximal high utility itemset,Pruning strategy,Upper bound,Utility mining,Weak upper bound Dương, Văn Hải Hoàng, Minh Tiến Trần, Thống Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
description |
Closed high utility itemsets (CHUIs) and maximal high utility itemsets (MaxHUIs) are two important concise representations of HUIs. Discovering these itemsets is important because they are lossless and compact, i.e., they provide a concise summary of all HUIs that can be orders of magnitude smaller. In addition, it can be more efficient to extract these representations than it would be to extract all HUIs. Mining the concise representations of HUIs is also an important step toward the generation of nonredundant high utility association rules that can reveal meaningful information to decision-makers. However, although several algorithms have been designed to mine these representations, such as EFIM-Closed, HMiner-Closed, and CHUI-Miner(Max), they have long runtimes, high memory usage, and scalability issues, especially for dense and large datasets. To address this issue, this paper proposes two efficient algorithms named C-HUIM and MaxC-HUIM for mining CHUIs, and simultaneously mining both CHUIs and MaxHUIs, respectively. These algorithms use a novel weak upper bound on the utility, which is strictly tighter than traditional upper bounds, and a corresponding pruning strategy called í µí²®í µí²«í µí²²í µí²°ℬ to quickly eliminate low utility itemsets. The algorithms also include two novel search space reduction strategies named í µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ and ℒí µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ. The í µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ strategy only requires checking the inclusion relationship among a small number of itemsets, while ℒí µí²«í µí²®í µí± í µí± í µí± í µí° ¶í µí°»í µí± ℬ does not perform any inclusion check. In addition, the algorithms adopt a structure named MPUN-list to efficiently store and calculate information about each itemset's utility and support. Experimental results show that the proposed algorithms can be more than 100 times faster, are more memory efficient, and have better scalability than the state-of-the-art algorithms. |
format |
Journal article |
author |
Dương, Văn Hải Hoàng, Minh Tiến Trần, Thống |
author_facet |
Dương, Văn Hải Hoàng, Minh Tiến Trần, Thống |
author_sort |
Dương, Văn Hải |
title |
Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
title_short |
Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
title_full |
Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
title_fullStr |
Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
title_full_unstemmed |
Efficient Algorithms for Mining Closed and Maximal High Utility Itemsets |
title_sort |
efficient algorithms for mining closed and maximal high utility itemsets |
publishDate |
2024 |
url |
https://doi.org/10.1016/j.knosys.2022.109921 https://www.sciencedirect.com/science/article/abs/pii/S0950705122010140 |
_version_ |
1813142620677865472 |