Parallel algorithms of random forests for classifying very large datasets

Bibliographic details
Main authors: Do, Thanh Nghi; Pham, Nguyen Khang
Format: Article
Language: English
Published: Trường Đại học Đà Lạt 2014
Online access: https://scholar.dlu.edu.vn/thuvienso/handle/DLU123456789/37523
Holding library: Thư viện Trường Đại học Đà Lạt
Summary: The random forests algorithm proposed by Breiman is an ensemble-based approach with very high accuracy. Training a set of decision trees and classifying with them takes a long time, which makes the algorithm intractable for very large datasets. There is a need to scale up the random forests algorithm to handle massive datasets. We propose parallel random forests algorithms that take advantage of Grid computing. These algorithms reduce training and classification time compared with the original ones. Experimental results on large datasets from the UCI data repository, including Forest cover type, KDD Cup 1999, and Connect-4, show that the training and classification times of the parallel algorithms are significantly reduced.
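
Illustrative sketch (not taken from the article): the summary does not say how the work is distributed, but the trees of a random forest are trained independently on bootstrap samples, so one natural way to parallelize is to hand each tree to a separate worker. The Python sketch below shows that decomposition under loud assumptions: one-level decision stumps stand in for full trees, a local thread pool stands in for the Grid, and all function names (`train_stump`, `train_forest`, `predict_forest`) are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_stump(task):
    """Train one decision stump (a stand-in for a full tree) on a bootstrap sample."""
    X, y, seed = task
    rng = random.Random(seed)
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample n rows with replacement
    feature = rng.randrange(len(X[0]))           # pick one random feature for this stump
    best = None
    for i in idx:                                # try every sampled value as a threshold
        thr = X[i][feature]
        left = [y[j] for j in idx if X[j][feature] <= thr]
        right = [y[j] for j in idx if X[j][feature] > thr]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(v != left_label for v in left) + sum(v != right_label for v in right)
        if best is None or errors < best[0]:
            best = (errors, feature, thr, left_label, right_label)
    return best[1:]                              # (feature, threshold, left_label, right_label)


def predict_stump(stump, x):
    feature, thr, left_label, right_label = stump
    return left_label if x[feature] <= thr else right_label


def train_forest(X, y, n_trees=25, workers=4, seed=0):
    """Train n_trees stumps concurrently; each task is independent of the others."""
    tasks = [(X, y, seed + t) for t in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_stump, tasks))


def predict_forest(forest, x):
    """Classify x by majority vote over all stumps."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]
```

Because the tasks share no state, the same structure carries over to other executors. In CPython, threads will not speed up this CPU-bound loop, so `concurrent.futures.ProcessPoolExecutor`, which offers the same `map` interface, would be the more realistic local substitute. Classification could be split across workers in the same way, one batch of examples per task.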