Multi-Scale Convolutions Meet Group Attention for Dense Prediction Tasks

Lecture Notes in Networks and Systems (LNNS, volume 882); The 13th Conference on Information Technology and Its Applications (CITA 2024); pp. 360-371.

Bibliographic Details
Main Authors: Vo, Xuan Thuy, Nguyen, Duy Linh, Priadana, Adri, Choi, Jehwan, Hyun Jo, Kang
Format: Article
Language: English
Published: Springer Nature 2024
Subjects:
Online Access: https://elib.vku.udn.vn/handle/123456789/4293
https://doi.org/10.1007/978-3-031-74127-2_30
Repository library: Trường Đại học Công nghệ Thông tin và Truyền thông Việt Hàn - Đại học Đà Nẵng
id oai:elib.vku.udn.vn:123456789-4293
record_format dspace
spelling oai:elib.vku.udn.vn:123456789-4293 2024-12-06T08:47:55Z
Multi-Scale Convolutions Meet Group Attention for Dense Prediction Tasks
Vo, Xuan Thuy; Nguyen, Duy Linh; Priadana, Adri; Choi, Jehwan; Hyun Jo, Kang
Lecture Notes in Networks and Systems (LNNS, volume 882); The 13th Conference on Information Technology and Its Applications (CITA 2024); pp. 360-371.
Self-attention can capture long-range dependencies from input sequences without inductive biases, resulting in quadratic complexity. When transferring Vision Transformers to dense prediction tasks, the models suffer huge computational costs. Recent methods have drawn sparse attention to approximate attention regions and injected convolution into self-attention layers. Motivated by this line of research, this paper introduces group attention that has linear complexity with input resolution while modeling global context features. Group attention shares information across channels, and convolution is spatial sharing. Both operations are complementary, and multi-scale convolution can capture multiple views of the input. Merging multi-scale convolution into group attention layers can help improve feature representation and modeling abilities. To verify the effectiveness of the proposed method, extensive experiments are conducted on benchmark datasets for various vision tasks. On ImageNet-1K image classification, the proposed method achieves 77.6% Top-1 accuracy at 0.7 GFLOPs, surpassing other methods under similar computational costs. When the model pre-trained on ImageNet-1K is transferred to dense prediction tasks, the proposed method attains consistent improvements across visual tasks.
2024-12-06T08:46:22Z 2024-12-06T08:46:22Z 2024-11 Working Paper 978-3-031-74126-5 https://elib.vku.udn.vn/handle/123456789/4293 https://doi.org/10.1007/978-3-031-74127-2_30 en application/pdf Springer Nature
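The abstract's complexity claim can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the record does not describe the layer design, so the code assumes that "group attention" means attention computed across channels inside channel groups. The function names, the `groups` parameter, and the scaling factors are hypothetical choices made for illustration. Under that assumption the attention maps have a size that does not depend on the number of tokens N, so the cost grows linearly with N, whereas standard token attention builds an N x N map.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def token_attention(q, k, v):
    """Standard self-attention over N tokens; q, k, v have shape (N, C)."""
    n, c = q.shape
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (N, N) attention map
    return attn @ v  # (N, C)


def channel_group_attention(q, k, v, groups):
    """Assumed form of group attention: channels attend to channels
    inside each of `groups` channel groups (hypothetical sketch)."""
    n, c = q.shape
    assert c % groups == 0, "channels must be divisible by groups"
    d = c // groups
    # Rearrange (N, C) into (groups, d, N): one block of d channels per group.
    qg = q.T.reshape(groups, d, n)
    kg = k.T.reshape(groups, d, n)
    vg = v.T.reshape(groups, d, n)
    # Each group builds a (d, d) map; its size is independent of N.
    # Scaling by sqrt(N) is an illustrative choice, not taken from the paper.
    attn = softmax(qg @ kg.transpose(0, 2, 1) / np.sqrt(n), axis=-1)
    out = attn @ vg  # (groups, d, N)
    return out.reshape(c, n).T  # back to (N, C)


def macs_token(n, c):
    # Q K^T and attn V each cost N * N * C multiply-accumulates.
    return 2 * n * n * c


def macs_channel_group(n, c, groups):
    # Both matrix products cost N * C * (C / groups) multiply-accumulates.
    return 2 * n * c * (c // groups)
```

Doubling the number of tokens quadruples `macs_token` but only doubles `macs_channel_group`, which matches the linear-versus-quadratic contrast stated in the abstract. The multi-scale convolution branch mentioned there is not modeled in this sketch.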
institution Trường Đại học Công nghệ Thông tin và Truyền thông Việt Hàn - Đại học Đà Nẵng
collection DSpace
language English
topic Multi-Scale Convolutions Meet Group Attention for Dense Prediction Tasks
On ImageNet-1K image classification, the proposed method achieves 77.6% Top-1 accuracy at 0.7 GFLOPs, surpassing other methods under similar computational costs
description Lecture Notes in Networks and Systems (LNNS, volume 882); The 13th Conference on Information Technology and Its Applications (CITA 2024); pp. 360-371.
format Working Paper
author Vo, Xuan Thuy
Nguyen, Duy Linh
Priadana, Adri
Choi, Jehwan
Hyun Jo, Kang
author_sort Vo, Xuan Thuy
title Multi-Scale Convolutions Meet Group Attention for Dense Prediction Tasks
publisher Springer Nature
publishDate 2024
url https://elib.vku.udn.vn/handle/123456789/4293
https://doi.org/10.1007/978-3-031-74127-2_30