Multi-layer dynamic and asymmetric convolutions

LUO Chunjie (罗纯杰); ZHAN Jianfeng

Article abstract

LUO Chunjie (罗纯杰), ZHAN Jianfeng. Multi-layer dynamic and asymmetric convolutions [J]. 高技术通讯(英文) (High Technology Letters), 2022, 28(3): 227-236

DOI: 10.3772/j.issn.1006-6748.2022.03.001

Keywords: neural network, dynamic network, attention, image classification

Author Name	Affiliation
LUO Chunjie (罗纯杰)	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (University of Chinese Academy of Sciences, Beijing 100049, P.R.China)
ZHAN Jianfeng	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (University of Chinese Academy of Sciences, Beijing 100049, P.R.China)

Abstract:

Dynamic networks have become popular as a way to enhance model capacity while maintaining efficient inference, by dynamically generating the weights from over-parameters. However, they introduce many more parameters and make training more difficult. In this paper, a multi-layer dynamic convolution (MDConv) is proposed. It scatters the over-parameters across multiple layers, which yields fewer parameters but stronger model capacity than scattering them horizontally. To facilitate training, it uses the expanding form, in which the attention is applied to the features; to maintain efficient inference, it uses the compact form, in which the attention is applied to the weights. Moreover, a multi-layer asymmetric convolution (MAConv) is proposed, which adds no extra parameters or computation cost at inference time compared with static convolution. Experimental results show that MDConv achieves better accuracy with fewer parameters and significantly facilitates training, and that MAConv improves accuracy without any extra storage or computation cost at inference time compared with static convolution.
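
The relation between the expanding form and the compact form can be illustrated with a small sketch. This is not the authors' MDConv implementation: it assumes a single linear layer in place of a multi-layer convolution, a fixed placeholder attention vector in place of one computed from the input, and hypothetical names (`expanding_form`, `compact_form`, `kernels`, `attention`). It only shows that, for a linear operator, applying the attention to the features and applying it to the weights give the same output.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def expanding_form(kernels, attention, x):
    """Apply every kernel to x, then mix the resulting features with attention."""
    features = [matvec(W, x) for W in kernels]
    return [sum(a * f[i] for a, f in zip(attention, features))
            for i in range(len(features[0]))]


def compact_form(kernels, attention, x):
    """Mix the kernels with attention into one weight, then apply it once."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    fused = [[sum(a * W[r][c] for a, W in zip(attention, kernels))
              for c in range(cols)]
             for r in range(rows)]
    return matvec(fused, x)


random.seed(0)
K, OUT, IN = 4, 3, 5  # hypothetical sizes: number of kernels, output dim, input dim

# Random stand-ins for the over-parameterized kernels (assumption, not from the paper).
kernels = [[[random.uniform(-1, 1) for _ in range(IN)] for _ in range(OUT)]
           for _ in range(K)]
# Placeholder attention; in a dynamic network it would be computed from the input.
attention = [0.1, 0.2, 0.3, 0.4]
x = [random.uniform(-1, 1) for _ in range(IN)]

y_expand = expanding_form(kernels, attention, x)
y_compact = compact_form(kernels, attention, x)

# The two forms should agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(y_expand, y_compact))
print(max_diff)
```

In this toy setting the expanding form performs K matrix-vector products per input, while the compact form fuses the kernels first and performs one, which is the kind of inference saving the abstract attributes to the compact form.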
