Decoupled Convolutions for CNNs
Guotian Xie,1,2∗ Ting Zhang,4 Kuiyuan Yang,3 Jianhuang Lai,1,2 Jingdong Wang4
1School of Data and Computer Science, Sun Yat-Sen University
2Guangdong Province Key Laboratory of Information Security
3DeepMotion, 4Microsoft Research
xieguotian1990@gmail.com, {Ting.Zhang, jingdw}@microsoft.com
kuiyuanyang@deepmotion.ai, stsljh@mail.sysu.edu.cn
∗This work was done when Guotian Xie was an intern at Microsoft Research, Beijing, P.R. China.
Abstract
In this paper, we are interested in designing small CNNs by decoupling the convolution along the spatial and channel domains. Most existing decoupling techniques focus on approximating the filter matrix through decomposition. In contrast, we provide a two-step interpretation of the standard convolution, moving from the filter applied at a single location to all locations, that is exactly equivalent to the standard convolution. Motivated by the observations in this decoupled view, we propose an effective approach that relaxes the sparsity of the filter in spatial aggregation by learning a spatial configuration, and reduces the redundancy by reducing the number of intermediate channels. Our approach achieves classification performance comparable to the standard (uncoupled) convolution, but with a smaller model size, on CIFAR-100, CIFAR-10 and ImageNet.
Introduction
Since AlexNet (Krizhevsky, Sutskever, and Hinton 2012) successfully applied a Convolutional Neural Network (CNN) to ImageNet and won the challenge by a large margin in 2012, CNNs have become the most widely used models for image classification (He et al. 2016), object detection (Ren et al. 2015; Redmon and Farhadi 2016), image segmentation (Long, Shelhamer, and Darrell 2015; Kolesnikov and Lampert 2016), and other tasks. CNNs have become deeper and deeper (Simonyan and Zisserman 2014; Szegedy et al. 2015; He et al. 2015; 2016; Huang et al. 2016), ranging from tens of layers to thousands of layers in pursuit of better performance, and have become wider and wider as well, such as Wide Residual Networks (Zagoruyko and Komodakis 2016).
Another research direction is designing more effective filters. There have been many works on filter design, and most of them can be categorized into two types. One is to decompose the filter matrix into several low-rank matrices (Ioannou et al. 2015; Denton et al. 2014; Zhang et al. 2015; Kim et al. 2015; Tai et al. 2015; Jaderberg, Vedaldi, and Zisserman 2014; Mamalet and Garcia 2012); the other is to view the filter as a sparse matrix, where some works sparsify the
channel extent, e.g., group convolution (Ioannou et al. 2016; Zhang et al. 2017) and channel-wise convolution or separable filters (Chollet 2016), and other works sparsify the spatial extent with smaller filters, e.g., 3 × 3, 1 × 3 and 3 × 1 (Szegedy et al. 2016). In this paper, in contrast to designing the filters, we are interested in decoupling the convolution along the spatial and channel domains, and we propose an effective approach based on this decoupled interpretation.
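To make the two filter-design families concrete, the following PyTorch-style sketch (the framework and all channel counts are our own illustrative assumptions, not taken from the cited works) shows a low-rank spatial factorization, a channel-wise separable filter, and a group convolution.

import torch.nn as nn

# (a) Low-rank decomposition: a 3 x 3 filter approximated by a 1 x 3 followed by a 3 x 1 filter.
low_rank = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0)),
)

# (b) Sparsifying the channel extent: a channel-wise (depthwise) 3 x 3 convolution
# followed by a 1 x 1 convolution, as in separable filters, and a group convolution
# that restricts each filter to a subset of the input channels.
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # channel-wise
    nn.Conv2d(64, 64, kernel_size=1),                        # pointwise
)
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=4)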
We start by analyzing the process of convolution on the input and decompose this process into two steps. First, each location in the input is projected across the channel domain; this projection does not involve the spatial information of the input. Second, we accumulate the projections of the locations across the spatial domain; this step depends only on the spatial relationship. We reformulate these two decoupled steps in a convolution form: first a 1 × 1 across channel-domain convolution, and then an across spatial-domain convolution with a spatial configuration. We denote this process as decoupling spatial convolution.
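As a sanity check of this interpretation, the following minimal sketch (assuming PyTorch; the shapes and variable names are our own) numerically verifies that a standard k × k convolution equals a 1 × 1 across channel-domain convolution, producing one intermediate channel per (output channel, spatial offset) pair, followed by a fixed sparse across spatial-domain aggregation.

import torch
import torch.nn.functional as F

C_in, C_out, k, H, W = 4, 8, 3, 10, 10
x = torch.randn(1, C_in, H, W)
w = torch.randn(C_out, C_in, k, k)            # standard filter bank

# Step 1: 1 x 1 across channel-domain convolution; intermediate channel
# (o, i, j) holds the response of output filter o at spatial offset (i, j).
w1 = w.permute(0, 2, 3, 1).reshape(C_out * k * k, C_in, 1, 1)
mid = F.conv2d(x, w1)                          # shape (1, C_out*k*k, H, W)

# Step 2: across spatial-domain aggregation; each intermediate channel gets a
# one-hot k x k kernel (the fixed spatial configuration), and every group of
# k*k intermediate channels is summed into one output channel.
w2 = torch.zeros(C_out, k * k, k, k)
for o in range(C_out):
    for i in range(k):
        for j in range(k):
            w2[o, i * k + j, i, j] = 1.0
out_decoupled = F.conv2d(mid, w2, padding=k // 2, groups=C_out)

out_standard = F.conv2d(x, w, padding=k // 2)
print(torch.allclose(out_standard, out_decoupled, atol=1e-5))  # expected: True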
From this decoupled view, we find that the decoupled structure of the standard spatial convolution is unbalanced: the 1 × 1 across channel-domain convolution lies in a high-dimensional space, which might lead to redundancy, whereas the across spatial-domain convolution is a structured sparse group convolution. To address this problem, we propose the balanced decoupling spatial convolution (BDSC), which relaxes the sparsity of the across spatial-domain convolution by learning a spatial configuration, and reduces the redundancy of the across channel-domain convolution by reducing the number of intermediate output channels. Our experiments show that models using our decoupling convolution perform only slightly worse than those using the standard spatial convolution, while having a smaller model size.
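One possible instantiation of this idea, assuming PyTorch, is sketched below; the module name, grouping pattern, and channel counts are our own assumptions and are only meant to illustrate the two modifications (a smaller intermediate space and learned, dense spatial kernels), not to reproduce the exact design of BDSC.

import torch.nn as nn

class DecoupledBlock(nn.Module):
    """Hypothetical sketch: a 1 x 1 across channel-domain convolution into a
    reduced intermediate space, followed by an across spatial-domain grouped
    convolution whose k x k kernels (the spatial configuration) are learned
    rather than fixed to one-hot patterns."""

    def __init__(self, in_channels, out_channels, k=3, mid_per_out=2):
        super().__init__()
        mid_channels = out_channels * mid_per_out  # far fewer than out_channels * k * k
        self.channel = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.spatial = nn.Conv2d(mid_channels, out_channels, kernel_size=k,
                                 padding=k // 2, groups=out_channels, bias=False)

    def forward(self, x):
        return self.spatial(self.channel(x))

Under these illustrative channel counts, DecoupledBlock(64, 64) uses 64 x 128 = 8,192 weights in the 1 x 1 step and 64 x 2 x 3 x 3 = 1,152 in the spatial step, roughly 9.3K in total, versus 64 x 64 x 3 x 3 ≈ 36.9K for a standard 3 x 3 convolution.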
Our contributions in this paper are:
1. We decouple the standard spatial convolution of CNN into
two parts, an across channel-domain convolution and an
across spatial-domain convolution.
2. We propose the balanced decoupling spatial convolution to relax the sparsity of the filter in spatial aggregation by learning a spatial configuration, and to reduce the redundancy of the 1 × 1 across channel-domain convolution by reducing the number of intermediate output channels.