should be coupled to capsule j.
$$c_{ij} = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})} \tag{3}$$
The log priors can be learned discriminatively at the same time as all the other weights. They depend on the location and type of the two capsules but not on the current input image². The initial coupling coefficients are then iteratively refined by measuring the agreement between the current output $\mathbf{v}_j$ of each capsule, $j$, in the layer above and the prediction $\hat{\mathbf{u}}_{j|i}$ made by capsule $i$.
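For instance, with the log priors set equal (see footnote 2), every logit starts at the same value $b_{ij} = b$, so Eq. 3 initially couples capsule $i$ uniformly to all $K$ capsules in the layer above:

$$c_{ij} = \frac{e^{b}}{\sum_{k=1}^{K} e^{b}} = \frac{1}{K}.$$

The routing iterations then move coupling probability toward the higher-level capsules whose outputs agree with capsule $i$'s predictions.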
The agreement is simply the scalar product $a_{ij} = \mathbf{v}_j \cdot \hat{\mathbf{u}}_{j|i}$. This agreement is treated as if it were a log likelihood and is added to the initial logit, $b_{ij}$, before computing the new values for all the coupling coefficients linking capsule $i$ to higher-level capsules.
In convolutional capsule layers each unit in a capsule is a convolutional unit. Therefore, each capsule
will output a grid of vectors rather than a single vector output.
Procedure 1 Routing algorithm.
1: procedure ROUTING($\hat{\mathbf{u}}_{j|i}$, $r$, $l$)
2:   for all capsule $i$ in layer $l$ and capsule $j$ in layer $(l+1)$: $b_{ij} \leftarrow 0$.
3:   for $r$ iterations do
4:     for all capsule $i$ in layer $l$: $\mathbf{c}_i \leftarrow \mathrm{softmax}(\mathbf{b}_i)$ ▷ softmax computes Eq. 3
5:     for all capsule $j$ in layer $(l+1)$: $\mathbf{s}_j \leftarrow \sum_i c_{ij}\hat{\mathbf{u}}_{j|i}$
6:     for all capsule $j$ in layer $(l+1)$: $\mathbf{v}_j \leftarrow \mathrm{squash}(\mathbf{s}_j)$ ▷ squash computes Eq. 1
7:     for all capsule $i$ in layer $l$ and capsule $j$ in layer $(l+1)$: $b_{ij} \leftarrow b_{ij} + \hat{\mathbf{u}}_{j|i} \cdot \mathbf{v}_j$
   return $\mathbf{v}_j$
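To make the procedure concrete, here is a minimal NumPy sketch of Procedure 1. The array layout, the helper names (squash, routing, u_hat), and the toy sizes are illustrative assumptions, not the authors' reference implementation:

```python
import numpy as np

def squash(s, eps=1e-9):
    # Eq. 1: scale s_j so short vectors shrink toward 0 and long ones toward length 1.
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing(u_hat, r):
    # u_hat[i, j]: prediction of capsule i (layer l) for capsule j (layer l+1),
    # shape (num_in, num_out, dim).
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                   # line 2: b_ij <- 0
    for _ in range(r):                                # line 3
        e = np.exp(b - b.max(axis=1, keepdims=True))  # line 4: c_i <- softmax(b_i)
        c = e / e.sum(axis=1, keepdims=True)          #   (Eq. 3, numerically stabilized)
        s = np.einsum('ij,ijd->jd', c, u_hat)         # line 5: s_j <- sum_i c_ij u_hat_{j|i}
        v = squash(s)                                 # line 6: v_j <- squash(s_j)
        b = b + np.einsum('ijd,jd->ij', u_hat, v)     # line 7: b_ij <- b_ij + u_hat_{j|i} . v_j
    return v

# Toy example: 1152 primary capsules routed to 10 digit capsules of dimension 16.
v = routing(0.01 * np.random.randn(1152, 10, 16), r=3)
print(v.shape)  # (10, 16)
```

Note that the softmax in line 4 runs over the higher-level capsules $j$ for a fixed $i$ (the sum over $k$ in Eq. 3), so each lower-level capsule distributes a unit budget of coupling among its possible parents.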
3 Margin loss for digit existence
We are using the length of the instantiation vector to represent the probability that a capsule's entity exists, so we would like the top-level capsule for digit class $k$ to have a long instantiation vector if and only if that digit is present in the image. To allow for multiple digits, we use a separate margin loss, $L_k$, for each digit capsule, $k$:
$$L_k = T_k \max(0, m^+ - \|\mathbf{v}_k\|)^2 + \lambda\,(1 - T_k) \max(0, \|\mathbf{v}_k\| - m^-)^2 \tag{4}$$

where $T_k = 1$ iff a digit of class $k$ is present³ and $m^+ = 0.9$ and $m^- = 0.1$. The $\lambda$ down-weighting of the loss for absent digit classes stops the initial learning from shrinking the lengths of the activity vectors of all the digit capsules. We suggest $\lambda = 0.5$. The total loss is simply the sum of the losses of all digit capsules.
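A minimal NumPy sketch of Eq. 4 follows, assuming one-hot targets and a (batch, classes, dim) layout for the digit-capsule outputs; the averaging over the batch is an added assumption, since the text only specifies the sum over digit capsules:

```python
import numpy as np

def margin_loss(v, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    # v: digit-capsule outputs, shape (batch, num_classes, dim).
    # targets: one-hot T_k, shape (batch, num_classes).
    lengths = np.linalg.norm(v, axis=-1)                        # ||v_k||
    present = targets * np.maximum(0.0, m_pos - lengths) ** 2   # T_k term
    absent = lam * (1.0 - targets) * np.maximum(0.0, lengths - m_neg) ** 2
    return (present + absent).sum(axis=1).mean()                # sum over k, mean over batch

# Toy example: batch of 2 images with labels 3 and 7, 16-D digit capsules.
v = 0.1 * np.random.randn(2, 10, 16)
T = np.eye(10)[[3, 7]]
print(margin_loss(v, T))
```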
4 CapsNet architecture
A simple CapsNet architecture is shown in Fig. 1. The architecture is shallow with only two convolutional layers and one fully connected layer. Conv1 has 256, $9 \times 9$ convolution kernels with a stride of 1 and ReLU activation. This layer converts pixel intensities to the activities of local feature detectors that are then used as inputs to the primary capsules.
The primary capsules are the lowest level of multi-dimensional entities and, from an inverse graphics
perspective, activating the primary capsules corresponds to inverting the rendering process. This is a
very different type of computation than piecing instantiated parts together to make familiar wholes,
which is what capsules are designed to be good at.
The second layer (PrimaryCapsules) is a convolutional capsule layer with 32 channels of convolutional 8D capsules (i.e. each primary capsule contains 8 convolutional units with a $9 \times 9$ kernel and a stride of 2). Each primary capsule output sees the outputs of all $256 \times 81$ Conv1 units whose receptive fields overlap with the location of the center of the capsule. In total PrimaryCapsules has $[32, 6, 6]$ capsule
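As a quick arithmetic check on these dimensions, assuming the standard $28 \times 28$ MNIST input (the conv_out helper below is just the usual valid-convolution size formula):

```python
def conv_out(size, kernel, stride):
    # Side length after a 'valid' convolution (no padding).
    return (size - kernel) // stride + 1

h1 = conv_out(28, kernel=9, stride=1)  # Conv1: 20x20 grid of 256 channels
h2 = conv_out(h1, kernel=9, stride=2)  # PrimaryCapsules: 6x6 grid
print(h1, h2)          # 20 6
print(256 * 9 * 9)     # 20736 = 256 x 81 Conv1 units in each 9x9 window
print(32 * h2 * h2)    # 1152 primary capsules in the [32, 6, 6] grid
```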
² For MNIST we found that it was sufficient to set all of these priors to be equal.
³ We do not allow an image to contain two instances of the same digit class. We address this weakness of capsules in the discussion section.