Dropout in Deep Learning: An Analysis as Ensemble Learning


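"This paper analyzes dropout learning as a form of ensemble learning and examines its role in preventing overfitting in deep learning."

In deep learning applications such as visual object recognition and speech recognition, models typically contain a large number of layers, units, and connections. Because of this complexity, overfitting is a serious problem: the model performs well on the training data but worse on unseen data. Dropout learning was proposed as a regularization technique to address this.

The basic idea of dropout learning is to randomly ignore some of the input and hidden units during training, each with probability p. Dropping units reduces the dependencies among neurons and keeps features in the network from co-adapting to one another. Once training ends, all units take part in prediction, but each unit's weights are scaled to their expected value under dropout, so that the contribution of every neuron is accounted for consistently at test time.

The authors find that this process of combining the neglected hidden units with the learned network can be regarded as ensemble learning. Ensemble learning is a machine learning approach that combines multiple weak learners (such as decision trees or neural networks) into one strong learner. Each weak learner is trained on a different subset of the data or is a different variant of the model, and the overall prediction is obtained by voting or averaging.

Seen from the ensemble learning perspective, each training iteration of dropout learning builds a different subnetwork, and at test time these subnetworks jointly determine the output through a weighted average. Each subnetwork is smaller than the full network, which reduces the effective model complexity and lowers the risk of overfitting. Random dropping can therefore be viewed as a soft form of model averaging: compared with hard model averaging such as Bagging, it lets the subnetworks share parameters during training.

Dropout learning has further advantages. It reduces the model's over-reliance on particular input features, which helps the model generalize. It also improves robustness, because it forces the network to learn more general features rather than depend on noise or outliers in the training data.

In short, dropout learning is a powerful regularization tool in deep learning. By mimicking the principle of ensemble learning, it helps prevent overfitting and improves generalization, and it has become a standard practice for large neural networks in particular. A deeper understanding of the relationship between dropout learning and ensemble learning can inform the design of deep learning models and improve their performance across tasks.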

arXiv:1706.06859v1 [cs.LG] 20 Jun 2017

Analysis of dropout learning regarded as ensemble learning

Kazuyuki Hara

Daisuke Saitoh

Hayaru Shouno

College of Industrial Technology, Nihon University,

1-2-1 Izumi-cho, Narashino-shi, Chiba, 275-8575 Japan.

Graduate School of Industrial Technology, Nihon University

Graduate School of Informatics and Engineering,

The University of Electro-Communications

1-5-1 Chofugaoka, Chofu-shi, Tokyo, 182-8585 Japan.

Abstract

Deep learning is the state-of-the-art in fields such as visual object recognition and speech recognition. This learning uses a large number of layers, huge number of units, and connections. Therefore, overfitting is a serious problem. To avoid this problem, dropout learning is proposed. Dropout learning neglects some inputs and hidden units in the learning process with a probability, p, and then, the neglected inputs and hidden units are combined with the learned network to express the final output. We find that the process of combining the neglected hidden units with the learned network can be regarded as ensemble learning, so we analyze dropout learning from this point of view.

keywords: Dropout learning, overfitting, regularization, ensemble learning, soft-committee machine, teacher-student formulation

1 Introduction

Deep learning [1, 2] is attracting much attention in the field of visual object recognition, speech recognition, object detection, and many other domains. It provides automatic feature extraction and has the ability to achieve outstanding performance [3, 4].

Deep learning uses a very deep layered network and a huge number of data, so overfitting is a serious problem. To avoid overfitting, regularization is used. Hinton et al. proposed a regularization method called “dropout learning” [5] for this purpose. Dropout learning follows two processes. At learning time, some hidden units are neglected with a probability p, and this process reduces the network size. At test time, learned hidden units and those not learned are
