can throw at the problem, the faster we can train a given network. However, some of us may only
have one GPU when working through this book. That raises the questions:
• Is using just one GPU a fruitless exercise?
• Is reading through this chapter a waste of time?
• Was purchasing the ImageNet Bundle a poor investment?
The answer to all of these questions is a resounding no – you are in good hands, and the knowledge you learn here will be applicable to your own deep learning projects. However, you do need
to manage your expectations and realize you are crossing a threshold, one that separates educational
deep learning problems from advanced, real-world applications.
You are now entering the world of state-of-the-art deep learning, where experiments can take days, weeks, or in rare cases even months to complete – this timeline is completely normal.
Regardless of whether you have one GPU or eight GPUs, you’ll be able to replicate the performance of
the networks detailed in this chapter, but again, keep in mind the caveat of time. The more GPUs
you have, the faster the training will be. If you have a single GPU, don’t be frustrated – simply be
patient and understand this is part of the process. The primary goal of the ImageNet Bundle is to
provide you with actual case studies and detailed information on how to train state-of-the-art deep
neural networks on the challenging ImageNet dataset (along with a few additional applications). Whether you have one GPU or eight, you’ll be able to learn from these case studies and use this knowledge in your own applications.
For readers using a single GPU, I highly recommend spending most of your time training
AlexNet and SqueezeNet on the ImageNet dataset. These networks are shallower and can be trained much faster on a single GPU system (on the order of 3-6 days for AlexNet and 7-10 days
for SqueezeNet, depending on your machine). Deeper Convolutional Neural Networks such as
GoogLeNet can also be trained on a single GPU but can take up to 7-14 days.
Smaller variations of ResNet can be trained on a single GPU as well, but for the deeper version covered in this book, I would recommend multiple GPUs.
The only network architecture I do not recommend attempting to train using one GPU is
VGGNet – not only can it be a pain to tune the network hyperparameters (as we’ll see later in this
book), but the network is extremely slow due to its depth and number of fully-connected nodes. If
you decide to train VGGNet from scratch, keep in mind that it can take up to 14 days to train the
network, even using four GPUs.
Again, as I mentioned earlier in this section, you are now crossing the threshold from deep
learning practitioner to deep learning expert. The datasets we are examining are large and challenging – and the networks we will train on these datasets are deep. As depth increases, so does the computation required to perform the forward and backward pass. Take a second now to set your expectations: these are not experiments you can leave running overnight and gather the results of the next morning – they will take longer to run.
This is a fact that every deep
learning researcher must accept.
But even if you are training your own state-of-the-art deep learning models on a single GPU,
don’t fret. The same techniques we use for multiple GPUs can also be applied to single GPUs. The
sole purpose of the ImageNet Bundle is to give you the knowledge and experience you need to successfully apply deep learning to your own projects.
3.2 Performance Gains Using Multiple GPUs
In an ideal world, if a single epoch for a given dataset and network architecture takes N seconds to complete on a single GPU, then we would expect the same epoch with two GPUs to complete in N/2 seconds. However, this expectation rarely holds in practice. Training performance is heavily dependent on the PCIe bus on your system, the specific architecture you are training, the number of layers in the network, and whether your network is bound by computation or by communication.
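To make the gap between the ideal N/2 case and reality concrete, below is a minimal sketch that models the time per epoch as a compute term that divides across GPUs plus a communication term for gradient synchronization. The estimated_epoch_time function, the 15% communication overhead, and the one-hour single-GPU epoch are all hypothetical values chosen purely for illustration, not measurements from any particular system or network.

# A minimal sketch (hypothetical numbers, not benchmarks) of why doubling
# the number of GPUs rarely halves the time per epoch.

def estimated_epoch_time(single_gpu_time, num_gpus, comm_overhead=0.15):
    # the portion of the epoch that parallelizes cleanly across GPUs
    compute_time = single_gpu_time / num_gpus

    # a simple model of the per-epoch cost of synchronizing gradients over
    # the PCIe bus; this term grows toward a fixed fraction of the
    # single-GPU time as more GPUs must communicate, so it caps the speedup
    sync_time = single_gpu_time * comm_overhead * (num_gpus - 1) / num_gpus

    return compute_time + sync_time

# assume (hypothetically) that one epoch takes an hour on a single GPU
single_gpu_time = 3600.0

for num_gpus in (1, 2, 4, 8):
    t = estimated_epoch_time(single_gpu_time, num_gpus)
    print("{} GPU(s): {:.0f}s per epoch, {:.2f}x speedup".format(
        num_gpus, t, single_gpu_time / t))

With these made-up numbers, two GPUs yield roughly a 1.7x speedup rather than 2x, and eight GPUs yield roughly 4x rather than 8x – the kind of sub-linear scaling you should expect in practice.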