MatConvNet User Manual
Convolutional Neural Networks for MATLAB
Version 1.0-beta
Andrea Vedaldi Karel Lenc
Contents
1 Introduction
2 Introduction to convolutional neural networks
  2.1 MatConvNet at a glance
  2.2 The structure and evaluation of CNNs
  2.3 CNN derivatives
  2.4 CNN modularity
3 Computational blocks
  3.1 Convolution
  3.2 Pooling
  3.3 ReLU
  3.4 Normalization
  3.5 Softmax
  3.6 Log-loss
  3.7 Softmax log-loss
4 Network wrappers and examples
  4.1 Pre-trained models
  4.2 Learning models
  4.3 Running large scale experiments
5 About MatConvNet
  5.1 Acknowledgments
1 Introduction
MatConvNet is a simple MATLAB toolbox implementing Convolutional Neural Networks
(CNN) for computer vision applications. Section 2 provides a brief introduction to CNNs,
explaining their modular structure and fundamental concepts such as back-propagation. Sec-
tion 3 lists all the computational building blocks implemented in MatConvNet that can be
combined to create CNNs. Section 4 discusses more abstract CNN wrappers and example
code and models.
2 Introduction to convolutional neural networks
A Convolutional Neural Network (CNN) can be viewed as a function f mapping some data
x, for example an image, to some output vector y. The function f is the composition of
a sequence (or directed acyclic graph) of simpler functional blocks $f_1, \dots, f_L$. Furthermore,
all or some of these blocks are convolutional, in the sense that they take an image as input
and produce an image as output by applying a translation-invariant and local operator,
such as a linear filter. MatConvNet contains implementations of the most commonly used
computational blocks (Section 3). These can be used stand-alone in your own code, or
through a few simple wrappers. New blocks are also easy to create and combine with the
existing ones.
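Concretely, evaluating such a chain amounts to feeding the output of each block into the next. The following is a minimal sketch of this idea using three of the blocks described in Section 3; the data and filter sizes here are arbitrary placeholders chosen only for illustration:

```matlab
% A toy three-block chain f = f3 . f2 . f1: convolution, ReLU, softmax.
x  = randn(32, 32, 3, 1, 'single');   % one 32x32 RGB input image (placeholder data)
w1 = randn(5, 5, 3, 10, 'single');    % a bank of 10 filters of size 5x5x3
b1 = zeros(1, 10, 'single');          % one bias per filter

y1 = vl_nnconv(x, w1, b1);            % f1: linear convolution
y2 = vl_nnrelu(y1);                   % f2: rectified linear unit
y  = vl_nnsoftmax(y2);                % f3: softmax across feature channels
```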
Blocks in the CNN usually contain parameters $w_1, \dots, w_L$. These are discriminatively
learned from example data such that the resulting function f does something useful. A typical
example is image classification; in this case the output of the CNN is a vector $y = f(x) \in \mathbb{R}^C$
containing the confidence that x belongs to each of the C possible classes. Given n training
pairs $(x^{(i)}, y^{(i)})$ (where $y^{(i)}$ is the indicator vector of the class of $x^{(i)}$), the
parameters are learned by solving

$$
\operatorname{argmin}_{w_1, \dots, w_L} \; \frac{1}{n} \sum_{i=1}^{n} \ell\left( f(x^{(i)}; w_1, \dots, w_L),\ y^{(i)} \right) \tag{1}
$$

where $\ell$ is a suitable loss function (e.g. the hinge or log loss).
The optimization problem (1) is usually non-convex and very large, as complex CNN archi-
tectures need to be trained from hundreds of thousands or even millions of examples. Therefore
efficiency is paramount. The objective is usually optimized using a variant of stochastic gra-
dient descent. The algorithm is, conceptually, very simple: at each iteration a training point
is selected at random, the derivative of the loss term for that training sample is computed,
resulting in a gradient vector, and the parameters are incrementally updated by stepping down
the gradient. The key operation here is computing the derivative of the objective function,
which is obtained by an application of the chain rule known as back-propagation. MatCon-
vNet supports the evaluation of derivatives in all of its computational blocks. It also
contains several examples of training small and large models using these features, although it
is easy to write customised solvers.
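As a conceptual illustration, one such iteration might look as follows. This is only a sketch, not MatConvNet's actual solver code: w, n, numIterations, and the data arrays x and y are assumed to exist, and loss_and_gradient is a hypothetical helper standing in for a forward and backward pass through the network:

```matlab
% A conceptual stochastic gradient descent loop (hypothetical helpers).
eta = 0.001;                                     % learning rate (illustrative value)
for t = 1:numIterations
  i = randi(n);                                  % pick a training point at random
  [z, dzdw] = loss_and_gradient(w, x{i}, y{i});  % hypothetical: loss value and its
                                                 % gradient w.r.t. w via back-propagation
  w = w - eta * dzdw;                            % step down the gradient
end
```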
While CNNs are relatively efficient to compute, training requires iterating many times
through vast data collections. Therefore a high evaluation speed is a practical requirement.
Larger models, in particular, may in practice require running calculations on a GPU. MatCon-
vNet has integrated GPU support based on NVIDIA CUDA.
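In practice, moving a computation to the GPU amounts to moving the data there, since the computational blocks accept MATLAB gpuArray inputs. A minimal sketch, reusing the arrays from the earlier chain example and assuming the toolbox was compiled with GPU support:

```matlab
% Run the convolution block on the GPU by passing gpuArray inputs.
xg = gpuArray(x);
wg = gpuArray(w1);
bg = gpuArray(b1);
yg = vl_nnconv(xg, wg, bg);   % same call as on the CPU, now executing on the GPU
y  = gather(yg);              % copy the result back to host memory
```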
2.1 MatConvNet at a glance
MatConvNet has a simple design philosophy. Rather than wrapping CNNs in complex
layers of software, it exposes simple functions to compute CNN building blocks, such as
the convolution and ReLU operators. These building blocks are easy to combine into complete
CNNs or learning algorithms. While several real-world examples of small and large CNN
architectures and training routines are provided, it is always possible to go back to the basics
and build your own, using the efficiency of MATLAB in prototyping. Often no C coding is
required at all to try a new architecture. As such, MatConvNet is an ideal playground for
research.
MatConvNet contains the following elements:
• CNN computational blocks. A set of optimised routines computing fundamental build-
ing blocks of a CNN. For example, a convolution block is implemented by y=vl_nnconv(x,f,b),
where x is an image, f a filter bank, and b a vector of biases (Section 3.1). The deriva-
tives are computed as [dzdx,dzdf,dzdb] = vl_nnconv(x,f,b,dzdy), where dzdy is
the derivative of the CNN output w.r.t. y (Section 2.3). Section 3 describes all the blocks
in detail; a short usage sketch follows this list.
• CNN wrappers. MatConvNet provides a simple wrapper, invoked as vl_simplenn,
that implements a CNN with a linear topology (a chain of blocks). This is good enough
to run most current state-of-the-art models for image classification. You are invited
to look at the implementation of this function, as it is a great starting point for under-
standing how to implement more complex CNNs.
• Example applications. MatConvNet provides several examples of learning CNNs with
stochastic gradient descent, on CPU or GPU, using the MNIST, CIFAR10, and ImageNet
datasets.
• Pre-trained models. MatConvNet provides several state-of-the-art pre-trained CNN
models that can be used off-the-shelf, either to classify images or to produce image
encodings in the spirit of Caffe or DeCAF.
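To make the block interface above concrete, the following sketch runs a forward and a backward pass through a single convolution block. The array sizes are arbitrary placeholders, and dzdy stands in for a derivative that would normally arrive from the blocks above:

```matlab
% Forward pass through a convolution block.
x = randn(16, 16, 3, 1, 'single');     % input image (placeholder)
f = randn(3, 3, 3, 8, 'single');       % bank of 8 filters, 3x3x3 each
b = zeros(1, 8, 'single');             % one bias per filter
y = vl_nnconv(x, f, b);

% Backward pass: given dz/dy, obtain the derivatives w.r.t. all inputs.
dzdy = randn(size(y), 'single');       % placeholder for the incoming derivative
[dzdx, dzdf, dzdb] = vl_nnconv(x, f, b, dzdy);
```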
2.2 The structure and evaluation of CNNs
CNNs are obtained by connecting one or more computational blocks. Each block y = f(x, w)
takes an image x and a set of parameters w as input and produces a new image y as output.
An image is a real 4D array; the first two dimensions index spatial coordinates (image rows
and columns respectively), the third dimension feature channels (there can be any number),
and the last dimension image instances. A computational block f is therefore represented as
follows:

[Diagram: a block f taking an image x and parameters w as inputs and producing an image y as output.]
Formally, x is a 4D tensor stacking N 3D images:

$$x \in \mathbb{R}^{H \times W \times D \times N}$$
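In MATLAB terms such a tensor is an ordinary four-dimensional array. For instance, a batch of N = 10 RGB images of height and width 224 (sizes chosen purely for illustration) would be:

```matlab
% A 4D image tensor: height x width x feature channels x instances.
x = randn(224, 224, 3, 10, 'single');
size(x)   % ans = [224 224 3 10]
```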