
arXiv:1807.08169v1 [cs.LG] 21 Jul 2018

Recent Advances in Deep Learning: An Overview

Matiur Rahman Minar minar09.bd@gmail.com

Jibon Naher jibon.naher09@gmail.com

Department of Computer Science and Engineering

Chittagong University of Engineering and Technology

Chittagong-4349, Bangladesh

Editor:

Abstract

Deep Learning is one of the newest trends in Machine Learning and Artificial Intelligence research. It is also one of the most popular scientific research trends nowadays. Deep learning methods have brought revolutionary advances in computer vision and machine learning. Every now and then, new deep learning techniques are born, outperforming state-of-the-art machine learning and even existing deep learning techniques. In recent years, the world has seen many major breakthroughs in this field. Since deep learning is evolving at a rapid pace, it is hard to keep track of the regular advances, especially for new researchers. In this paper, we briefly discuss recent advances in Deep Learning over the past few years.

Keywords: Neural Networks, Machine Learning, Deep Learning, Recent Advances,

Overview.

1. Introduction

The term "Deep Learning" (DL) was first introduced to Machine Learning (ML) in 1986, and later used for Artificial Neural Networks (ANN) in 2000 (Schmidhuber, 2015). Deep learning methods are composed of multiple layers that learn features of data with multiple levels of abstraction (LeCun et al., 2015). DL approaches allow computers to learn complicated concepts by building them out of simpler ones (Goodfellow et al., 2016). For Artificial Neural Networks (ANN), Deep Learning (DL), aka hierarchical learning (Deng and Yu, 2014), is about assigning credit across many computational stages accurately, to transform the aggregate activation of the network (Schmidhuber, 2014). To learn complicated functions, deep architectures are used with multiple levels of abstraction, i.e., non-linear operations, e.g., ANNs with many hidden layers (Bengio, 2009). To sum it up accurately, Deep Learning is a sub-field of Machine Learning, which uses many levels of non-linear information processing and abstraction, for supervised or unsupervised feature learning and representation, classification and pattern recognition (Deng and Yu, 2014).

Deep Learning, i.e., Representation Learning, is a class or sub-field of Machine Learning. Recent deep learning methods are mostly said to have been developed since 2006 (Deng, 2011). This paper is an overview of the most recent deep learning techniques, mainly recommended for upcoming researchers in this field. The article includes the basic idea of DL, major approaches and methods, recent breakthroughs and applications.


Overview papers are found to be very beneficial, especially for new researchers in a particular field. It is often hard to keep track of contemporary advances in a research area, provided that the field has great value in the near future and related applications. Nowadays, scientific research is an attractive profession, since knowledge and education are more shared and available than ever. For a technological research trend, it is only natural to assume that there will be numerous advances and improvements in various directions. An overview of a particular field from a couple of years back may turn out to be obsolete today.

Considering the popularity and expansion of Deep Learning in recent years, we present a brief overview of Deep Learning as well as Neural Networks (NN), and their major advances and critical breakthroughs from the past few years. We hope that this paper will help many novice researchers in this field get an overall picture of recent Deep Learning research and techniques, and guide them to the right way to start. We also hope to pay some tribute with this work to the top DL and ANN researchers of this era: Geoffrey Hinton (Hinton), Juergen Schmidhuber (Schmidhuber), Yann LeCun (LeCun), Yoshua Bengio (Bengio) and many others who worked meticulously to shape modern Artificial Intelligence (AI). It is also important to follow their work to stay updated with the state of the art in DL and ML research.

In this paper, we first provide short descriptions of past overview papers on deep learning models and approaches. Then we describe the recent advances of this field. We discuss Deep Learning (DL) approaches and deep architectures, i.e., Deep Neural Networks (DNN) and Deep Generative Models (DGM), followed by important regularization and optimization methods. Also, there are two brief sections on open-source DL frameworks and significant DL applications. Finally, we discuss the current status and the future of Deep Learning in the last two sections, i.e., Discussion and Conclusion.

2. Related works

There have been many overview papers on Deep Learning (DL) in past years. They described DL methods and approaches in great detail, as well as their applications and directions for future research. Here, we briefly review some outstanding overview papers on deep learning.

Young et al. (2017) talked about DL models and architectures, mainly those used in Natural Language Processing (NLP). They showed DL applications in various NLP fields, compared DL models, and discussed possible future trends.

Zhang et al. (2017) discussed state-of-the-art deep learning techniques for front-end and

back-end speech recognition systems.

Zhu et al. (2017) presented an overview of the state of the art of DL for remote sensing. They also discussed open-source DL frameworks and other technical details for deep learning.

Wang et al. (2017a) described the evolution of deep learning models in a time-series manner. They briefed the models graphically, along with the breakthroughs in DL research. This paper would be a good read to learn about the origin of Deep Learning in an evolutionary manner. They also mentioned optimization and future research of neural networks.

Goodfellow et al. (2016) discussed deep networks and generative models in detail. Starting from Machine Learning (ML) basics and the pros and cons of deep architectures, they covered recent DL research and applications thoroughly.


LeCun et al. (2015) published an overview of Deep Learning (DL) models with Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). They described DL from the perspective of Representation Learning, showing how DL techniques work and are used successfully in various applications, and predicted future learning based on Unsupervised Learning (UL). They also pointed out the articles on major advances in DL in the bibliography.

Schmidhuber (2015) did a generic and historical overview of Deep Learning along with CNN, RNN and Deep Reinforcement Learning (RL). He emphasized sequence-processing RNNs, while pointing out the limitations of fundamental DL and NNs, and the tricks to improve them.

Nielsen (2015) described the neural networks in detail, along with code and examples. He also discussed deep neural networks and deep learning to some extent.

Schmidhuber (2014) covered the history and evolution of neural networks based on time progression, categorized by machine learning approaches, and the uses of deep learning in neural networks.

Deng and Yu (2014) described deep learning classes and techniques, and applications of

DL in several areas.

Bengio (2013) did a quick overview of DL algorithms, i.e., supervised and unsupervised networks, optimization and training models, from the perspective of representation learning. He focused on many challenges of Deep Learning, e.g., scaling algorithms for larger models and data, reducing optimization difficulties, and designing efficient scaling methods, along with promising DL research directions.

Bengio et al. (2013) discussed Representation and Feature Learning, aka Deep Learning. They explored various methods and models from the perspectives of applications, techniques and challenges.

Deng (2011) gave an overview of deep structured learning and its architectures from the perspectives of information processing and related fields.

Arel et al. (2010) provided a short overview of recent DL techniques.

Bengio (2009) discussed deep architectures i.e. neural networks and generative models

for AI.

All recent overview papers on Deep Learning (DL) discussed important things from several perspectives, and it is necessary for a DL researcher to go through them. However, DL is a highly flourishing field right now. Many new techniques and architectures have been invented even after the most recently published overview paper on DL. Also, previous papers focus on different perspectives. Our paper is mainly for new learners and novice researchers who are new to this field. For that purpose, we will try to give the new researchers, and anyone interested in this field, a basic and clear idea of deep learning.

3. Recent Advances

In this section, we will discuss the main recent Deep Learning (DL) approaches derived from Machine Learning, and briefly sketch the evolution of Artificial Neural Networks (ANN), which are the most common form used for deep learning.


3.1 Evolution of Deep Architectures

Artificial Neural Networks (ANN) have come a long way, as have other deep models. The first generation of ANNs was composed of simple neural layers for the Perceptron. They were limited to simple computations. The second generation used Backpropagation to update the weights of neurons according to error rates. Then Support Vector Machines (SVM) surfaced, and surpassed ANNs for a while. To overcome the limitations of backpropagation, Restricted Boltzmann Machines were proposed, making the learning easier. Other techniques and neural networks came as well, e.g., Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) etc., along with Deep Belief Networks, Autoencoders and more (Hinton, The next generation of neural networks). From that point, ANNs have been improved and designed in various ways and for various purposes.
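The first-generation model mentioned above can be made concrete with a minimal sketch (not taken from any cited paper; data, learning rate and epoch count are illustrative choices): a single-neuron perceptron trained with the classic perceptron update rule on the linearly separable AND function.

```python
import numpy as np

# First-generation perceptron: one neuron with a step activation,
# trained with the perceptron update rule on the AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)  # AND is linearly separable

w = np.zeros(2)
b = 0.0
lr = 0.1

for _ in range(20):                      # a few passes over the data suffice
    for xi, target in zip(X, y):
        pred = 1.0 if xi @ w + b > 0 else 0.0
        err = target - pred
        w += lr * err * xi               # perceptron weight update
        b += lr * err

preds = [1.0 if xi @ w + b > 0 else 0.0 for xi in X]
print(preds)  # learns AND exactly: [0.0, 0.0, 0.0, 1.0]
```

The same rule fails on XOR, which is not linearly separable; that limitation is what motivated the multi-layer, backpropagation-trained networks of the second generation.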

Schmidhuber (2014), Bengio (2009), Deng and Yu (2014), Goodfellow et al. (2016), Wang et al. (2017a) etc. provided detailed overviews of the evolution and history of Deep Neural Networks (DNN) as well as Deep Learning (DL). In most cases, deep architectures are multilayer non-linear repetitions of simple architectures, which helps obtain highly complex functions of the inputs (LeCun et al., 2015).

4. Deep Learning Approaches

Deep Neural Networks (DNN) have gained huge success in Supervised Learning (SL). Deep Learning (DL) models are also immensely successful in Unsupervised, Hybrid and Reinforcement Learning (LeCun et al., 2015).

4.1 Deep Supervised Learning

Supervised learning is applied when data is labeled and the classifier is used for class or numeric prediction. LeCun et al. (2015) provided a brief yet very good explanation of the supervised learning approach and how deep architectures are formed. Deng and Yu (2014) mentioned and explained many deep networks for supervised and hybrid learning, e.g., the Deep Stacking Network (DSN) and its variants. Schmidhuber (2014) covered all neural networks, starting from early neural networks to the recently successful Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) and their improvements.
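As a minimal sketch of deep supervised learning (not from the cited works; the seed, layer width and learning rate are illustrative choices), the following trains a one-hidden-layer network by backpropagation on labeled XOR data, a function a single perceptron cannot represent:

```python
import numpy as np

# Supervised learning: labels y drive the error signal that
# backpropagation pushes through the layers.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)       # supervision: labels

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)       # hidden layer
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)       # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def predict(X):
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

loss_before = np.mean((predict(X) - y) ** 2)
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                          # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)               # backprop: output error
    d_h = (d_out @ W2.T) * h * (1 - h)                # error at hidden layer
    W2 -= h.T @ d_out; b2 -= d_out.sum(0)             # gradient step, lr = 1
    W1 -= X.T @ d_h;   b1 -= d_h.sum(0)
loss_after = np.mean((predict(X) - y) ** 2)
print(loss_before, "->", loss_after)                  # error shrinks
```

Real DL systems replace this hand-written loop with a framework's automatic differentiation, but the supervised principle is the same: labeled targets define the loss that training minimizes.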

4.2 Deep Unsupervised Learning

When input data is not labeled, the unsupervised learning approach is applied to extract features from the data and classify or label them. LeCun et al. (2015) predicted the future of deep learning to lie in unsupervised learning. Schmidhuber (2014) described neural networks for unsupervised learning as well. Deng and Yu (2014) briefed deep architectures for unsupervised learning and explained deep Autoencoders in detail.

4.3 Deep Reinforcement Learning

Reinforcement learning uses a reward-and-punishment system for the next move generated by the learning model. It is mostly used for games and robots, and usually solves decision-making problems (Li, 2017). Schmidhuber (2014) described advances of deep learning in Reinforcement Learning (RL) and the uses of Deep Feedforward Neural Networks (FNN) and Recurrent Neural Networks (RNN) for RL. Li (2017) discussed Deep Reinforcement Learning (DRL), its architectures, e.g., the Deep Q-Network (DQN), and its applications in various fields.
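The reward-and-punishment idea can be illustrated with tabular Q-learning on a toy five-state chain where only reaching the rightmost state pays a reward (a sketch of the underlying RL principle, not of any cited system; deep RL methods such as DQN replace the table below with a DNN):

```python
import numpy as np

# Tabular Q-learning on a 5-state chain: actions 0 = left, 1 = right,
# reward only at the rightmost (terminal) state.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, episodes = 0.5, 0.9, 200
rng = np.random.default_rng(0)

for _ in range(episodes):
    s = 0
    while s != n_states - 1:                        # episode ends at the goal
        a = rng.integers(n_actions)                 # explore randomly
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0  # reward only at the goal
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

policy = Q.argmax(axis=1)
print(policy)   # the learned policy moves right in every non-terminal state
```

The Q-update above is the same temporal-difference target that DQN optimizes with gradient descent on network weights instead of direct table writes.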

Mnih et al. (2016) proposed a DRL framework using asynchronous gradient descent for DNN optimization.

van Hasselt et al. (2015) proposed a DRL architecture using a deep neural network (DNN).

5. Deep Neural Networks

In this section, we will briefly discuss deep neural networks (DNN), and their recent improvements and breakthroughs. Neural networks work with functionality similar to the human brain, and are mainly composed of neurons and connections. When we say deep neural network, we can assume there are quite a number of hidden layers, which can be used to extract features from the inputs and to compute complex functions. Bengio (2009) explained neural networks for deep architectures, e.g., Convolutional Neural Networks (CNN), Auto-Encoders (AE) etc., and their variants. Deng and Yu (2014) detailed some neural network architectures, e.g., AE and its variants. Goodfellow et al. (2016) wrote about and skillfully explained Deep Feedforward Networks, Convolutional Networks, Recurrent and Recursive Networks and their improvements. Schmidhuber (2014) mentioned the full history of neural networks, from early neural networks to recent successful techniques.
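The idea of depth described above, stacking hidden layers so the network computes a composition of simple non-linear functions, can be sketched as a plain forward pass (the layer sizes and initialization scale are arbitrary illustrative choices):

```python
import numpy as np

# A deep forward pass: each hidden layer is a linear map followed by a
# non-linearity (ReLU), so the whole network is a composition of simple
# functions that can represent highly complex ones.
rng = np.random.default_rng(0)
layer_sizes = [16, 32, 32, 32, 4]   # input, three hidden layers, output
weights = [rng.normal(0, 0.1, (m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    for W in weights[:-1]:
        x = np.maximum(0.0, x @ W)  # ReLU hidden layers extract features
    return x @ weights[-1]          # linear output layer

out = forward(rng.normal(size=(8, 16)))   # batch of 8 inputs
print(out.shape)                          # -> (8, 4)
```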

5.1 Deep Autoencoders

Autoencoders (AE) are neural networks (NN) whose outputs are their inputs. An AE takes the original input, encodes it into a compressed representation, and then decodes it to reconstruct the input (Wang). In a deep AE, lower hidden layers are used for encoding and higher ones for decoding, and error back-propagation is used for training (Deng and Yu, 2014). Goodfellow et al. (2016) discussed autoencoders in detail.
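The encode/decode/reconstruct loop can be sketched with a tiny linear autoencoder trained by gradient descent on the reconstruction error (a minimal illustration, not from the cited works; the data, bottleneck size and learning rate are arbitrary, and real AEs add non-linearities and more layers):

```python
import numpy as np

# Minimal autoencoder: encode to a 3-dimensional bottleneck, decode back
# to 10 dimensions, and descend the mean squared reconstruction error.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))            # 64 samples, 10 features
W_enc = rng.normal(0, 0.1, (10, 3))      # encoder: 10 -> 3 (the bottleneck)
W_dec = rng.normal(0, 0.1, (3, 10))      # decoder: 3 -> 10

lr = 0.01
loss_before = np.mean((X @ W_enc @ W_dec - X) ** 2)
for _ in range(500):
    code = X @ W_enc                     # encode
    recon = code @ W_dec                 # decode
    err = recon - X                      # reconstruction error
    g_dec = code.T @ err / len(X)        # gradients of the squared error
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
loss_after = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(loss_before, "->", loss_after)     # reconstruction error drops
```

Because there is no label anywhere in the loop, this is the unsupervised feature learning the section above refers to: the bottleneck code is the learned representation.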

5.1.1 Variational Autoencoders

Variational Auto-Encoders (VAE) can be counted as decoders (Wang). VAEs are built upon standard neural networks and can be trained with stochastic gradient descent (Doersch, 2016).
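What makes VAEs trainable with stochastic gradient descent is the reparameterization trick (standard in the VAE literature, though not detailed in the text above): instead of sampling the latent code z directly from N(mu, sigma^2), sample noise eps from N(0, 1) and compute z = mu + sigma * eps, so gradients can flow through mu and sigma. The values below are illustrative:

```python
import numpy as np

# Reparameterization: the randomness is drawn outside the network, so
# z is a deterministic, differentiable function of mu and sigma.
rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])          # encoder's predicted mean
log_var = np.array([0.0, 0.2])      # encoder's predicted log-variance
sigma = np.exp(0.5 * log_var)

eps = rng.normal(size=(10000, 2))   # noise eps ~ N(0, 1)
z = mu + sigma * eps                # samples from N(mu, sigma^2)

print(z.mean(axis=0))               # close to mu
print(z.std(axis=0))                # close to sigma
```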

5.1.2 Stacked Denoising Autoencoders

In early Auto-Encoders (AE), the encoding layer had smaller dimensions than the input layer. In Stacked Denoising Auto-Encoders (SDAE), the encoding layer is wider than the input layer (Deng and Yu, 2014).
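The "denoising" in SDAE refers to training on corrupted inputs with clean targets (a standard formulation, not detailed in the text above). Only the corruption step is sketched here; the corruption level and data are illustrative:

```python
import numpy as np

# Denoising-autoencoder corruption: zero out a random fraction of the
# input features; the training target remains the clean input.
rng = np.random.default_rng(0)
x_clean = rng.normal(size=(4, 10))            # a small batch of inputs

corruption_level = 0.3                        # fraction of features zeroed
mask = rng.random(x_clean.shape) >= corruption_level
x_noisy = x_clean * mask                      # corrupted network input

# The training pair is (x_noisy -> x_clean): the network must fill in
# the destroyed values, which forces it to learn robust features.
print(x_noisy.shape, (x_noisy == 0).mean())   # roughly 30% of entries zeroed
```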

5.1.3 Transforming Autoencoders

Deep Auto-Encoders (DAE) can be transformation-variant, i.e., the features extracted from multiple layers of non-linear processing can change depending on the learner. Transforming Auto-Encoders (TAE) work with both the input vector and the target output vector to apply
