arXiv:1807.08169v1 [cs.LG] 21 Jul 2018
Recent Advances in Deep Learning: An Overview
Matiur Rahman Minar minar09.bd@gmail.com
Jibon Naher jibon.naher09@gmail.com
Department of Computer Science and Engineering
Chittagong University of Engineering and Technology
Chittagong-4349, Bangladesh
Abstract
Deep Learning is one of the newest trends in Machine Learning and Artificial Intelligence research. It is also one of the most popular scientific research trends nowadays. Deep learning methods have brought revolutionary advances in computer vision and machine learning. Every now and then, new deep learning techniques emerge, outperforming state-of-the-art machine learning and even existing deep learning techniques. In recent years, the world has seen many major breakthroughs in this field. Since deep learning is evolving at a rapid pace, it is hard to keep track of the regular advances, especially for new researchers. In this paper, we briefly discuss recent advances in Deep Learning over the past few years.
Keywords: Neural Networks, Machine Learning, Deep Learning, Recent Advances, Overview.
1. Introduction
The term "Deep Learning" (DL) was first introduced to Machine Learning (ML) in 1986, and later used for Artificial Neural Networks (ANN) in 2000 (Schmidhuber, 2015). Deep learning methods are composed of multiple layers that learn features of data with multiple levels of abstraction (LeCun et al., 2015). DL approaches allow computers to learn complicated concepts by building them out of simpler ones (Goodfellow et al., 2016). For Artificial Neural Networks (ANN), Deep Learning (DL), also known as hierarchical learning (Deng and Yu, 2014), is about assigning credit accurately across many computational stages, to transform the aggregate activation of the network (Schmidhuber, 2014). To learn complicated functions, deep architectures are used with multiple levels of abstraction, i.e., non-linear operations, e.g., ANNs with many hidden layers (Bengio, 2009). To sum up, Deep Learning is a sub-field of Machine Learning that uses many levels of non-linear information processing and abstraction for supervised or unsupervised feature learning and representation, classification and pattern recognition (Deng and Yu, 2014).
Deep Learning, i.e., Representation Learning, is a class or sub-field of Machine Learning. Most recent deep learning methods are said to have been developed since 2006 (Deng, 2011). This paper is an overview of the most recent deep learning techniques, mainly recommended for upcoming researchers in this field. The article covers the basic idea of DL, major approaches and methods, and recent breakthroughs and applications.

Overview papers are found to be very beneficial, especially for new researchers in a particular field. It is often hard to keep track of contemporary advances in a research area, provided that the field has great value in the near future and related applications. Nowadays, scientific research is an attractive profession, since knowledge and education are more shared and available than ever. For a technological research trend, it is only normal to assume that there will be numerous advances and improvements in various ways. An overview of a particular field from a couple of years back may turn out to be obsolete today.
Considering the popularity and expansion of Deep Learning in recent years, we present a brief overview of Deep Learning as well as Neural Networks (NN), and their major advances and critical breakthroughs from the past few years. We hope that this paper will help many novice researchers in this field get an overall picture of recent Deep Learning research and techniques, and guide them to the right way to start. We also hope to pay some tribute with this work to the top DL and ANN researchers of this era: Geoffrey Hinton (Hinton), Juergen Schmidhuber (Schmidhuber), Yann LeCun (LeCun), Yoshua Bengio (Bengio) and many others who worked meticulously to shape modern Artificial Intelligence (AI). It is also important to follow their work to stay updated with the state-of-the-art in DL and ML research.
In this paper, we first provide short descriptions of past overview papers on deep learning models and approaches. Then we describe the recent advances in this field. We discuss Deep Learning (DL) approaches and deep architectures, i.e., Deep Neural Networks (DNN) and Deep Generative Models (DGM), followed by important regularization and optimization methods. There are also two brief sections on open-source DL frameworks and significant DL applications. Finally, we discuss the current status and the future of Deep Learning in the last two sections, i.e., Discussion and Conclusion.
2. Related Works
There have been many overview papers on Deep Learning (DL) in past years. They described DL methods and approaches in great detail, as well as their applications and directions for future research. Here, we summarize some outstanding overview papers on deep learning.
Young et al. (2017) talked about DL models and architectures, mainly those used in Natural Language Processing (NLP). They showed DL applications in various NLP fields, compared DL models, and discussed possible future trends.
Zhang et al. (2017) discussed state-of-the-art deep learning techniques for front-end and
back-end speech recognition systems.
Zhu et al. (2017) presented an overview of the state-of-the-art of DL for remote sensing. They also discussed open-source DL frameworks and other technical details for deep learning.
Wang et al. (2017a) described the evolution of deep learning models in a time-series manner. They briefed the models graphically, along with the breakthroughs in DL research. This paper would be a good read to learn about the origin of Deep Learning from an evolutionary point of view. They also mentioned optimization and future research of neural networks.
Goodfellow et al. (2016) discussed deep networks and generative models in detail. Starting from Machine Learning (ML) basics and the pros and cons of deep architectures, they concluded with a thorough account of recent DL research and applications.

LeCun et al. (2015) published an overview of Deep Learning (DL) models with Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). They described DL from the perspective of Representation Learning, showed how DL techniques work and are used successfully in various applications, and predicted future learning based on Unsupervised Learning (UL). They also pointed out the articles of major advances in DL in the bibliography.
Schmidhuber (2015) did a generic and historical overview of Deep Learning along with CNN, RNN and Deep Reinforcement Learning (RL). He emphasized sequence-processing RNNs, while pointing out the limitations of fundamental DL and NNs, and the tricks to improve them.
Nielsen (2015) described neural networks in detail, along with code and examples. He also discussed deep neural networks and deep learning to some extent.
Schmidhuber (2014) covered the history and evolution of neural networks based on time progression, categorized by machine learning approaches, and the uses of deep learning in neural networks.
Deng and Yu (2014) described deep learning classes and techniques, and applications of
DL in several areas.
Bengio (2013) did a quick overview of DL algorithms, i.e., supervised and unsupervised networks, and optimization and training models, from the perspective of representation learning. He focused on many challenges of Deep Learning, e.g., scaling algorithms for larger models and data, reducing optimization difficulties, and designing efficient scaling methods, along with promising directions of DL research.
Bengio et al. (2013) discussed Representation and Feature Learning, a.k.a. Deep Learning. They explored various methods and models from the perspectives of applications, techniques and challenges.
Deng (2011) gave an overview of deep structured learning and its architectures from the perspectives of information processing and related fields.
Arel et al. (2010) provided a short overview of recent DL techniques.
Bengio (2009) discussed deep architectures, i.e., neural networks and generative models for AI.
All recent overview papers on Deep Learning (DL) discussed important things from several perspectives, and it is worthwhile for a DL researcher to go through them. However, DL is a highly flourishing field right now, and many new techniques and architectures have been invented even after the most recently published overview paper on DL. Also, previous papers focus on different perspectives. Our paper is mainly for new learners and novice researchers who are new to this field. For that purpose, we will try to give a basic and clear idea of deep learning to new researchers and anyone interested in this field.
3. Recent Advances
In this section, we will discuss the main recent Deep Learning (DL) approaches derived from Machine Learning, and briefly cover the evolution of Artificial Neural Networks (ANN), which is the most common form used for deep learning.

3.1 Evolution of Deep Architectures
Artificial Neural Networks (ANN) have come a long way, as have other deep models. The first generation of ANNs was composed of simple neural layers for the Perceptron and was limited to simple computations. The second generation used Backpropagation to update the weights of neurons according to error rates. Then Support Vector Machines (SVM) surfaced and surpassed ANNs for a while. To overcome the limitations of backpropagation, the Restricted Boltzmann Machine was proposed, making learning easier. Other techniques and neural networks came along as well, e.g., Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), etc., along with Deep Belief Networks, Autoencoders and others (Hinton, The next generation of neural networks). From that point, ANNs have been improved and designed in various ways and for various purposes; the first-generation learning rule is sketched below.
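To make the generational distinction concrete, here is a minimal sketch (our own illustration, not code from any cited work) of the first-generation learning rule: a single perceptron trained with the classic error-driven update. All names and data are hypothetical.

```python
import numpy as np

# A single perceptron trained with the classic mistake-driven update rule,
# the learning scheme of first-generation ANNs.
def perceptron_train(X, y, epochs=20, lr=0.1):
    """X: (n_samples, n_features); y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:  # update only on mistakes
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Example: logical AND is linearly separable, so the perceptron converges.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = perceptron_train(X, y)
print(np.sign(X @ w + b))  # [-1. -1. -1.  1.]
```

Second-generation networks replaced this single-layer rule with backpropagation of errors through hidden layers; a training loop of that kind is sketched in Section 4.1.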
Schmidhuber (2014), Bengio (2009), Deng and Yu (2014), Goodfellow et al. (2016), Wang et al. (2017a) and others provided detailed overviews of the evolution and history of Deep Neural Networks (DNN) as well as Deep Learning (DL). In most cases, deep architectures are multilayer non-linear repetitions of simple architectures, which helps to obtain highly complex functions out of the inputs (LeCun et al., 2015).
4. Deep Learning Approaches
Deep Neural Networks (DNN) have gained huge success in Supervised Learning (SL). Deep Learning (DL) models are also immensely successful in Unsupervised, Hybrid and Reinforcement Learning (LeCun et al., 2015).
4.1 Deep Supervised Learning
Supervised learning is applied when data is labeled and the classifier is used for class or numeric prediction. LeCun et al. (2015) provided a brief yet very good explanation of the supervised learning approach and how deep architectures are formed. Deng and Yu (2014) mentioned and explained many deep networks for supervised and hybrid learning, e.g., the Deep Stacking Network (DSN) and its variants. Schmidhuber (2014) covered all neural networks, starting from early neural networks to the recently successful Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) and their improvements. A minimal supervised training loop is sketched below.
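As a concrete illustration, the following is a minimal supervised training loop in PyTorch (our own sketch, not code from any cited paper; the data is synthetic and all layer sizes are hypothetical): a small feedforward classifier learns from labeled examples by backpropagating a prediction loss.

```python
import torch
from torch import nn

# A small feedforward classifier trained on labeled data.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 3),                 # 3-class prediction head
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(256, 20)              # stand-in for labeled inputs
y = torch.randint(0, 3, (256,))       # stand-in for class labels

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)       # compare predictions to labels
    loss.backward()                   # backpropagate the error
    optimizer.step()                  # update the weights
```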
4.2 Deep Unsupervised Learning
When input data is not labeled, the unsupervised learning approach is applied to extract features from the data and classify or label them. LeCun et al. (2015) predicted the future of deep learning to lie in unsupervised learning. Schmidhuber (2014) described neural networks for unsupervised learning as well. Deng and Yu (2014) briefed deep architectures for unsupervised learning and explained deep Autoencoders in detail.
4.3 Deep Reinforcement Learning
Reinforcement learning uses a reward and punishment system for the next move generated by the learning model. It is mostly used for games and robots, and usually solves decision-making problems (Li, 2017). Schmidhuber (2014) described advances of deep learning in Reinforcement Learning (RL) and the uses of Deep Feedforward Neural Networks (FNN) and Recurrent Neural Networks (RNN) for RL. Li (2017) discussed Deep Reinforcement Learning (DRL), its architectures, e.g., the Deep Q-Network (DQN), and its applications in various fields.
Mnih et al. (2016) proposed a DRL framework using asynchronous gradient descent for DNN optimization.
van Hasselt et al. (2015) proposed a DRL architecture using a deep neural network (DNN). The tabular value update that DQN generalizes with a deep network is sketched below.
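To illustrate the reward-and-punishment idea, here is a minimal tabular Q-learning sketch in plain Python/NumPy (our own illustration; the state and action counts are hypothetical). DQN replaces the table below with a deep neural network, but the reward-driven update keeps the same shape.

```python
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

def choose_action(state):
    # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(Q[state].argmax())

def q_update(state, action, reward, next_state):
    # Reward (or punishment) from the environment drives the value update.
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
```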
5. Deep Neural Networks
In this section, we will briefly discuss deep neural networks (DNN), and recent improvements and breakthroughs in them. Neural networks work with functionality similar to the human brain, and are mainly composed of neurons and connections. When we say deep neural network, we can assume there are quite a number of hidden layers, which can be used to extract features from the inputs and to compute complex functions. Bengio (2009) explained neural networks for deep architectures, e.g., Convolutional Neural Networks (CNN), Auto-Encoders (AE), etc., and their variants. Deng and Yu (2014) detailed some neural network architectures, e.g., AE and its variants. Goodfellow et al. (2016) wrote about and skillfully explained Deep Feedforward Networks, Convolutional Networks, Recurrent and Recursive Networks and their improvements. Schmidhuber (2014) mentioned the full history of neural networks, from early neural networks to recently successful techniques.
5.1 Deep Autoencoders
Autoencoders (AE) are neural networks (NN) whose outputs are their inputs. An AE takes the original input, encodes it into a compressed representation and then decodes it to reconstruct the input (Wang). In a deep AE, lower hidden layers are used for encoding and higher ones for decoding, and error back-propagation is used for training (Deng and Yu, 2014; Goodfellow et al., 2016). A minimal sketch follows.
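The following is a minimal deep autoencoder sketch in PyTorch (our own illustration; the layer sizes are hypothetical): lower layers encode the input into a compressed code, higher layers decode it back, and the reconstruction error is back-propagated through the whole stack.

```python
import torch
from torch import nn

class DeepAutoencoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, code_dim),   # compressed representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 128), nn.ReLU(),
            nn.Linear(128, in_dim),     # reconstruction of the input
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DeepAutoencoder()
x = torch.rand(64, 784)                     # stand-in for real input data
loss = nn.functional.mse_loss(model(x), x)  # output should match the input
loss.backward()
```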
5.1.1 Variational Autoencoders
Variational Auto-Encoders (VAE) can be counted as decoders (Wang). VAEs are built upon standard neural networks and can be trained with stochastic gradient descent (Doersch, 2016). The reparameterization step that makes this possible is sketched below.
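As a brief illustration of why VAEs can be trained with stochastic gradient descent, here is a sketch of the standard reparameterization step (our own code; `mu` and `log_var` are assumed to be outputs of an encoder network):

```python
import torch

# Sampling the noise outside the computation graph lets gradients flow
# through mu and std, which makes stochastic-gradient training possible.
def reparameterize(mu, log_var):
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)  # noise drawn from N(0, I)
    return mu + eps * std        # differentiable w.r.t. mu and log_var
```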
5.1.2 Stacked Denoising Autoencoders
In early Auto-Encoders (AE), the encoding layer had smaller dimensions than the input layer. In Stacked Denoising Auto-Encoders (SDAE), the encoding layer is wider than the input layer (Deng and Yu, 2014). The denoising idea is sketched below.
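A minimal sketch of the denoising idea (our own illustration; `model` stands for any autoencoder such as the one in Section 5.1, and the corruption probability is arbitrary): the network receives a corrupted input but is trained to reconstruct the clean one, so it cannot simply learn the identity map.

```python
import torch

def corrupt(x, drop_prob=0.3):
    mask = (torch.rand_like(x) > drop_prob).float()
    return x * mask  # randomly zero out a fraction of the input components

# Training step (sketch): reconstruction = model(corrupt(x))
#                         loss = nn.functional.mse_loss(reconstruction, x)
```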
5.1.3 Transforming Autoencoders
Deep Auto-Encoders (DAE) can be transformation-variant, i.e., the extracted features from multiple layers of non-linear processing could change due to the learner. Transforming Auto-Encoders (TAE) work with both the input vector and the target output vector to apply