A Survey of Deep Meta-Learning
Mike Huisman m.huisman.8@umail.leidenuniv.nl
Jan N. van Rijn j.n.van.rijn@liacs.leidenuniv.nl
Aske Plaat a.plaat@liacs.leidenuniv.nl
Leiden Institute of Advanced Computer Science
Leiden University
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
Abstract
Deep neural networks can achieve great successes when presented with large data sets and
sufficient computational resources. However, their ability to learn new concepts quickly is
quite limited. Meta-learning is one approach to address this issue, by enabling the network
to learn how to learn. The exciting field of Deep Meta-Learning advances at great speed,
but lacks a unified, insightful overview of current techniques. This work presents just that.
After providing the reader with a theoretical foundation, we investigate and summarize key
methods, which are categorized into i) metric-, ii) model-, and iii) optimization-based tech-
niques. In addition, we identify the main open challenges, such as performance evaluations
on heterogeneous benchmarks, and reduction of the computational costs of meta-learning.
Keywords: Meta-learning, Learning to learn, Few-shot learning, Transfer learning, Deep
learning
1. Introduction
In recent years, deep learning techniques have achieved remarkable successes on various
tasks, including game-playing (Mnih et al., 2013; Silver et al., 2016), image recognition
(Krizhevsky et al., 2012; He et al., 2015), and machine translation (Wu et al., 2016). Despite
these advances, ample challenges remain to be solved, such as the large amounts of data and
training that are needed to achieve good performance. These requirements severely constrain
the ability of deep neural networks to learn new concepts quickly, one of the defining aspects
of human intelligence (Jankowski et al., 2011; Lake et al., 2017).
Meta-learning has been suggested as one strategy to overcome this challenge (Naik and
Mammone, 1992; Schmidhuber, 1987; Thrun, 1998). The key idea is that meta-learning
agents improve their own learning ability over time, or equivalently, learn to learn. The
learning process is primarily concerned with tasks (sets of observations) and takes place at
two different levels: an inner- and an outer-level. At the inner-level, a new task is presented,
and the agent tries to quickly learn the associated concepts from the training observations.
This quick adaptation is facilitated by knowledge that it has accumulated across earlier
tasks at the outer-level. Thus, whereas the inner-level concerns a single task, the outer-level
concerns a multitude of tasks.
Historically, the term meta-learning has been used with various scopes. In its broadest
sense, it encapsulates all systems that leverage prior learning experience in order to learn new
tasks more quickly (Vanschoren, 2018). This broad notion includes more traditional algo-
rithm selection and hyperparameter optimization techniques for Machine Learning (Brazdil
et al., 2008). In this work, however, we focus on a subset of the meta-learning field which de-
velops meta-learning procedures to learn a good inductive bias for (deep) neural networks.¹
Henceforth, we use the term Deep Meta-Learning to refer to this subfield of meta-learning.
The field of Deep Meta-Learning is advancing at a quick pace, but lacks a coherent, unifying
overview that provides detailed insights into the key techniques. Vanschoren (2018) has
surveyed meta-learning techniques, where meta-learning was used in the broad sense,
limiting its account of Deep Meta-Learning techniques. Moreover, many exciting developments
in Deep Meta-Learning have happened since that survey was published. A more recent survey
by Hospedales et al. (2020) adopts the same notion of Deep Meta-Learning as we do, but
aims for a broad overview, omitting technical details of the various techniques.
We attempt to fill this gap by providing detailed explications of contemporary Deep
Meta-Learning techniques, using a unified notation. In addition, we identify current chal-
lenges and directions for future work. More specifically, we cover modern techniques in the
field for supervised and reinforcement learning that have achieved state-of-the-art
performance, obtained popularity in the field, and presented novel ideas. Extra attention is paid
to MAML (Finn et al., 2017) and related techniques, because of their impact on the field.
This work can serve as an educational introduction to the field of Deep Meta-Learning, and as
reference material for experienced researchers in the field. Throughout, we will adopt the
taxonomy used by Vinyals (2017), which identifies three categories of Deep Meta-Learning
approaches: i) metric-, ii) model-, and iii) optimization-based meta-learning techniques.
The remainder of this work is structured as follows. Section 2 builds a common founda-
tion on which we will base our overview of Deep Meta-Learning techniques. Sections 3, 4, and
5 cover the main metric-, model-, and optimization-based meta-learning techniques, respec-
tively. Section 6 provides a helicopter view of the field, and summarizes the key challenges
and open questions. Table 1 gives an overview of notation that we will use throughout this
paper.
2. Foundation
In this section, we build the necessary foundation for investigating Deep Meta-Learning
techniques in a consistent manner. To begin with, we contrast regular learning and meta-
learning. Afterwards, we briefly discuss how Deep Meta-Learning relates to different fields,
what the usual training and evaluation procedure looks like, and which benchmarks are often
used for this purpose. We finish this section by describing some applications and context of
the meta-learning field.
2.1 The Meta Abstraction
In this subsection, we contrast base-level (regular) learning and meta-learning for two dif-
ferent paradigms, i.e., supervised and reinforcement learning.
1. Here, inductive bias refers to the assumptions of a model which guide predictions on unseen data
(Mitchell, 1980).
| Expression | Meaning |
| --- | --- |
| Meta-learning | Learning meta-knowledge that can be used to learn new tasks more quickly |
| $\mathcal{T}_j = (D^{tr}_{\mathcal{T}_j}, D^{test}_{\mathcal{T}_j})$ | A task consisting of a labeled train and test set |
| Support set | The train set $D^{tr}_{\mathcal{T}_j}$ associated with a task $\mathcal{T}_j$ |
| Query set | The test set $D^{test}_{\mathcal{T}_j}$ associated with a task $\mathcal{T}_j$ |
| $x_i$ | Example input vector $i$ in the support set |
| $y_i$ | (One-hot encoded) label of example input $x_i$ from the support set |
| $x$ | Input in the query set |
| $y$ | A (one-hot encoded) label for input $x$ |
| $(f/g/h)_{\varphi}$ | Neural network function with parameters $\varphi$ |
| Inner-level | At the level of a single task |
| Outer-level | At the meta-level: across tasks |
| Fast weights | Parameters that were generated for a specific task/example |
| Base-learner | Learner that works at the inner-level |
| Meta-learner | Learner that operates at the outer-level |
| Input embedding | Activation pattern in the final layer of a neural network caused by the input |
| Task embedding | An internal representation of a task in a network/system |
| SL | Supervised Learning |
| RL | Reinforcement Learning |

Table 1: Some notation and meaning, which we use throughout this paper.
2.1.1 Regular Supervised Learning
In supervised learning, we wish to learn a function $f_{\theta} : X \rightarrow Y$ that learns to map inputs
$x_i \in X$ to their corresponding outputs $y_i \in Y$. Here, $\theta$ are model parameters (e.g., weights
in a neural network) that determine the function's behavior. To learn these parameters, we
are given a data set of $m$ observations: $D = \{(x_i, y_i)\}_{i=1}^{m}$. Thus, given a data set $D$, learning
boils down to finding the correct setting for $\theta$ that minimizes an empirical loss function $\mathcal{L}_D$,
which must capture how the model is performing, such that appropriate adjustments to its
parameters can be made. In short, we wish to find

$$\theta_{SL} := \arg\min_{\theta} \mathcal{L}_{D}(\theta), \tag{1}$$

where SL stands for "supervised learning". Note that this objective is specific to data set
$D$, meaning that our model $f_{\theta}$ may not generalize to examples outside of $D$. To measure
generalization, one could evaluate the performance on a separate test data set, which contains
unseen examples. A popular way to do this is through cross-validation, where one repeatedly
creates train and test splits $D^{tr}, D^{test} \subset D$ and uses these to train and evaluate a model,
respectively (Hastie et al., 2009).
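
To make this procedure concrete, the following minimal sketch (our own illustration, not code from the paper) performs $k$-fold cross-validation; the `train_model` and `evaluate_model` callables are hypothetical stand-ins for an arbitrary supervised learner and its evaluation metric.

```python
import numpy as np

# Minimal k-fold cross-validation sketch (NumPy only).
# train_model and evaluate_model are hypothetical placeholders.
def k_fold_cross_validation(X, y, train_model, evaluate_model, k=5, seed=0):
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))      # shuffle the example indices once
    folds = np.array_split(indices, k)     # k disjoint index sets
    scores = []
    for i in range(k):
        test_idx = folds[i]                # D_test: the held-out fold
        train_idx = np.concatenate(
            [f for j, f in enumerate(folds) if j != i])  # D_tr: the rest
        model = train_model(X[train_idx], y[train_idx])
        scores.append(evaluate_model(model, X[test_idx], y[test_idx]))
    return float(np.mean(scores))          # average held-out performance
```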
Finding globally optimal parameters $\theta_{SL}$ is often computationally infeasible. We can,
however, approximate them, guided by pre-defined meta-knowledge $\omega$ (Hospedales et al.,
2020), which includes, e.g., the initial model parameters $\theta$, the choice of optimizer, and the
learning rate schedule. As such, we approximate

$$\theta_{SL} \approx g_{\omega}(D, \mathcal{L}_{D}), \tag{2}$$

where $g_{\omega}$ is an optimization procedure that uses pre-defined meta-knowledge $\omega$, data set $D$,
and loss function $\mathcal{L}_{D}$, to produce updated weights $g_{\omega}(D, \mathcal{L}_{D})$ that (presumably) perform
well on $D$.
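
As a concrete illustration of Equation (2), the sketch below instantiates $g_{\omega}$ as plain gradient descent in PyTorch. The meta-knowledge $\omega$ is fixed by hand: the architecture, the parameter initialization, the choice of optimizer, and the learning rate. This is our own toy example, not code from the paper.

```python
import torch
from torch import nn

# A concrete instance of Equation (2): g_omega is gradient descent.
# The meta-knowledge omega is hand-picked here: the architecture,
# the initialization scheme, the optimizer (SGD), and the learning rate.
def g_omega(D, loss_fn, epochs=10, lr=0.01):
    X, y = D                                       # data set D = (inputs, targets)
    model = nn.Sequential(nn.Linear(X.shape[1], 64),
                          nn.ReLU(),
                          nn.Linear(64, 1))        # theta: the model parameters
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):                        # full-batch updates for brevity
        opt.zero_grad()
        loss = loss_fn(model(X), y)                # L_D(theta)
        loss.backward()
        opt.step()                                 # move theta towards arg min of L_D
    return model

X, y = torch.randn(100, 8), torch.randn(100, 1)    # toy regression data
trained = g_omega((X, y), nn.MSELoss())
```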
2.1.2 Supervised Meta-Learning
In contrast, supervised meta-learning does not assume that any meta-knowledge $\omega$ is given,
or pre-defined. Instead, the goal of meta-learning is to find the best $\omega$, such that our
(regular) base-learner can learn new tasks (data sets) as quickly as possible. Thus, whereas
regular supervised learning involves one data set, supervised meta-learning involves a group
of data sets. The goal is to learn meta-knowledge $\omega$ such that our model can learn many
different tasks well. Thus, our model is learning to learn.

More formally, we have a probability distribution of tasks $p(\mathcal{T})$, and wish to find optimal
meta-knowledge

$$\omega^{*} := \arg\min_{\omega} \; \underbrace{\mathbb{E}_{\mathcal{T}_j \sim p(\mathcal{T})}}_{\text{Outer-level}} \big[\, \underbrace{\mathcal{L}_{\mathcal{T}_j}\big(g_{\omega}(\mathcal{T}_j, \mathcal{L}_{\mathcal{T}_j})\big)}_{\text{Inner-level}} \,\big]. \tag{3}$$

Here, the inner-level concerns task-specific learning, while the outer-level concerns multiple
tasks. One can now easily see why this is meta-learning: we learn $\omega$, which allows for quick
learning of tasks $\mathcal{T}_j$ at the inner-level. Hence, we are learning to learn.
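
Schematically, optimizing Equation (3) leads to a nested training loop like the sketch below (our own illustration). Here $\omega$ is simply a tensor of meta-parameters created with `requires_grad=True`; `sample_task`, `inner_adapt` (which plays the role of $g_{\omega}$ and must be differentiable with respect to $\omega$), and `task_loss` are hypothetical placeholders.

```python
import torch

# Schematic outer/inner loop for Equation (3).
# omega: meta-parameters, a tensor created with requires_grad=True.
# inner_adapt: stands in for g_omega; returns task-adapted parameters.
# task_loss: evaluates the adapted parameters on the query set of a task.
def meta_train(sample_task, inner_adapt, task_loss, omega,
               meta_lr=1e-3, steps=1000):
    meta_opt = torch.optim.Adam([omega], lr=meta_lr)
    for _ in range(steps):                     # outer-level: across tasks
        support, query = sample_task()         # T_j = (D_tr, D_test)
        adapted = inner_adapt(omega, support)  # inner-level: task-specific learning
        meta_loss = task_loss(adapted, query)  # post-adaptation performance
        meta_opt.zero_grad()
        meta_loss.backward()                   # gradient of the outer objective w.r.t. omega
        meta_opt.step()
    return omega
```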
2.1.3 Regular Reinforcement Learning
In reinforcement learning, we have an agent that learns from experience. That is, it interacts
with an environment, modeled by a Markov Decision Process (MDP) $M = (S, A, P, r, p_0, \gamma, T)$.
Here, $S$ is the set of states, $A$ the set of actions, $P$ the transition probability distribution
defining $P(s_{t+1} \mid s_t, a_t)$, $r : S \times A \rightarrow \mathbb{R}$ the reward function, $p_0$ the probability distribution
over initial states, $\gamma \in [0, 1]$ the discount factor, and $T$ the time horizon (maximum number
of time steps) (Sutton and Barto, 2018; Duan et al., 2016).
At every time step $t$, the agent finds itself in state $s_t$, in which the agent performs an
action $a_t$, computed by a policy function $\pi_{\theta}$ (i.e., $a_t = \pi_{\theta}(s_t)$), which is parameterized by
weights $\theta$. In turn, it receives a reward $r_t = r(s_t, \pi_{\theta}(s_t)) \in \mathbb{R}$ and a new state $s_{t+1}$. This
process of interactions continues until a termination criterion is met (e.g., the fixed time
horizon $T$ is reached). The goal of the agent is to learn how to act in order to maximize its
expected reward. The reinforcement learning (RL) goal is thus to find

$$\theta_{RL} := \arg\max_{\theta} \; \mathbb{E}_{traj} \sum_{t=0}^{T} \gamma^{t}\, r(s_t, \pi_{\theta}(s_t)), \tag{4}$$

where we take the expectation over the possible trajectories $traj = (s_0, \pi_{\theta}(s_0), \ldots, s_T, \pi_{\theta}(s_T))$
due to the random nature of MDPs (Duan et al., 2016). Note that $\gamma$ is a hyperparameter
that can prioritize short- or long-term rewards by decreasing or increasing it, respectively.
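
For a single sampled trajectory, the discounted sum inside the expectation of Equation (4) can be computed as follows (a small helper of our own):

```python
# Discounted return of one trajectory: sum over t of gamma^t * r_t.
def discounted_return(rewards, gamma=0.99):
    total, discount = 0.0, 1.0
    for r in rewards:         # rewards r_t = r(s_t, pi_theta(s_t)) for t = 0..T
        total += discount * r
        discount *= gamma     # advance gamma^t by one factor per time step
    return total

# A smaller gamma down-weights rewards that arrive later:
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```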
Also in the case of reinforcement learning it is often infeasible to find the global optimum $\theta_{RL}$,
and thus we settle for approximations. In short, given a learning method $\omega$, we approximate

$$\theta_{RL} \approx g_{\omega}(\mathcal{T}_j, \mathcal{L}_{\mathcal{T}_j}), \tag{5}$$

where again $\mathcal{T}_j$ is the given MDP, and $g_{\omega}$ is the optimization algorithm, guided by pre-defined
meta-knowledge $\omega$.
Note that in a Markov Decision Process (MDP), the agent knows the state at any given
time step t. When this is not the case, it becomes a Partially Observable Markov Decision
Process (POMDP), where the agent receives only observations O, and uses these to update
its belief with regard to the state it is in (Sutton and Barto, 2018).
2.1.4 Meta Reinforcement Learning
The meta abstraction has as its object a group of tasks, or Markov Decision Processes
(MDPs) in the case of reinforcement learning. Thus, instead of maximizing the expected
reward on a single MDP, the meta reinforcement learning objective is to maximize the
expected reward over various MDPs, by learning meta-knowledge $\omega$. Here, the MDPs are
sampled from some distribution $p(\mathcal{T})$. So now, we wish to find a set of parameters

$$\omega^{*} := \arg\max_{\omega} \; \underbrace{\mathbb{E}_{\mathcal{T}_j \sim p(\mathcal{T})}}_{\text{Outer-level}} \; \underbrace{\mathbb{E}_{traj} \sum_{t=0}^{T} \gamma^{t}\, r\big(s_t, \pi_{g_{\omega}(\mathcal{T}_j, \mathcal{L}_{\mathcal{T}_j})}(s_t)\big)}_{\text{Inner-level}}. \tag{6}$$
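
A Monte Carlo view of this objective is sketched below, reusing the `discounted_return` helper from the earlier sketch; `sample_mdp`, `adapt_policy` (playing the role of $g_{\omega}$), and `rollout` are hypothetical placeholders rather than definitions from the paper.

```python
# Schematic estimate of the outer expectation in Equation (6).
def meta_rl_objective(omega, sample_mdp, adapt_policy, rollout,
                      n_tasks=10, gamma=0.99):
    returns = []
    for _ in range(n_tasks):               # outer-level: T_j ~ p(T)
        mdp = sample_mdp()
        policy = adapt_policy(omega, mdp)  # inner-level: policy from g_omega(T_j, L_Tj)
        rewards = rollout(policy, mdp)     # rewards along one sampled trajectory
        returns.append(discounted_return(rewards, gamma))
    return sum(returns) / len(returns)     # Monte Carlo estimate, maximized in omega
```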
2.1.5 Contrast with other Fields
Now that we have provided a formal basis for our discussion of both supervised and
reinforcement meta-learning, it is time to briefly contrast meta-learning with two related areas
of machine learning that also have the goal to improve the speed of learning. We will start
with transfer learning.
Transfer Learning In Transfer Learning, one tries to transfer knowledge of previous
tasks to new, unseen tasks (Pan and Yang, 2009; Taylor and Stone, 2009). As such, it
subsumes meta-learning, where we attempt to leverage meta-knowledge to learn new tasks
more quickly. A key property of meta-learning techniques is their meta-objective, which
explicitly aims to optimize performance across a distribution over tasks (as seen in previous
sections by taking the expected loss over a distribution of tasks). This objective need not
always be present in Transfer Learning techniques, e.g., when one pre-trains a model on a
large data set, and fine-tunes the learned weights on a smaller data set.
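
The sketch below illustrates this pre-train/fine-tune pattern in PyTorch (our own toy example; `large_data` and `small_data` are hypothetical iterables of mini-batches). Note that no term in either loop rewards fast adaptation to future tasks, which is exactly the meta-objective that Transfer Learning may lack.

```python
import torch
from torch import nn

# Transfer learning without a meta-objective: pre-train on a large
# data set, then fine-tune the same weights on a smaller one.
def pretrain_then_finetune(model, large_data, small_data):
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    for X, y in large_data:               # pre-training on the large data set
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # smaller step size
    for X, y in small_data:               # fine-tuning reuses the learned weights
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model
```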
Multi-task learning Another closely related field is that of multi-task learning.
In multi-task learning a model is jointly trained to perform well on multiple fixed tasks
(Hospedales et al., 2020). Meta-learning, in contrast, aims to find a model that can learn
new (previously unseen) tasks quickly. This difference is illustrated in Figure 1.