Differentiable Convex Optimization in Meta-Learning: Improving Few-Shot Recognition Performance
"Meta-Learning with Differentiable Convex Optimization" 是一篇2019年的研究论文,主要关注元学习(meta-learning)在小样本学习(few-shot learning)中的应用。传统上,许多元学习方法倾向于使用简单的基础学习器,如最近邻分类器。然而,该研究发现,在小样本环境下,有监督训练的线性预测器能够提供更好的泛化能力。作者提出了将这些线性预测器作为元学习的基础学习者,以提升小样本学习的表示学习性能。 研究的核心在于利用线性分类器的两大特性:一是凸优化问题的最优解隐式微分,二是优化问题的对偶公式。这些特性使得作者能够通过设计名为MetaOptNet的方法,有效地在保持高维嵌入的同时提高泛化能力,同时相对控制住计算成本。MetaOptNet在诸如miniImageNet、tieredImageNet、CIFAR-FS和FC100等小样本识别基准上展现出了顶尖的性能,这表明了在小样本场景下,线性预测器的使用不仅能提供良好的特征表示,而且在特征大小与性能之间找到了一个有效的平衡点。 这篇论文的贡献在于,它不仅提出了一种新的元学习策略,还展示了如何结合凸优化的理论优势,优化模型以适应小样本学习任务。这种方法对于那些追求高效和准确的小样本学习系统来说,具有重要的理论和实践价值。通过这种方式,研究者们可以更好地理解并改进元学习模型,使其在实际应用场景中更加有效。"
Meta-Learning with Differentiable Convex Optimization

Kwonjoon Lee²   Subhransu Maji¹,³   Avinash Ravichandran¹   Stefano Soatto¹,⁴
¹Amazon Web Services   ²UC San Diego   ³UMass Amherst   ⁴UCLA
kwl042@ucsd.edu   {smmaji,ravinash,soattos}@amazon.com
Abstract

Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks. Our objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories. To efficiently solve the objective, we exploit two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem. This allows us to use high-dimensional embeddings with improved generalization at a modest increase in computational overhead. Our approach, named MetaOptNet, achieves state-of-the-art performance on miniImageNet, tieredImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. Our code is available online at https://github.com/kjunelee/MetaOptNet.
1. Introduction
The ability to learn from a few examples is a hallmark of human intelligence, yet it remains a challenge for modern machine learning systems. This problem has received significant attention from the machine learning community recently, where few-shot learning is cast as a meta-learning problem (e.g., [22, 8, 33, 28]). The goal is to minimize generalization error across a distribution of tasks with few training examples. Typically, these approaches are composed of an embedding model that maps the input domain into a feature space and a base learner that maps the feature space to task variables. The meta-learning objective is to learn an embedding model such that the base learner generalizes well across tasks.
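In symbols (our notation; the paper formalizes this setup in its Section 3), with embedding parameters \(\phi\) and a base learner \(\mathcal{A}\) that fits task parameters \(\theta\) on a task's small training set, the objective is roughly

\[
\theta = \mathcal{A}\big(\mathcal{D}^{\mathrm{train}};\, \phi\big),
\qquad
\min_{\phi}\; \mathbb{E}_{\mathcal{T}}\Big[ \mathcal{L}\big(\mathcal{D}^{\mathrm{test}};\, \theta,\, \phi\big) \Big],
\]

where each sampled task \(\mathcal{T}\) provides a support set \(\mathcal{D}^{\mathrm{train}}\) for the inner fit and a held-out query set \(\mathcal{D}^{\mathrm{test}}\) on which the outer expectation is evaluated.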
While many choices for base learners exist, nearest-neighbor classifiers and their variants (e.g., [28, 33]) are popular as the classification rule is simple and the approach scales well in the low-data regime. However, discriminatively trained linear classifiers often outperform nearest-neighbor classifiers (e.g., [4, 16]) in the low-data regime, as they can exploit the negative examples, which are often more abundant, to learn better class boundaries. Moreover, they can effectively use high-dimensional feature embeddings, as model capacity can be controlled by appropriate regularization such as weight sparsity or norm.
Hence, in this paper, we investigate linear classifiers as the base learner for a meta-learning based approach to few-shot learning. The approach is illustrated in Figure 1, where a linear support vector machine (SVM) is used to learn a classifier given a set of labeled training examples, and the generalization error is computed on a novel set of examples from the same task. The key challenge is computational, since the meta-learning objective of minimizing the generalization error across tasks requires training a linear classifier in the inner loop of optimization (see Section 3). However, the objective of linear models is convex and can be solved efficiently. We observe two additional properties arising from this convexity that allow efficient meta-learning: implicit differentiation of the optimization [2, 11] and the low-rank nature of the classifier in the few-shot setting. The first property allows the use of off-the-shelf convex optimizers to estimate the optima and implicitly differentiate the optimality or Karush-Kuhn-Tucker (KKT) conditions to train the embedding model. The second property means that the number of optimization variables in the dual formulation is far smaller than the feature dimension in few-shot learning.
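To sketch the first property (our simplification, in the spirit of the implicit-differentiation recipe of [2, 11], not the paper's exact derivation): consider a convex QP whose parameters depend on the embedding parameters \(\phi\), with equality constraints only (inequality constraints add complementary-slackness terms),

\[
z^{\star}(\phi) \;=\; \arg\min_{z}\; \tfrac{1}{2}\, z^{\top} Q(\phi)\, z + p(\phi)^{\top} z
\quad \text{s.t.} \quad A z = b .
\]

Its KKT conditions form a linear system in the primal and dual variables,

\[
\begin{bmatrix} Q(\phi) & A^{\top} \\ A & 0 \end{bmatrix}
\begin{bmatrix} z^{\star} \\ \nu^{\star} \end{bmatrix}
\;=\;
\begin{bmatrix} -\,p(\phi) \\ b \end{bmatrix},
\]

and differentiating this system with respect to \(\phi\) yields \(\partial z^{\star}/\partial \phi\) via one more solve against the same KKT matrix, with no unrolling of solver iterations.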
To this end, we have incorporated a differentiable quadratic programming (QP) solver [1], which allows end-to-end learning of the embedding model with various linear classifiers, e.g., multi-class support vector machines (SVMs) [5] or linear regression, for few-shot classification tasks. Making use of these properties, we show that our method is practical and offers substantial gains over nearest-neighbor classifiers at a modest increase in computational cost (see Table 3). Our method achieves state-of-the-art performance on 5-way 1-shot and 5-shot classification for popu- [preview ends here; the remaining 8 pages are available in the full PDF]

arXiv:1904.03758v1 [cs.CV], 7 Apr 2019
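The differentiable QP solver the paper incorporates is the one from [1]; as a dependency-free illustration of the same idea, the equality-constrained KKT system sketched above can be solved and differentiated directly in PyTorch (inequality constraints require the active-set or interior-point handling that [1] provides; this is our toy example, not the paper's solver):

```python
import torch

def solve_eq_qp(Q, p, A, b):
    """Solve min_z 0.5 z^T Q z + p^T z  s.t.  A z = b via its KKT system.

    The KKT conditions of an equality-constrained QP are linear, so one
    torch.linalg.solve yields the optimum; because that solve is
    differentiable, gradients w.r.t. Q and p (and anything upstream of
    them, e.g. an embedding network) come for free.
    """
    n, m = Q.shape[0], A.shape[0]
    kkt = torch.cat([
        torch.cat([Q, A.t()], dim=1),
        torch.cat([A, torch.zeros(m, m)], dim=1),
    ], dim=0)
    rhs = torch.cat([-p, b])
    sol = torch.linalg.solve(kkt, rhs)
    return sol[:n]                      # primal solution z*; sol[n:] are duals

# Gradients flow from the QP solution back to its parameters.
Q = torch.eye(3, requires_grad=True)    # toy objective: 0.5 z^T Q z + p^T z
p = torch.randn(3, requires_grad=True)
A = torch.ones(1, 3)                    # constraint: entries of z sum to 1
b = torch.ones(1)
z_star = solve_eq_qp(Q, p, A, b)
z_star.square().sum().backward()        # some downstream loss of z*
print(z_star, p.grad)                   # p.grad obtained implicitly
```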