arXiv:1912.00320v1 [cs.LG] 1 Dec 2019
Transferability versus Discriminability:
Joint Probability Distribution Adaptation (JPDA)
Wen Zhang¹ and Dongrui Wu²
Abstract. Transfer learning (TL) makes use of data or knowledge in
one task to help solve a different, yet related, task. Many existing
TL approaches are based on a joint probability distribution metric,
which is a weighted sum of the marginal distribution and the condi-
tional distribution; however, they optimize the two distributions in-
dependently, and ignore their intrinsic dependency. This paper pro-
poses a novel and frustratingly easy Joint Probability Distribution
Adaptation (JPDA) approach, to replace the frequently-used joint
maximum mean discrepancy metric in transfer learning. During the
distribution adaptation, JPDA improves the transferability between
the source and the target domains by minimizing the joint prob-
ability discrepancy of the corresponding class, and also increases
the discriminability between different classes by maximizing their
joint probability discrepancy. Experiments on six image classifica-
tion datasets demonstrated that JPDA outperforms several state-of-
the-art metric-based transfer learning approaches.
1 INTRODUCTION
A basic assumption in statistical machine learning is that the train-
ing and the test data come from the same distribution. However, this
assumption does not hold in many real-world applications. For exam-
ple, in image recognition, the distributions in training and testing can
be different due to varying scene, lighting, view angle, image reso-
lution, etc. Annotating data for a new domain is often expensive and
time-consuming, thus there are application scenarios where we have
plenty of data, but none or a very small amount of them are labeled
[25]. Transfer learning (TL) has shown promising performance in
handling such a challenge, by transferring knowledge from a labeled
source domain to a new (unlabeled) target domain [24, 25]. In the
last decade, it has been widely used in image recognition [6, 10, 19],
emotion recognition [23], brain-computer interfaces [14, 29], and so
on [15, 16, 26].
Typical TL approaches can be categorized into parameter-based
transfers [25], instance-based transfers, and feature transformation
based transfers. Parameter-based transfers need some labeled data,
whereas this paper focuses on unsupervised domain adaptation, in
which the target domain does not have any labeled data at all.
Instance-based transfers assume that the source and the target do-
mains share the same conditional distribution [25, 30], which usually
does not hold in practice. Feature transformation based transfers re-
lax this assumption, and only assume that there exists a common
subspace, in which the source and the target domains have similar
¹ School of Artificial Intelligence and Automation, Huazhong University of
Science and Technology, Wuhan, China. Email: wenz@hust.edu.cn
² School of Artificial Intelligence and Automation, Huazhong University of
Science and Technology, Wuhan, China. Email: drwu@hust.edu.cn
distributions. This paper considers feature transformation based TL.
According to Pan and Yang [25], TL can be applied when the
source and the target domains have different feature spaces, label
spaces, marginal probability distributions, and/or conditional prob-
ability distributions. Existing feature transformation based TL ap-
proaches mainly focus on minimizing the distribution divergence
between the source and the target domains by a distribution met-
ric. Frequently used such metrics include maximum mean discrep-
ancy (MMD) [11], Kullback-Leibler divergence [27], Wasserstein dis-
tance [17], etc. MMD on marginal and/or conditional distribution is
probably the most popular metric in TL. Existing MMD based distri-
bution adaptation approaches consider either the marginal distribu-
tion only [24], or both the marginal and the conditional distributions
with equal weight [5, 19, 20] or different weights [28], even in deep
learning [8, 18] and adversarial learning [7, 21].
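To make the MMD concrete, the following is a minimal sketch (not code from the paper) of the standard biased empirical estimator of the squared MMD with an RBF kernel; the function names and the `gamma` bandwidth are illustrative choices, not anything the paper prescribes.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(Xs, Xt, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two samples.
    It equals the squared RKHS distance between the two mean embeddings,
    so it is non-negative, and near zero for identical distributions."""
    Kss = rbf_kernel(Xs, Xs, gamma)
    Ktt = rbf_kernel(Xt, Xt, gamma)
    Kst = rbf_kernel(Xs, Xt, gamma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()

rng = np.random.default_rng(0)
# Same distribution vs. mean-shifted distribution
same = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(0, 1, (200, 5)))
shifted = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(2, 1, (200, 5)))
# the shifted pair yields a clearly larger MMD than the matched pair
```

Marginal-distribution adaptation methods apply this estimator to the raw source and target samples; conditional-distribution adaptation applies it per class.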
Among them, joint distribution adaptation (JDA) [19] is the most
widely used baseline in TL, whose idea is to measure the distribu-
tion shift between two domains using the marginal and the condi-
tional MMD. Some works extended JDA by adding a regularization
term [20], structural consistency [13], source domain discriminabil-
ity [30], etc. For JDA based approaches, the marginal and conditional
distributions are often treated equally, which may not be optimal; so,
balanced distribution adaptation (BDA) [28] was proposed to give
them different weights. However, both JDA and BDA consider the
marginal and conditional distributions separately, ignoring the intrin-
sic dependency between them. The performance may be improved if
this dependency can be taken into consideration.
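The weighted combination that JDA and BDA minimize can be sketched as follows, here with a linear kernel (distance between sample means) for brevity; the function names, the pseudo-label argument, and the exact weighting form are illustrative assumptions, not the papers' implementations.

```python
import numpy as np

def mean_gap(A, B):
    """Squared distance between sample means (linear-kernel MMD)."""
    return np.sum((A.mean(0) - B.mean(0)) ** 2)

def jda_style_distance(Xs, ys, Xt, yt_pseudo, mu=0.5):
    """Weighted sum of the marginal gap and per-class conditional gaps.
    Target labels are pseudo-labels predicted by a source classifier.
    mu = 0.5 mimics JDA's equal weighting; BDA instead tunes mu."""
    marginal = mean_gap(Xs, Xt)
    classes = np.unique(ys)
    conditional = sum(
        mean_gap(Xs[ys == c], Xt[yt_pseudo == c]) for c in classes)
    return (1 - mu) * marginal + mu * conditional / len(classes)

rng = np.random.default_rng(1)
Xs = rng.normal(0.0, 1.0, (100, 3)); ys = np.repeat([0, 1], 50)
Xs[ys == 1] += 3            # two source classes
Xt = rng.normal(0.5, 1.0, (100, 3)); yt = np.repeat([0, 1], 50)
Xt[yt == 1] += 3            # shifted target with matching classes
d = jda_style_distance(Xs, ys, Xt, yt, mu=0.5)
```

Note that the marginal and conditional terms are computed and summed independently; nothing in this objective ties them together, which is exactly the dependency JPDA sets out to exploit.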
Two measures need to be considered during feature transformation
to facilitate domain adaptation [4]. One is transferability, which mea-
sures how well the feature representation can minimize the cross-
domain discrepancies. The other is discriminability, which measures
how easily different classes can be distinguished by a supervised
classifier. Traditional distribution adaptation approaches usually seek
to achieve high transferability [3,5,19], so that the knowledge learned
from the source domain can be effectively transferred to the target do-
main; however, the feature discriminability has been largely ignored.
This paper considers the scenario that the source and the target
domains share the same feature and label spaces, which is the most
common assumption in TL. Different from joint MMD based ap-
proaches, we do not use (weighted) sum of the marginal and con-
ditional MMDs to estimate the distribution discrepancy; instead, we
use the joint probability distribution directly, which in theory can
better leverage the relationship between different distributions. To
consider both transferability and discriminability simultaneously, we
propose joint probability MMD for distribution adaptation, which
minimizes the distribution discrepancy of the same class between dif-
ferent domains, and maximizes the distribution discrepancy between
different classes.
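The two terms just described can be sketched as a single objective, again with a linear kernel for brevity; the function names, the use of pseudo-labels for the target, and the trade-off parameter `mu` are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def class_mean_gap(A, B):
    """Squared distance between class sample means (linear-kernel MMD)."""
    return np.sum((A.mean(0) - B.mean(0)) ** 2)

def jpda_style_objective(Xs, ys, Xt, yt_pseudo, mu=0.1):
    """Transferability term minus mu times a discriminability term.
    Minimizing it pulls same-class clusters together across domains
    (small same-class joint discrepancy) while pushing different-class
    clusters apart (large cross-class joint discrepancy)."""
    classes = np.unique(ys)
    same = sum(class_mean_gap(Xs[ys == c], Xt[yt_pseudo == c])
               for c in classes)
    diff = sum(class_mean_gap(Xs[ys == c], Xt[yt_pseudo == k])
               for c in classes for k in classes if k != c)
    return same - mu * diff

rng = np.random.default_rng(2)
# Two well-separated classes in both domains
Xs = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
ys = np.repeat([0, 1], 50)
Xt = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
yt = np.repeat([0, 1], 50)
aligned = jpda_style_objective(Xs, ys, Xt, yt)
swapped = jpda_style_objective(Xs, ys, Xt, 1 - yt)
# correctly aligned classes score much lower than swapped labels
```

The key difference from the JDA-style objective above is that both terms operate on class-paired samples, so the marginal and conditional information enter through a single joint discrepancy rather than as two independently optimized sums.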