Adversarial Discriminative Domain Adaptation: An Improved Unsupervised Method for Better Generalization Performance
Judy Hoffman, Stanford University (jhoffman@cs.stanford.edu)
Kate Saenko, Boston University (saenko@bu.edu)
Trevor Darrell, University of California, Berkeley (trevor@eecs.berkeley.edu)

Abstract

Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They can also improve recognition despite the presence of domain shift or dataset bias: recent adversarial approaches to unsupervised domain adaptation reduce the difference between the training and test domain distributions and thus improve generalization performance. However, while generative adversarial networks (GANs) show compelling visualizations, they are not optimal on discriminative tasks and can be limited to smaller shifts. On the other hand, discriminative approaches can handle larger domain shifts, but impose tied weights on the model and do not exploit a GAN-based loss. In this work, we first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and use this generalized view to better relate prior approaches. We then propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard domain adaptation tasks as well as a difficult cross-modality object classification task.

1. Introduction

Deep convolutional networks, when trained on large-scale datasets, can learn representations which are generically useful across a variety of tasks and visual domains [1, 2].
However, due to a phenomenon known as dataset bias or domain shift [3], recognition models trained along with these representations on one large dataset do not generalize well to new domains.

[Figure 1: source and target encoders with a classifier and a domain discriminator.]

Recent adaptation methods map both domains into a common feature space. This is generally achieved by optimizing the representation to minimize some measure of domain shift such as maximum mean discrepancy [5, 6] or correlation distances [7, 8]. An alternative is to reconstruct the target domain from the source representation [9].

Adversarial adaptation methods have become an increasingly popular incarnation of this type of approach, which seeks to minimize an approximate domain discrepancy distance through an adversarial objective with respect to a domain discriminator. These methods are closely related to generative adversarial learning [10], which pits two networks against each other: a generator and a discriminator. The generator is trained to produce images in a way that confuses the discriminator, which in turn tries to distinguish them from real image examples. In domain adaptation, this principle has been employed to ensure that the network cannot distinguish between the distributions of its training and test domain examples [11, 12, 13]. However, each algorithm makes different design choices, such as whether to use a generator, which loss function to employ, or whether to share weights across domains. For example, [11, 12] share weights and learn a symmetric mapping of both source and target images to the shared feature space, while [13] decouples some layers, thus learning a partially asymmetric mapping.

In this work, we propose a novel unified framework for adversarial domain adaptation, allowing us to effectively examine the different factors of variation between the existing approaches and clearly view the similarities they each share.
Our framework unifies design choices such as weight-sharing, base models, and adversarial losses, and subsumes previous work, while also facilitating the design of novel instantiations that improve upon existing ones.

In particular, we observe that generative modeling of input image distributions is not necessary, as the ultimate task is to learn a discriminative representation. On the other hand, asymmetric mappings can better model the difference in low-level features than symmetric ones. We therefore propose a previously unexplored unsupervised adversarial adaptation method, Adversarial Discriminative Domain Adaptation (ADDA), illustrated in Figure 1. ADDA first learns a discriminative representation using the labels in the source domain, and then learns a separate encoding that maps the target data to the same space, using an asymmetric mapping learned through a domain-adversarial loss. Our approach is simple yet surprisingly powerful and achieves state-of-the-art visual adaptation results on the MNIST, USPS, and SVHN digits datasets. We also test its potential to bridge the gap between even more difficult cross-modality shifts, without requiring instance constraints, by transferring object classifiers from RGB color images to depth observations. Finally, we evaluate on the standard Office adaptation dataset, and show that ADDA achieves strong improvements over competing methods, especially on the most challenging domain shift.

2. Related work

There has been extensive prior work on domain transfer learning; see e.g. [3]. Recent work has focused on transferring deep neural network representations from a labeled source dataset to a target domain where labeled data is sparse or non-existent.
In the case of unlabeled target domains (the focus of this paper), the main strategy has been to guide feature learning by minimizing the difference between the source and target feature distributions [11, 12, 5, 6, 8, 9, 13, 14].

Several methods have used the Maximum Mean Discrepancy (MMD) [3] loss for this purpose. MMD computes the norm of the difference between two domain means. The DDC method [5] used MMD in addition to the regular classification loss on the source to learn a representation that is both discriminative and domain invariant. The Deep Adaptation Network (DAN) [6] applied MMD to layers embedded in a reproducing kernel Hilbert space, effectively matching higher-order statistics of the two distributions. In contrast, the deep Correlation Alignment (CORAL) [8] method proposed to match the mean and covariance of the two distributions.

Other methods have chosen an adversarial loss to minimize domain shift, learning a representation that is simultaneously discriminative of source labels while not being able to distinguish between domains. [12] proposed adding a domain classifier (a single fully connected layer) that predicts the binary domain label of the inputs, and designed a domain confusion loss to encourage its prediction to be as close as possible to a uniform distribution over binary labels. The gradient reversal algorithm (ReverseGrad) proposed in [11] also treats domain invariance as a binary classification problem, but directly maximizes the loss of the domain classifier by reversing its gradients. DRCN [9] takes a similar approach but also learns to reconstruct target domain images.
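The simplest instance of the MMD criterion above, the (squared) norm of the difference between the two domain means as used by DDC, fits in a few lines of NumPy. This is an illustrative sketch, not code from any of the cited methods:

```python
import numpy as np

def linear_mmd(source_feats, target_feats):
    """Squared Euclidean distance between the mean source feature
    vector and the mean target feature vector (linear-kernel MMD)."""
    delta = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(delta @ delta)

# Matching means give zero discrepancy; a shifted target mean does not.
xs = np.array([[0.0, 0.0], [2.0, 2.0]])
xt_same = np.array([[1.0, 1.0]])   # same mean as xs
xt_shift = np.array([[4.0, 1.0]])  # mean shifted by (3, 0)
print(linear_mmd(xs, xt_same))   # 0.0
print(linear_mmd(xs, xt_shift))  # 9.0
```

Minimizing such a quantity alongside the source classification loss is, per the text, the essence of DDC's training objective; DAN generalizes it to kernel embeddings.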
Domain separation networks [15] enforce these adversarial losses to minimize domain shift in a shared feature space, but achieve impressive results by augmenting their model with private per-domain feature spaces, an additional dissimilarity loss between the shared and private spaces, and a reconstruction loss.

In related work, adversarial learning has been explored for generative tasks. The Generative Adversarial Network (GAN) method [10] is a generative deep model that pits two networks against one another: a generative model G that captures the data distribution, and a discriminative model D that distinguishes between samples drawn from G and images drawn from the training data by predicting a binary label. The networks are trained jointly using backpropagation on the label prediction loss in a minimax fashion: simultaneously update G to minimize the loss while also updating D to maximize the loss (fooling the discriminator). The advantage of GANs over other generative methods is that there is no need for complex sampling or inference during training; the downside is that they may be difficult to train. GANs have been applied to generate natural images of objects, such as digits and faces, and have been extended in several ways. The BiGAN approach [16] extends GANs to also learn the inverse mapping from the image data back into the latent space, and shows that this can learn features useful for image classification tasks. The conditional generative adversarial net (CGAN) [17] is an extension of the GAN in which both networks G and D receive an additional vector of information as input. This might contain, say, information about the class of the training example. The authors apply CGAN to generate a (possibly multi-modal) distribution of tag-vectors conditional on image features.
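For reference, the minimax training described above is usually summarized by the standard GAN value function from [10] (reproduced here for context, using the conventional notation rather than notation introduced in this paper):

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$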
GANs have also been explicitly applied to domain transfer tasks, such as domain transfer networks [18], which seek to directly map source images into target images.

Recently the CoGAN [13] approach applied GANs to the domain transfer problem by training two GANs to generate the source and target images respectively. The approach achieves a domain-invariant feature space by tying the high-level layer parameters of the two GANs, and shows that the same noise input can generate a corresponding pair of images from the two distributions. Domain adaptation was performed by training a classifier on the discriminator output, and was applied to shifts between the MNIST and USPS digit datasets. However, this approach relies on the generators finding a mapping from the shared high-level layer feature space to full images in both domains. This can work well for, say, digits, but may be difficult in the case of more distinct domains. In this paper, we observe that modeling the image distributions is not strictly necessary to achieve domain adaptation, as long as the latent feature space is domain invariant, and we propose a discriminative approach.

3. Generalized adversarial adaptation

We present a general framework for adversarial unsupervised adaptation methods. In unsupervised adaptation, we assume access to source images Xs and labels Ys drawn from a source domain distribution ps(x, y), as well as target images Xt drawn from a target distribution pt(x, y), where there are no label observations. Our goal is to learn a target representation Mt and classifier Ct that can correctly classify target images into one of K categories at test time, despite the lack of in-domain annotations.
Since direct supervised learning on the target is not possible, domain adaptation instead learns a source representation mapping Ms, along with a source classifier Cs, and then learns to adapt that model for use in the target domain.

In adversarial adaptive methods, the main goal is to regularize the learning of the source and target mappings, Ms and Mt, so as to minimize the distance between the empirical source and target mapping distributions: Ms(Xs) and Mt(Xt).

[Figure 2: Our generalized architecture for adversarial domain adaptation. Existing adversarial adaptation methods can be viewed as instantiations of our framework with different choices regarding their properties: generative or discriminative base model, tied or untied weights, and the choice of adversarial objective.]

If this is the case, then the source classification model Cs can be directly applied to the target representations, eliminating the need to learn a separate target classifier and instead setting C = Cs = Ct.

The source classification model is then trained using the standard supervised loss:

$$\min_{M_s, C} \; \mathcal{L}_{\mathrm{cls}}(X_s, Y_s) = -\,\mathbb{E}_{(x_s, y_s) \sim (X_s, Y_s)} \sum_{k=1}^{K} \mathbb{1}_{[k = y_s]} \log C(M_s(x_s)) \quad (1)$$

We are now able to describe our full general framework view of adversarial adaptation approaches. We note that all approaches minimize source and target representation distances through alternating minimization between two functions. First, a domain discriminator D classifies whether a data point is drawn from the source or the target domain. Thus, D is optimized according to a standard supervised loss, L_advD(Xs, Xt, Ms, Mt), where the labels indicate the origin domain, defined below:

$$\mathcal{L}_{\mathrm{adv}_D}(X_s, X_t, M_s, M_t) = -\,\mathbb{E}_{x_s \sim X_s}[\log D(M_s(x_s))] - \mathbb{E}_{x_t \sim X_t}[\log(1 - D(M_t(x_t)))] \quad (2)$$

Second, the source and target mappings are optimized according to a constrained adversarial objective, whose particular instantiation may vary across methods.
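Equations (1) and (2) can be made concrete with a small NumPy sketch (a toy illustration with assumed shapes, not the paper's implementation; the classifier C and discriminator D are stand-ins that already output probabilities):

```python
import numpy as np

def source_cls_loss(probs, labels):
    """Eq. (1): supervised cross-entropy on the source domain.
    probs  -- (N, K) class probabilities C(M_s(x_s))
    labels -- (N,) integer source labels y_s"""
    n = probs.shape[0]
    return float(-np.mean(np.log(probs[np.arange(n), labels])))

def disc_loss(d_source, d_target):
    """Eq. (2): domain-discriminator loss, source labeled 1, target 0.
    d_source, d_target -- discriminator outputs D(.) in (0, 1)"""
    return float(-np.mean(np.log(d_source)) - np.mean(np.log(1.0 - d_target)))

# Toy values: a fairly confident classifier and discriminator.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
d_s, d_t = np.array([0.9]), np.array([0.1])
print(source_cls_loss(probs, labels))  # -(log 0.9 + log 0.8) / 2
print(disc_loss(d_s, d_t))             # small: D separates the domains well
```

Both losses fall toward zero as the classifier and discriminator become confident and correct, which is exactly what the alternating scheme of the next section exploits.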
Thus, we can derive a generic formulation for domain adversarial techniques:

$$\min_{D} \; \mathcal{L}_{\mathrm{adv}_D}(X_s, X_t, M_s, M_t)$$
$$\min_{M_s, M_t} \; \mathcal{L}_{\mathrm{adv}_M}(X_s, X_t, D) \quad \text{s.t.} \quad \psi(M_s, M_t) \quad (3)$$

Table 1: Overview of adversarial domain adaptation methods and their various properties. Viewing methods under a unified framework enables us to easily propose a new adaptation method, adversarial discriminative domain adaptation (ADDA).

    Method                   Base model       Weight sharing   Adversarial loss
    Gradient reversal [19]   discriminative   shared           minimax
    Domain confusion [12]    discriminative   shared           confusion
    CoGAN [13]               generative       unshared         GAN
    ADDA (Ours)              discriminative   unshared         GAN

In the next sections, we demonstrate the value of our framework by positioning recent domain adversarial approaches within our framework. We describe the potential mapping structure, the mapping optimization constraints (ψ(Ms, Mt)), and finally the choices of adversarial mapping loss, L_advM.

3.1. Source and target mappings

In the case of learning a source mapping Ms alone, it is clear that supervised training through a latent-space discriminative loss using the known labels Ys results in the best representation for final source recognition. However, given that our target domain is unlabeled, it remains an open question how best to minimize the distance between the source and target mappings. Thus the first choice to be made is the particular parameterization of these mappings.

Because unsupervised domain adaptation generally considers target discriminative tasks such as classification, previous adaptation methods have generally relied on adapting discriminative models between domains [12, 19]. With a discriminative base model, input images are mapped into a feature space that is useful for a discriminative task such as image classification. For example, in the case of digit classification this may be the standard LeNet model. However, Liu and Tuzel achieve state-of-the-art results on unsupervised MNIST-USPS adaptation using two generative adversarial networks [13].
These generative models use random noise as input to generate samples in image space; generally, an intermediate feature of an adversarial discriminator is then used as a feature for training a task-specific classifier.

Once the mapping parameterization is determined for the source, we must decide how to parameterize the target mapping Mt. In general, the target mapping almost always matches the source in terms of the specific functional layers (architecture), but different methods have proposed various regularization techniques. All methods initialize the target mapping parameters with the source, but different methods choose different constraints between the source and target mappings.
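To see the alternating scheme of Eq. (3) in action, here is a deliberately tiny, entirely hypothetical 1-D example: the fixed source features sit near 0, the target mapping M_t is a single learnable shift b, the discriminator is one logistic unit, and all gradients are written out by hand. Nothing here comes from the paper's implementation; it only illustrates the two alternating updates:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

xs = np.array([-1.0, 0.0, 1.0])   # fixed source features M_s(x_s)
xt = np.array([4.0, 5.0, 6.0])    # raw target inputs; M_t(x) = x + b
b = 0.0                           # learnable target shift (the "mapping")
w, c = 0.0, 0.0                   # logistic discriminator D(z) = sigmoid(w*z + c)
lr_d, lr_m = 0.1, 0.05

for _ in range(800):
    zt = xt + b
    # Discriminator step: descend Eq. (2), source labeled 1, target labeled 0.
    ps, pt = sigmoid(w * xs + c), sigmoid(w * zt + c)
    grad_w = np.mean(np.concatenate([(ps - 1.0) * xs, pt * zt]))
    grad_c = np.mean(np.concatenate([ps - 1.0, pt]))
    w -= lr_d * grad_w
    c -= lr_d * grad_c
    # Mapping step: GAN-style inverted label, minimize -log D(M_t(x_t)) over b.
    pt = sigmoid(w * (xt + b) + c)
    b -= lr_m * np.mean((pt - 1.0) * w)

gap = abs(np.mean(xt) + b - np.mean(xs))
print(gap)  # remaining gap between domain means, down from the initial 5.0
```

The adversarial pressure drives b toward -5, shrinking the gap between the mapped target distribution and the source distribution, which is the regularization effect the framework formalizes.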