The Effect of Auxiliary Data on Robustness in Deep Neural Networks


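Saehyung Lee*, Hyungyu Lee
Department of Electrical and Computer Engineering, Seoul National University
* Correspondence to: Saehyung Lee <halo8218@snu.ac.kr>

Abstract

Several recent studies have shown that the use of extra in-distribution data can lead to a high level of adversarial robustness. However, there is no guarantee that such data can always be obtained for a selected dataset. In this paper, we propose a biased multi-domain adversarial training (BiaMAT) method that uses publicly available auxiliary datasets and does not require a class distribution match between the primary and auxiliary datasets. The proposed method can increase adversarial robustness on a primary dataset by leveraging auxiliary datasets via multi-domain learning. Specifically, the effect of BiaMAT on both robust and non-robust features is demonstrated through a theoretical and empirical analysis. We also demonstrate that, while existing methods are vulnerable to negative transfer caused by the distributional discrepancy between auxiliary and primary data, the proposed method enables neural networks to handle this domain discrepancy during training through a selection strategy. The pre-trained models and code are available at https://github.com/saehyung-lee/BiaMAT.

1. Introduction

Many approaches [1] have been proposed to bridge the gap in adversarial robustness between humans and deep neural networks (DNNs). Among these, adversarial training, which uses adversarial examples as training data, is considered the most effective way to improve the robustness of DNNs. Unfortunately, as reported by Schmidt et al. [34], the sample complexity of adversarially robust generalization is substantially higher than that of standard generalization. To address this issue, several recent studies [4, 38] leveraged extra (in-distribution) unlabeled data and developed methods for improving the sample complexity of robust generalization. However, although such methods enable state-of-the-art adversarial robustness, extra in-distribution data cannot always be obtained for an arbitrary data distribution.

Figure 1. Visualization of robust and non-robust features (panels: training images, robust features, non-robust features).

In this paper, we propose a method based on the use of publicly available (labeled) auxiliary datasets. The proposed method is expected to be effective under the following assumption:

Assumption 1. A common robust and non-robust feature space exists between the primary and auxiliary datasets.

Figure 1 shows that robust features [21] exhibit human-perceptible patterns. We can therefore expect that two datasets that look similar from a human perspective share robust features. Fortunately, recent studies [28, 24] have also provided empirical evidence in support of the presence of a common non-robust feature space among diverse image datasets. Therefore, unlike existing state-of-the-art methods [4, 38], which employ in-distribution data, under BiaMAT the distributions of the auxiliary dataset and the corresponding primary dataset can differ. By applying BiaMAT, the proposed method achieves an inductive transfer between adversarial training on the primary dataset and adversarial training on the auxiliary dataset, which can be considered an increase in the size of the training dataset [5].

In addition, based on studies that have demonstrated the presence of non-robust features [40, 21], we classify the effects of adversarial training into two types and demonstrate the usefulness of the proposed method irrespective of the type considered. In particular, we distinguish non-robust feature regularization from robust feature learning and evaluate the contribution of each through the use of random labels [5, 24]. Our experimental results on the CIFAR datasets and ImageNet demonstrate that BiaMAT can effectively use the training signals generated from various auxiliary datasets. Furthermore, we show that while existing methods are vulnerable to negative transfer due to the distributional discrepancy between auxiliary and primary data, the proposed method leverages diverse image datasets for adversarial training by successfully dealing with the domain discrepancy.

2. Biased multi-domain adversarial training

2.1. Method

Existing state-of-the-art methods that leverage extra unlabeled data [4, 38] first discard part of the data from a given auxiliary dataset and then use the remaining data for training with pseudo-labels. To maximize the data utility, we instead rely on multi-domain learning. Multi-domain learning is a strategy for improving the performance of tasks that solve the same problem across multiple domains by sharing information across these domains. In the standard setting, e.g., [33], data belonging to the same class (e.g., keyboard) are considered across domains. Our setting instead considers the relationship between different datasets; in particular, [28] illustrates that common adversarial spaces can exist across different datasets. Therefore, our proposed method extends the range of related domains, relative to the standard setting, with the goal of maximizing adversarial robustness.

Figure 2. Overview of BiaMAT. A shared feature embedding function is used for both the primary and auxiliary tasks, while each task has its own prediction function. (a) The given auxiliary data are divided into two groups. (b) The model is trained adversarially.

We classify given auxiliary data into two types: (i) auxiliary data sharing both robust and non-robust features with the primary dataset, and (ii) auxiliary data sharing only non-robust features with the primary dataset. We do not consider auxiliary data sharing only robust features with the primary dataset, based on studies demonstrating the presence of a common non-robust feature space among diverse image datasets [36, 24]. As previous multi-domain learning studies observed [5, 27], weakly related tasks can harm task performance. By contrast, as shown in Sec. 2.2, the use of random labels can improve adversarial robustness while avoiding negative transfer. Based on our theoretical analysis, we propose using random labels ($y_{\text{ER}}$) on (ii). Interestingly, this choice is related to prior work on semi-supervised learning [37, 32] and OOD detection [18], and to [24], which uses OOD data to improve adversarial robustness. We compare our method with [24] in Appendix D.

Setting and overview. We are given a primary dataset $D$ and an auxiliary dataset $\tilde{D}$. $\tilde{D}$ is divided into two data subsets, $\tilde{D}_{\text{high}}$ and $\tilde{D}_{\text{low}}$. $h_{\text{pri}}$ and $h_{\text{aux}}$ are the output class probabilities for the primary and auxiliary datasets, respectively, each computed by a task-specific prediction function on top of the shared feature embedding function $g$. Our goal is to attain a function $g$ with a small adversarial loss on the primary dataset. The loss function for $D$ is

$$\mathcal{L}_{\text{pri}} = \mathbb{E}_{(x, y) \in D}\left[\ell_{\text{adv}}(x, y; h_{\text{pri}}, S)\right],$$

where $S$ is the set of perturbations an adversary can apply. Any existing adversarial loss [25, 43] can be employed for $\ell_{\text{adv}}$. In addition, the loss for the auxiliary data is

$$\mathcal{L}_{\text{aux}} = \frac{1}{|\tilde{D}_{\text{high}}|} \sum_{(\tilde{x}, \tilde{y}) \in \tilde{D}_{\text{high}}} \ell_{\text{adv}}(\tilde{x}, \tilde{y}; h_{\text{aux}}, S) + \frac{1}{|\tilde{D}_{\text{low}}|} \sum_{(\tilde{x}, \tilde{y}) \in \tilde{D}_{\text{low}}} \ell_{\text{adv}}(\tilde{x}, y_{\text{ER}}; h_{\text{pri}}, S).$$

The proposed method minimizes the following loss:

$$\mathcal{L}_{\text{BiaMAT}} = \mathcal{L}_{\text{pri}} + \alpha \mathcal{L}_{\text{aux}}, \quad \alpha \in [0, 1],$$

where $\alpha$ is a hyperparameter that biases the multi-domain learning toward the primary task. Although we describe BiaMAT at the dataset level, it is applied at the batch level. Figure 2 provides an overview of BiaMAT.

2.2. Theoretical motivation

We motivate the proposed method in terms of non-robust feature regularization and robust feature learning, which are the two effects of adversarial training. We consider two tasks and use a tilde accent to denote quantities of the auxiliary task.

Preliminary. Tsipras et al. [40] described the effect of adversarial training through a task in which each training example $(x, y) \in \mathbb{R}^{d+1} \times \{\pm 1\}$ is drawn from the following distribution:

$$y \overset{\text{u.a.r.}}{\sim} \{-1, +1\}, \qquad x_1 = \begin{cases} +y & \text{w.p. } p \\ -y & \text{w.p. } 1 - p \end{cases}, \qquad x_2, \dots, x_{d+1} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\eta y, 1).$$

Here $x_1$ is a robust feature that is correlated with the label ($p \geq 0.5$), and the remaining features $x_2, \dots, x_{d+1}$ are non-robust features that are vulnerable to adversarial attacks ($\ell_\infty$-bound). For this data distribution, a standard accuracy close to 100% can be achieved by relying on the non-robust features.

Lemma 1 (Tsipras et al.). Adversarial training results in a classifier that assigns zero weight to the features $x_2, \dots, x_{d+1}$.

According to Lemma 1, adversarial training has two effects: it discourages reliance on non-robust features and it drives the learning of robust features. We refer to these effects as non-robust feature regularization and robust feature learning, respectively.

We model the auxiliary data in the feature space with a distribution of the same form, whose parameters take values in $[-1, 1]$. If the two datasets are highly correlated in terms of robust and non-robust features, the training signals obtained from the auxiliary adversarial loss and back-propagated to the shared feature embedding function have the same effect as those of the primary task.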
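The role of the robust and non-robust features in the distribution above can be checked numerically. The following is a minimal sketch, not code from the paper: the values of `d`, `p`, and `eta`, the averaging classifier, and the attack budget `2 * eta` are illustrative assumptions chosen so that the effect is visible with a small sample.

```python
import math
import random


def sample(d, p, eta, rng):
    """Draw one (x, y): x[0] is the robust feature, x[1:] are the d non-robust ones."""
    y = rng.choice((-1, 1))
    x1 = y if rng.random() < p else -y
    rest = [rng.gauss(eta * y, 1.0) for _ in range(d)]
    return [float(x1)] + rest, y


def sign(v):
    return 1 if v >= 0 else -1


def predict_nonrobust(x):
    """Classify by the sign of the mean of the non-robust features."""
    return sign(sum(x[1:]) / (len(x) - 1))


def predict_robust(x):
    """Classify by the sign of the robust feature only."""
    return sign(x[0])


def linf_attack(x, y, eps):
    """Shift every coordinate by eps against the label (an l_inf-bounded perturbation)."""
    return [xi - eps * y for xi in x]


def accuracy(predict, data, eps=0.0):
    correct = 0
    for x, y in data:
        if eps:
            x = linf_attack(x, y, eps)
        correct += predict(x) == y
    return correct / len(data)


# Illustrative (assumed) parameters: eta scales as 1/sqrt(d), eps = 2 * eta < 1.
rng = random.Random(0)
d, p = 100, 0.95
eta = 3.0 / math.sqrt(d)
data = [sample(d, p, eta, rng) for _ in range(2000)]

clean_nonrobust = accuracy(predict_nonrobust, data)
clean_robust = accuracy(predict_robust, data)
adv_nonrobust = accuracy(predict_nonrobust, data, eps=2 * eta)
adv_robust = accuracy(predict_robust, data, eps=2 * eta)

print("non-robust classifier: clean", clean_nonrobust, "adversarial", adv_nonrobust)
print("robust classifier:     clean", clean_robust, "adversarial", adv_robust)
```

Under these assumptions the averaging classifier is nearly perfect on clean data but collapses under the attack, whereas the classifier that uses only $x_1$ stays at roughly $p$ in both cases, because a perturbation smaller than 1 cannot flip the sign of $x_1$.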
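Non-robust feature regularization. To study how adversarial training on the auxiliary task affects the non-robust feature regularization for the primary task, we derive the gradients of the primary and auxiliary adversarial losses with respect to the non-robust features, which are then back-propagated through the shared feature embedding function $g$. In addition, we demonstrate how the use of random labels enables us to dissociate the compound effect of the proposed method into the effect of non-robust feature regularization and that of robust feature learning.

We first characterize the adversarial feature vector $x_{\text{adv}}$. The objective function of the adversary to deceive our model is the cross-entropy loss.

Lemma 2. For $i \in \{2, \dots, d+1\}$, the components of the adversarial feature vector follow the distribution given in Eq. (6). The proof is given in Appendix A.

We then apply stochastic gradient descent to the cross-entropy loss and examine the gradient of the loss with respect to the adversarial feature vector.

Theorem 1. Let $\ell(\cdot\,; v)$ and $\tilde{\ell}(\cdot\,; v)$ be the loss functions of the primary and auxiliary tasks, respectively. When the auxiliary data are highly correlated with the primary data in terms of robust and non-robust features, the expected gradients of the two losses with respect to $x_{\text{adv}, i}$, $i \in \{2, \dots, d+1\}$, have the same sign (Eqs. (7) and (9)).

For the case in which the auxiliary task is adversarially trained on randomly labeled data, we have the following result.

Theorem 2. Let $\tilde{\ell}(\cdot\,; v)$ be the loss function of the auxiliary task. Then, in the fully correlated case, with high probability the sign of $x_{\text{adv}, i}$ for $i \in \{2, \dots, d+1\}$ and the sign of the auxiliary loss gradient with respect to $x_{\text{adv}, i}$ are the same.

Because the gradient with respect to $\tilde{x}_{\text{adv}}$ has the same sign as $\tilde{x}$ with high probability, applying gradient descent makes the shared feature embedding function refrain from using non-robust features. This holds even though the use of random labels completely marginalizes the correlation between images and labels. From Lemma 2 and Theorem 1, the auxiliary task therefore contributes to non-robust feature regularization for the primary task.

Robust feature learning. In the fully correlated case, if the weight value for the robust feature $x_1$ is non-zero, the auxiliary task on $\tilde{x}_{\text{adv}}$ clearly leads to robust feature learning as well as non-robust feature regularization. However, when the auxiliary dataset contains extraneous robust features, learning on $\tilde{x}_{\text{adv}}$ may lead to negative transfer, which suppresses the advantages of multi-domain learning. To prevent inductive transfer between tasks, Caruana [5] used random labels on a dataset. Similarly, in our case, the use of random labels on auxiliary inputs can be considered as a way to prevent $g$ from learning the robust features of the auxiliary dataset.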
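The analysis above works with gradients of the cross-entropy loss with respect to (adversarial) features. As a minimal sketch under stated assumptions, and not the paper's derivation, the code below uses a linear binary classifier with weights `w` (a hypothetical stand-in for the feature-space model), computes the gradient of the logistic loss with respect to the features, and builds the $\ell_\infty$-bounded adversarial feature vector from the sign of that gradient. For a linear model this one-step perturbation is the exact worst case inside the $\ell_\infty$ ball.

```python
import math


def margin(w, x, y):
    """Signed margin y * <w, x> of a linear binary classifier."""
    return y * sum(wi * xi for wi, xi in zip(w, x))


def logistic_loss(w, x, y):
    """Cross-entropy of the linear classifier: log(1 + exp(-margin)), computed stably."""
    m = margin(w, x, y)
    return max(-m, 0.0) + math.log1p(math.exp(-abs(m)))


def grad_wrt_features(w, x, y):
    """Gradient of the logistic loss with respect to the feature vector x."""
    s = 1.0 / (1.0 + math.exp(margin(w, x, y)))  # sigmoid(-margin)
    return [-y * s * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def adversarial_features(w, x, y, eps):
    """One signed-gradient step of size eps (the l_inf worst case for a linear model)."""
    g = grad_wrt_features(w, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


# Hypothetical weights and features: index 0 plays the role of the robust feature.
w = [0.5, 0.2, 0.2, -0.1]
x = [1.0, 0.3, -0.2, 0.4]
y = 1
eps = 0.1

x_adv = adversarial_features(w, x, y, eps)
clean_loss = logistic_loss(w, x, y)
adv_loss = logistic_loss(w, x_adv, y)
print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

Each feature moves by `eps` in the direction `-y * sign(w_i)`, so the margin drops by `eps` times the $\ell_1$ norm of `w`. This is why features that carry weight but have a small mean, such as the non-robust features in the distribution of the preliminary, are the ones an adversary exploits.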
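When the auxiliary dataset is eligible for robust feature learning, the auxiliary task on the adversarial example corresponding to the robust feature $x_1$ produces a training signal that gradient descent passes on to the shared feature embedding function, as follows.

Theorem 3. Let $\tilde{\ell}(\cdot\,; v)$ be the loss function of the auxiliary task. Then, in the fully correlated case with $w_1 > 0$, the sign of the auxiliary loss gradient with respect to $x_{\text{adv}, 1}$ is determined, with high probability, by the sign of $x_{\text{adv}, 1}$.

With random labels, by contrast, an adversary can always induce a large loss regardless of $q$, and the resulting gradient is independent of $q$. Random labels therefore do not affect the robust feature learning for the primary task. In practice, one-hot random labels cause problems during training. To resolve this issue, we use the expectation of random labels instead of the one-hot random labels, i.e., $y_{\text{ER}} = [1/C, \dots, 1/C]$, where $C$ is the number of classes, so that the target is a uniform posterior. Interestingly, Hendrycks et al. [18] proposed a similar uniform target for OOD samples.

2.3. Empirical evidence

To empirically demonstrate these arguments, we use CIFAR-10 as the primary dataset $D$. Here, we use SVHN [28], CIFAR-100, and Places365 as auxiliary datasets that are weakly related to CIFAR-10 in terms of robust features, based on previous OOD detection studies [18, 35].

Table 1. Accuracy (%) on the primary task for each auxiliary dataset when using $\tilde{y}$ or $y_{\text{ER}}$. The best result for each auxiliary dataset is marked in bold; auxiliary datasets for which $y_{\text{ER}}$ produces the better result are marked in red. Columns: Baseline [25], SVHN, CIFAR-100, Places365, ImageNet.
Reported values: 47.44, 48.48, 48.88, 50.33, 48.53, 48.53, 49.89, 49.24, 49.81.

As shown in Table 1, for SVHN, CIFAR-100, and Places365, the use of $y_{\text{ER}}$ leads to better adversarial robustness than that of $\tilde{y}$, even though the image-label mappings in the auxiliary datasets are disrupted. These results demonstrate that the robust feature learning on the SVHN, CIFAR-100, and Places365 datasets is detrimental to the primary task on CIFAR-10. By contrast, using $y_{\text{ER}}$ with ImageNet reduces the performance improvement, which indicates that useful robust feature learning can be achieved from the auxiliary task on ImageNet.

Our analysis demonstrates that, when an auxiliary dataset is used for a primary task, the best algorithm to apply differs across datasets, reflecting the relationship between the two datasets in terms of robust features. In real-world scenarios, however, an auxiliary dataset may contain robust features that are desirable for the primary task as well as ones that are undesirable. Hence, we introduce a sample-wise selection strategy in our proposed method. The selection strategy classifies the given auxiliary data into two groups based on the robust features, and robust features exhibit human-perceptible patterns, as shown in Fig. 1. The goal of the selection strategy is therefore consistent with that of existing OOD detection methods [18, 35]. After a few epochs of training, the confidence of the primary classifier, $\max h_{\text{pri}}(\tilde{x})$, is used to sort out the auxiliary data samples that are likely to cause negative transfer: samples with a lower-than-threshold confidence are trained with $y_{\text{ER}}$, and the remaining auxiliary data with $\tilde{y}$. The pseudo-code is provided in Algorithm 1.

Algorithm 1. Biased multi-domain adversarial training (BiaMAT). For each auxiliary mini-batch $\tilde{B}$, the confidence $\max h_{\text{pri}}(\tilde{x})$ is computed for every sample, the batch is split into a high-confidence and a low-confidence group, and the auxiliary loss combines $\mathcal{L}_{\text{low}} + \mathcal{L}_{\text{high}}$.

Figure 3. The ratio $n_{\text{high}} / n_{\text{aux}}$ for each auxiliary dataset, with respect to the primary task on CIFAR-10.

Adversarial attack methods. The fast gradient sign method (FGSM) [15] is a one-step attack using the sign of the gradient. Madry et al. [25] proposed an iterative application of FGSM (PGD). The Carlini-Wagner (CW) attack is also used. AutoAttack (AA) [10] is an ensemble attack that consists of two PGD extensions, one white-box attack [9], and one black-box attack [2]. We focus on $\ell_\infty$-robustness, the most common robustness scenario.

3. Experimental results and discussion

3.1. Experimental Setup

Datasets. We complement our analysis with experiments on the CIFAR datasets and ImageNet. ImageNet is downscaled to 64×64 and then randomly divided into datasets containing 100 and 900 classes, denoted ImgNet100 and ImgNet900, respectively. SVHN [29], Places365, and ImageNet are used as auxiliary datasets.

Implementation details. In our experiments, we use the methods of Madry et al. [25] and Zhang et al. [43] as baselines, denoted AT and TRADES, respectively. On the CIFAR datasets, we use WRN28-10 [42] and WRN34-10 for AT and TRADES, respectively. Although increasing the number of training epochs is expected to lead to higher adversarial robustness because of the use of additional data, owing to the high computational complexity of adversarial training we restrict the training of BiaMAT to 100 or 110 epochs with a batch size of 256 (128 primary and 128 auxiliary data samples). To evaluate adversarial robustness, we apply multiple attacks, including PGD, CW, and AA, under the same $\ell_\infty$ bound as in the training setting. PGD and CW with $K$ iterations are denoted by PGD$^K$ and CW$^K$, respectively, and the unperturbed test set is denoted by Clean. We consistently select the best checkpoint [41] when measuring the adversarial robustness of a model on the test set. Further implementation details, including ablation studies on the choice of hyperparameter values, are provided in the appendix.

3.2. Adversarial robustness under various attacks

Table 2 summarizes the improvements in adversarial robustness obtained with the proposed method. The proposed method can freely use various auxiliary datasets because it avoids negative transfer.

Table 2. Performance improvements (accuracy %) on CIFAR-10, CIFAR-100, and ImgNet100 following application of the proposed method using various datasets. The best result within each baseline method (AT and TRADES) is marked.

Primary     Method                 Auxiliary    Clean   PGD100  CW100   AA
CIFAR-10    AT                     -            77.34   51.90   51.40   48.61
            AT+BiaMAT (ours)       CIFAR-100    ...
                                   Places365    87.76   57.00   51.70   49.48
                                   ImageNet     88.75   57.63   53.04   50.78
            TRADES                 -            85.49   56.86   55.21   53.94
            TRADES+BiaMAT (ours)   CIFAR-100    ...
                                   Places365    87.18   59.15   56.36   55.24
                                   ImageNet     88.03   59.80   58.01   56.64
CIFAR-100   AT                     -            ...
            AT+BiaMAT (ours)       Places365    63.44   32.61   28.53   26.49
                                   ImageNet     64.05   33.74   29.78   27.65
            TRADES                 -            ...
            TRADES+BiaMAT (ours)   Places365    64.58   34.38   30.72   29.24
                                   ImageNet     65.82   36.36   33.42   31.87
ImgNet100   AT                     -            ...
            AT+BiaMAT (ours)       Places365    70.04   40.52   33.24   30.64
                                   ImgNet900    68.00   40.18   35.00   32.88
            TRADES                 -            ...
            TRADES+BiaMAT (ours)   Places365    57.80   29.30   24.14   23.06
                                   ImageNet     58.76   31.26   25.98   24.98

Fig. 3 and Tab. 2 show that the proposed method effectively overcomes negative transfer. To examine how the selection strategy works while a model is trained with the proposed method, we measure the ratio $n_{\text{high}} / n_{\text{aux}}$, where $n_{\text{aux}}$ is the number of auxiliary data and $n_{\text{high}}$ is the number of auxiliary data with higher-than-threshold confidence. That is, the ratio represents the percentage of data used for robust feature learning as well as non-robust feature regularization for an auxiliary dataset. Figure 3 shows the ratio $n_{\text{high}} / n_{\text{aux}}$, at a hyperparameter value of 0.55, for AT+BiaMAT models using various auxiliary datasets. A relatively high percentage of ImageNet data are used for robust feature learning, and each of the SVHN, CIFAR-100, and Places365 datasets is mostly used with $y_{\text{ER}}$, which is consistent with the results listed in Tab. 1. Additional analysis is provided in Appendix F.

We conducted several experiments to further investigate the proposed method. The relationship (in terms of robust and non-robust features) between the primary and auxiliary datasets is more important to BiaMAT than the number of auxiliary datasets (Appendix C). To further demonstrate that robust features can be learned from the auxiliary task of BiaMAT, we extract robust datasets ($D_{\text{AT}}$ and $D_{\text{BiaMAT}}$) from the AT and AT+BiaMAT models and normally train models on each robust dataset. The results show that models trained on $D_{\text{BiaMAT}}$ are more robust than models trained on $D_{\text{AT}}$. BiaMAT thus enables DNNs to learn better robust features through inductive transfer between adversarial training on the primary and auxiliary datasets.

Table 3. Comparison with other related methods (accuracy %).

Method                 Auxiliary dataset   Clean   PGD100  CW100   AA
TRADES                 -                   85.85   56.62   55.16   53.93
Hendrycks et al. [17]  CIFAR-100           80.21   45.68   44.52   42.36
                       ImageNet            87.11   57.16   55.43   55.30
Carmon et al. [4]      CIFAR-100           82.61   54.32   51.64   50.81
                       Places365           83.95   56.72   53.95   52.81
                       ImageNet            85.42   57.46   54.66   53.79
                       ImageNet-500k       86.02   59.49   56.43   55.63
TRADES+BiaMAT (ours)   CIFAR-100           87.02   58.69   56.85   55.48
                       Places365           87.18   59.15   56.36   55.24
                       ImageNet            88.03   59.80   58.01   56.64

3.3. Comparison with other related methods

Carmon et al. [4] proposed a semi-supervised method. They collected in-distribution data of CIFAR-10 from the 80 Million Tiny Images dataset [38] and used the unlabeled data with pseudo-labels. In our scenario, no assumptions are required regarding the classes of the primary and auxiliary datasets, whereas the semi-supervised method is ineffective when the primary and auxiliary datasets do not share the same class distribution. To demonstrate this, we assign pseudo-labels to the auxiliary data using a TRADES model and, in particular, select the top 50k samples per label. Table 3 lists the results of the method of [4].

Hendrycks et al. [17] demonstrated that ImageNet pre-training can improve adversarial robustness on the CIFAR datasets. However, the pre-training method is effective only when a dataset that has a distribution similar to that of the primary dataset is used. The results in Table 3 demonstrate that the pre-training method is ineffective when leveraging datasets that do not satisfy this condition. In other words, pre-training helps only when the model is trained on a dataset that contains a large quantity of data with a distribution similar to that of the primary dataset.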
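The selection step of Algorithm 1 and the loss of Sec. 2.1 can be sketched at the batch level as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `tau` (confidence threshold) and `alpha` (auxiliary weight), the unweighted sum of the two auxiliary terms, and the use of plain cross-entropy on precomputed logits are hypothetical simplifications, and the logits are assumed to come from inputs that have already been adversarially perturbed.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def one_hot(index, num_classes):
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def mean(values):
    return sum(values) / len(values) if values else 0.0


def split_by_confidence(pri_probs_aux, tau):
    """Split auxiliary sample indices by max primary-class probability (threshold tau)."""
    high = [i for i, p in enumerate(pri_probs_aux) if max(p) > tau]
    low = [i for i, p in enumerate(pri_probs_aux) if max(p) <= tau]
    return high, low


def biamat_batch_loss(pri_logits, pri_labels, aux_pri_logits, aux_aux_logits,
                      aux_labels, tau, alpha):
    """Return (total loss, high indices, low indices) for one mixed batch.

    aux_pri_logits: primary-head logits on auxiliary inputs (used for selection and y_ER).
    aux_aux_logits: auxiliary-head logits on the same inputs (used with their own labels).
    """
    num_pri = len(pri_logits[0])
    num_aux = len(aux_aux_logits[0])
    l_pri = mean([cross_entropy(softmax(z), one_hot(y, num_pri))
                  for z, y in zip(pri_logits, pri_labels)])
    pri_probs_aux = [softmax(z) for z in aux_pri_logits]
    high, low = split_by_confidence(pri_probs_aux, tau)
    # High-confidence samples keep their own labels on the auxiliary head.
    l_high = mean([cross_entropy(softmax(aux_aux_logits[i]), one_hot(aux_labels[i], num_aux))
                   for i in high])
    # Low-confidence samples get the expected random label (uniform) on the primary head.
    y_er = [1.0 / num_pri] * num_pri
    l_low = mean([cross_entropy(pri_probs_aux[i], y_er) for i in low])
    return l_pri + alpha * (l_high + l_low), high, low


# Toy batch: one primary sample (2 classes) and two auxiliary samples (3 classes).
pri_logits, pri_labels = [[2.0, 0.0]], [0]
aux_pri_logits = [[4.0, 0.0], [0.1, 0.0]]
aux_aux_logits = [[0.0, 3.0, 0.0], [1.0, 0.0, 0.0]]
aux_labels = [1, 0]

loss, high, low = biamat_batch_loss(pri_logits, pri_labels, aux_pri_logits,
                                    aux_aux_logits, aux_labels, tau=0.55, alpha=1.0)
print("loss:", loss, "high:", high, "low:", low)
```

In this toy batch the first auxiliary sample is confidently assigned to a primary class and is therefore trained with its own label, while the second is close to uniform under the primary head and is trained toward $y_{\text{ER}}$. Setting `alpha` to 0 recovers plain training on the primary data.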
