
Note that for each anchor i, there is 1 positive pair and 2N − 2 negative pairs. The denominator has
a total of 2N − 1 terms (the positive and negatives).
3.2.2 Supervised Contrastive Losses
For supervised learning, the contrastive loss in Eq. 1 is incapable of handling the case where, due to
the presence of labels, more than one sample is known to belong to the same class. Generalization
to an arbitrary number of positives, though, leads to a choice between multiple possible functions.
Eqs. 2 and 3 present the two most straightforward ways to generalize Eq. 1 to incorporate supervi-
sion.
$$
\mathcal{L}^{sup}_{out} \;=\; \sum_{i \in I} \mathcal{L}^{sup}_{out,i} \;=\; \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp\left(z_i \cdot z_p / \tau\right)}{\sum_{a \in A(i)} \exp\left(z_i \cdot z_a / \tau\right)} \qquad (2)
$$
$$
\mathcal{L}^{sup}_{in} \;=\; \sum_{i \in I} \mathcal{L}^{sup}_{in,i} \;=\; \sum_{i \in I} -\log\left\{ \frac{1}{|P(i)|} \sum_{p \in P(i)} \frac{\exp\left(z_i \cdot z_p / \tau\right)}{\sum_{a \in A(i)} \exp\left(z_i \cdot z_a / \tau\right)} \right\} \qquad (3)
$$
Here, $P(i) \equiv \{p \in A(i) : \tilde{y}_p = \tilde{y}_i\}$ is the set of indices of all positives in the multiviewed batch
distinct from $i$, and $|P(i)|$ is its cardinality. In Eq. 2, the summation over positives is located outside
of the log ($\mathcal{L}^{sup}_{out}$) while in Eq. 3, the summation is located inside of the log ($\mathcal{L}^{sup}_{in}$). Both losses have
the following desirable properties:
• Generalization to an arbitrary number of positives. The major structural change of Eqs. 2
and 3 over Eq. 1 is that now, for any anchor, all positives in a multiviewed batch (i.e., the
augmentation-based sample as well as any of the remaining samples with the same label) con-
tribute to the numerator. For randomly-generated batches whose size is large with respect to the
number of classes, multiple additional terms will be present (on average, N/C, where C is the
number of classes). The supervised losses encourage the encoder to give closely aligned represen-
tations to all entries from the same class, resulting in a more robust clustering of the representation
space than that generated from Eq. 1, as is supported by our experiments in Sec. 4.
• Contrastive power increases with more negatives. Eqs. 2 and 3 both preserve the summation
over negatives in the contrastive denominator of Eq. 1. This form is largely motivated by noise
contrastive estimation and N-pair losses [13, 45], wherein the ability to discriminate between
signal and noise (negatives) is improved by adding more examples of negatives. This property is
important for representation learning via self-supervised contrastive learning, with many papers
showing increased performance with an increasing number of negatives [18, 15, 48, 3].
• Intrinsic ability to perform hard positive/negative mining. When used with normalized rep-
resentations, the loss in Eq. 1 induces a gradient structure that gives rise to implicit hard posi-
tive/negative mining. The gradient contributions from hard positives/negatives (i.e., ones against
which continuing to contrast the anchor greatly benefits the encoder) are large while those for easy
positives/negatives (i.e., ones against which continuing to contrast the anchor only weakly benefits
the encoder) are small. Furthermore, for hard positives, the effect increases (asymptotically) as
the number of negatives does. Eqs. 2 and 3 both preserve this useful property and generalize it
to all positives. This implicit property allows the contrastive loss to sidestep the need for explicit
hard mining, which is a delicate but critical part of many losses, such as triplet loss [42]. We note
that this implicit property applies to both supervised and self-supervised contrastive losses, but our
derivation is the first to clearly show this property. We provide a full derivation of this property
from the loss gradient in the Supplementary material.
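To make the difference between the two formulations concrete, the sketch below computes both losses for a multiviewed batch. It is a minimal illustration rather than the authors' released implementation: the function name supcon_losses and the reduction over anchors (a mean instead of the sum in Eqs. 2 and 3, which only rescales the objective) are our choices, and `z` is assumed to already hold L2-normalized projections, with `labels` giving the class of each row.

```python
# Minimal PyTorch sketch of Eqs. 2 and 3 (illustrative, not the reference code).
# `z`: (2N, d) L2-normalized projections for the multiviewed batch.
# `labels`: (2N,) integer class ids, duplicated across the two views of each sample.
import torch

def supcon_losses(z: torch.Tensor, labels: torch.Tensor, tau: float):
    """Return (L_out, L_in), each averaged over anchors."""
    n = z.shape[0]                                    # 2N samples in the multiviewed batch
    sim = z @ z.T / tau                               # pairwise z_i . z_a / tau
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))   # exclude a = i, i.e. restrict to A(i)

    # log softmax over A(i): log [ exp(z_i.z_p/tau) / sum_{a in A(i)} exp(z_i.z_a/tau) ]
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # P(i): same label as the anchor, excluding the anchor itself.
    pos_mask = (labels.view(-1, 1) == labels.view(1, -1)) & ~self_mask
    num_pos = pos_mask.sum(dim=1).clamp(min=1)        # |P(i)|; >= 1 since the other view is a positive

    # Eq. 2: average the log-probabilities over positives (summation outside the log).
    loss_out = -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / num_pos).mean()

    # Eq. 3: average the probabilities over positives, then take the log (summation inside the log).
    prob = log_prob.exp()
    loss_in = -torch.log(prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / num_pos).mean()
    return loss_out, loss_in
```

In use, `z` would be produced by projecting and normalizing both augmented views of each image in the batch and repeating the labels to match; `loss_out` then corresponds to the $\mathcal{L}^{sup}_{out}$ objective studied below.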
Loss                         Top-1
$\mathcal{L}^{sup}_{out}$    78.7%
$\mathcal{L}^{sup}_{in}$     67.4%

Table 1: ImageNet Top-1 classification accuracy for supervised contrastive losses on ResNet-50 for a batch size of 6144.
The two loss formulations are not, however, equivalent. Because log is a concave function, Jensen's Inequality [23] implies that $\mathcal{L}^{sup}_{in} \le \mathcal{L}^{sup}_{out}$.
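Spelled out per anchor (a short check of this bound rather than a new result; here $x_p$ denotes the softmax term $\exp(z_i \cdot z_p/\tau)\,/\sum_{a \in A(i)} \exp(z_i \cdot z_a/\tau)$ shared by both losses):
$$
\mathcal{L}^{sup}_{in,i} \;=\; -\log\!\Big(\frac{1}{|P(i)|}\sum_{p \in P(i)} x_p\Big) \;\le\; \frac{-1}{|P(i)|}\sum_{p \in P(i)} \log x_p \;=\; \mathcal{L}^{sup}_{out,i},
$$
and summing over $i \in I$ gives the stated inequality.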
One would thus expect $\mathcal{L}^{sup}_{out}$ to be the superior supervised loss function (since it upper-bounds $\mathcal{L}^{sup}_{in}$). This conclusion is also supported analytically. Table 1 compares the ImageNet [7] top-1 classification accuracy using $\mathcal{L}^{sup}_{out}$ and $\mathcal{L}^{sup}_{in}$ for different batch sizes ($N$) on the ResNet-50 [17] architecture. The $\mathcal{L}^{sup}_{out}$ supervised loss achieves significantly higher performance than $\mathcal{L}^{sup}_{in}$. We conjecture that this is due to the gradient of $\mathcal{L}^{sup}_{in}$ having