Focal Loss for Dense Object Detection
Tsung-Yi Lin    Priya Goyal    Ross Girshick    Kaiming He    Piotr Dollár
Facebook AI Research (FAIR)
[Figure 1 plot: loss versus probability of the ground-truth class, with one curve per $\gamma \in \{0, 0.5, 1, 2, 5\}$; the region $p_t > .5$ is marked "well-classified examples".]

$\mathrm{CE}(p_t) = -\log(p_t)$
$\mathrm{FL}(p_t) = -(1 - p_t)^\gamma \log(p_t)$
Figure 1. We propose a novel loss we term the Focal Loss that adds a factor $(1 - p_t)^\gamma$ to the standard cross entropy criterion. Setting $\gamma > 0$ reduces the relative loss for well-classified examples ($p_t > .5$), putting more focus on hard, misclassified examples. As our experiments will demonstrate, the proposed focal loss enables training highly accurate dense object detectors in the presence of vast numbers of easy background examples.
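For concreteness, here is a minimal PyTorch sketch of the focal loss exactly as defined above, in its binary (sigmoid) form; the function name, the mean reduction, and the example inputs are our own illustrative choices, not specified in the paper:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """FL(p_t) = -(1 - p_t)^gamma * log(p_t) for binary (sigmoid) outputs."""
    # CE(p_t) = -log(p_t), computed stably from the raw logits.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    # p_t is the model's estimated probability of the ground-truth class.
    p_t = p * targets + (1 - p) * (1 - targets)
    # The modulating factor (1 - p_t)^gamma down-weights easy examples.
    return ((1 - p_t) ** gamma * ce).mean()

# Illustrative usage on a few anchors (float labels: 1 = object, 0 = background).
logits = torch.tensor([3.0, -4.0, 0.2])
targets = torch.tensor([1.0, 0.0, 1.0])
print(focal_loss(logits, targets, gamma=2.0))
```

As a quick numeric check of the down-weighting: with $\gamma = 2$, an easy example with $p_t = 0.9$ has its cross entropy scaled by $(1 - 0.9)^2 = 0.01$, a 100x reduction, while a hard example with $p_t = 0.1$ is scaled by $0.81$ and contributes nearly its full loss.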
Abstract
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.
[Figure 2 plot: inference time (ms) versus COCO AP, with RetinaNet-50 and RetinaNet-101 curves plotted above detectors B-G.]

Method                    AP     time (ms)
[A] YOLOv2† [26]         21.6     25
[B] SSD321 [21]          28.0     61
[C] DSSD321 [9]          28.0     85
[D] R-FCN‡ [3]           29.9     85
[E] SSD513 [21]          31.2    125
[F] DSSD513 [9]          33.2    156
[G] FPN FRCN [19]        36.2    172
RetinaNet-50-500         32.5     73
RetinaNet-101-500        34.4     90
RetinaNet-101-800        37.8    198
† Not plotted. ‡ Extrapolated time.
Figure 2. Speed (ms) versus accuracy (AP) on COCO test-dev. Enabled by the focal loss, our simple one-stage RetinaNet detector outperforms all previous one-stage and two-stage detectors, including the best reported Faster R-CNN [27] system from [19]. We show variants of RetinaNet with ResNet-50-FPN (blue circles) and ResNet-101-FPN (orange diamonds) at five scales (400-800 pixels). Ignoring the low-accuracy regime (AP < 25), RetinaNet forms an upper envelope of all current detectors, and a variant trained for longer (not shown) achieves 39.1 AP. Details are given in §5.
1. Introduction
Current state-of-the-art object detectors are based on a two-stage, proposal-driven mechanism. As popularized in the R-CNN framework [11], the first stage generates a sparse set of candidate object locations and the second stage classifies each candidate location as one of the foreground classes or as background using a convolutional neural network. Through a sequence of advances [10, 27, 19, 13], this two-stage framework consistently achieves top accuracy on the challenging COCO benchmark [20].
Despite the success of two-stage detectors, a natural question to ask is: could a simple one-stage detector achieve similar accuracy? One-stage detectors are applied over a regular, dense sampling of object locations, scales, and aspect ratios. Recent work on one-stage detectors, such as YOLO [25, 26] and SSD [21, 9], demonstrates promising results, yielding faster detectors with accuracy within 10-40% relative to state-of-the-art two-stage methods.
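To make "regular, dense sampling" concrete, the sketch below enumerates candidate boxes over locations, scales, and aspect ratios on a single feature map. The function name, stride, scales, and ratios are illustrative assumptions, not the paper's (or RetinaNet's) exact anchor design:

```python
import itertools
import torch

def dense_anchors(fm_h, fm_w, stride, scales=(32.0, 64.0), ratios=(0.5, 1.0, 2.0)):
    """One box per (location, scale, aspect ratio) on a feature map; illustrative."""
    boxes = []
    for y, x in itertools.product(range(fm_h), range(fm_w)):
        cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # center in input pixels
        for s, r in itertools.product(scales, ratios):
            w, h = s * r ** 0.5, s / r ** 0.5  # equal-area boxes with w / h = r
            boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return torch.tensor(boxes)  # (fm_h * fm_w * len(scales) * len(ratios), 4)

# A single 100x100 feature map at stride 8 already yields 60,000 candidates,
# the vast majority of which are easy background examples.
print(dense_anchors(100, 100, stride=8).shape)  # torch.Size([60000, 4])
```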
This paper pushes the envelope further: we present a one-stage object detector that, for the first time, matches the state-of-the-art COCO AP of more complex two-stage detectors.