Optimal Thresholding of Classifiers to Maximize the F1 Score

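"This paper examines how to maximize the F1 score by adjusting the decision threshold in binary and multilabel classification. The F1 score, the harmonic mean of precision and recall, is a common measure of classifier performance when one class is rare. Multilabel classification uses micro-averaged, macro-averaged, and per-instance-averaged F1 scores. For any classifier that produces real-valued outputs, the authors derive the relationship between the best achievable F1 score and the decision threshold that achieves it. In particular, if the classifier outputs well-calibrated conditional probabilities, the optimal threshold is half the optimal F1 score. If, on the other hand, the classifier is completely uninformative, the optimal strategy is to classify every example as positive. The actual prevalence of positive examples generally affects the choice of the optimal threshold."

The paper studies how to optimize the F1 score in binary and multilabel classification, where F1 is an important measure of classifier performance, especially on imbalanced datasets. Because F1 combines precision and recall, it is particularly relevant to recognizing rare classes. In multilabel classification, model performance can be assessed with several averaged variants of F1: micro-average, macro-average, and per-instance average.

The paper shows that for any classifier with continuous outputs there is a threshold that maximizes F1, and that this threshold depends on the properties of the classifier's outputs. If the classifier outputs accurate conditional probabilities, that is, each output is the probability that the example belongs to the positive class, the optimal threshold is half the optimal F1 score; setting the threshold there balances precision against recall. If the classifier provides no useful information, meaning its output is unrelated to the true class, the best strategy is to predict every example as positive. This yields 100% recall, while precision equals the proportion of positives in the dataset and can therefore be very low.

In practice, the prevalence of positive examples strongly influences the best threshold, because the threshold is chosen to maximize F1, and F1 trades precision off against recall. The paper gives a theoretical basis for choosing thresholds in different situations and for improving recognition of rare or imbalanced classes, which matters when developing and evaluating machine learning models in real applications.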

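The threshold selection described in the summary can be explored with a brute-force sweep. The following is a minimal sketch, not the authors' procedure: the helper names `f1_from_counts`, `f1_at_threshold`, and `best_threshold` are hypothetical, the scores and labels are made-up toy data, and F1 is taken to be 0 when there are no true positives, false positives, or false negatives (an assumed convention). It does not verify the half-of-F1 result; it only shows how a threshold can be picked empirically and what the all-positive strategy yields for an uninformative classifier.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2tp / (2tp + fp + fn); returns 0.0 when the denominator is 0 (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_at_threshold(scores, labels, threshold):
    """Predict positive when score >= threshold, then compute F1 against the labels."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return f1_from_counts(tp, fp, fn)


def best_threshold(scores, labels):
    """Try every distinct score as a threshold; return the (threshold, F1) pair with the highest F1."""
    candidates = sorted(set(scores))
    return max(((t, f1_at_threshold(scores, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


# Toy data (assumed for illustration): one positive among ten examples, base rate 0.1.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# An uninformative classifier gives every example the same score,
# so any usable threshold labels everything positive.
uninformative_scores = [0.5] * 10
uninformative_best = best_threshold(uninformative_scores, labels)

# An informative classifier ranks the positive example first.
informative_scores = [0.9, 0.4, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
informative_best = best_threshold(informative_scores, labels)

print(uninformative_best, informative_best)
```

For the uninformative scores, the all-positive prediction has tp = 1, fp = 9, fn = 0, so its F1 is 2/11, which matches 2b / (1 + b) at base rate b = 0.1.
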
Zachary C. Lipton, Charles Elkan, and Balakrishnan Narayanaswamy

[Figure 2: F1 score (y-axis) plotted against True Positive (x-axis, 0 to 0.1) at a base rate of 0.1.]

Fig. 2: Holding base rate and fp constant, F1 is concave in tp. Each line is a different value of fp.
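The concavity stated in the caption can be checked with a short derivation. This is a sketch under two assumptions that are not spelled out on this page: F1 is written in the usual count form 2tp / (2tp + fp + fn), and tp, fp, and fn are fractions of the dataset, so that tp + fn equals the base rate b (the x-axis range of 0 to 0.1 at a base rate of 0.1 suggests this reading).

```latex
% Assumptions: F1 = 2tp / (2tp + fp + fn), and tp + fn = b (the base rate).
\begin{align*}
F1(tp) &= \frac{2\,tp}{2\,tp + fp + (b - tp)} = \frac{2\,tp}{tp + fp + b}, \\
\frac{\partial F1}{\partial tp} &= \frac{2\,(fp + b)}{(tp + fp + b)^2} > 0, \\
\frac{\partial^2 F1}{\partial tp^2} &= -\frac{4\,(fp + b)}{(tp + fp + b)^3} < 0.
\end{align*}
```

Since fp + b is positive, the second derivative is negative everywhere on the plotted range, so each curve rises with diminishing returns.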

[Figure 3: Accuracy (y-axis) plotted against True Positive (x-axis, 0 to 0.1) at a base rate of 0.1.]

Fig. 3: Unlike F1, accuracy offers linearly increasing returns. Each line is a fixed value of fp.
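The contrast between the two figures can be reproduced numerically. The sketch below is illustrative only: it assumes tp, fp, fn, and tn are fractions of the dataset (so fn = base rate - tp and tn = 1 - base rate - fp), uses the usual count form of F1, and picks fp = 0.05 arbitrarily. The function names are hypothetical.

```python
def f1_from_rates(tp, fp, base_rate):
    """F1 with all quantities as fractions of the dataset, so fn = base_rate - tp."""
    fn = base_rate - tp
    return 2 * tp / (2 * tp + fp + fn)


def accuracy_from_rates(tp, fp, base_rate):
    """Accuracy = tp + tn, where tn = (1 - base_rate) - fp as a fraction of the dataset."""
    tn = (1 - base_rate) - fp
    return tp + tn


base_rate = 0.1   # the base rate stated for both figures
fp = 0.05         # illustrative value; each curve in the figures fixes a different fp
tps = [i / 100 for i in range(11)]   # tp from 0 to 0.1 in steps of 0.01

# Gain in each metric for every additional 0.01 of true positives.
f1_gains = [f1_from_rates(hi, fp, base_rate) - f1_from_rates(lo, fp, base_rate)
            for lo, hi in zip(tps, tps[1:])]
acc_gains = [accuracy_from_rates(hi, fp, base_rate) - accuracy_from_rates(lo, fp, base_rate)
             for lo, hi in zip(tps, tps[1:])]

print("F1 gains:      ", [round(g, 4) for g in f1_gains])
print("Accuracy gains:", [round(g, 4) for g in acc_gains])
```

The F1 gains shrink with each step, which is what concavity in tp means, while every accuracy gain equals the step size.
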

2.2 Multilabel Performance Measures

While F1 was developed for single-label information retrieval, as mentioned, there are variants of F1 for the multilabel setting. Micro F1 treats all predictions on all labels as one vector and then calculates the F1 score. In particular,

$$tp = 2 \sum_{i=1}^{n} \sum_{j=1}^{m} \mathbb{1}(P_{ij} = 1)\, \mathbb{1}(G_{ij} = 1).$$
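Micro F1 can be sketched as pooling every (example, label) cell into one set of counts. This is a minimal illustration, assuming P and G are n-by-m lists of 0/1 values and that (1) is the usual count form F1 = 2tp / (2tp + fp + fn). Under that form, a constant factor shared by tp, fp, and fn cancels, so the sketch uses plain counts. The function names and the toy matrices are hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2tp / (2tp + fp + fn); returns 0.0 when the denominator is 0 (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(P, G):
    """Pool all cells of the prediction matrix P and ground-truth matrix G, then compute one F1."""
    tp = fp = fn = 0
    for p_row, g_row in zip(P, G):
        for p, g in zip(p_row, g_row):
            if p == 1 and g == 1:
                tp += 1
            elif p == 1 and g == 0:
                fp += 1
            elif p == 0 and g == 1:
                fn += 1
    return f1_from_counts(tp, fp, fn)


# Toy data (assumed): n = 3 examples, m = 2 labels.
G = [[1, 0],
     [1, 1],
     [0, 1]]
P = [[1, 1],
     [0, 1],
     [0, 1]]

micro = micro_f1(P, G)
print(micro)
```

Here the pooled counts are tp = 3, fp = 1, fn = 1, giving 6 / 8 = 0.75.
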

We define fp and fn analogously and calculate the final score using (1). Macro F1, which can also be called per label F1, calculates the F1 for each of the m labels and averages them:

$$F1^{\text{Macro}}(P \mid G) = \frac{1}{m} \sum_{j=1}^{m} F1(P_{:j}, G_{:j}).$$
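Macro F1 can be sketched as a column-wise computation: one F1 per label, then a plain mean. The code below is an illustration under assumptions, with P and G as n-by-m lists of 0/1 values, F1 in the usual count form, and an F1 of 0 for a label with no positives and no positive predictions (an assumed convention). The names are hypothetical.

```python
def f1_binary(pred, truth):
    """F1 for two equal-length 0/1 sequences; 0.0 when 2tp + fp + fn is 0 (assumed convention)."""
    tp = sum(1 for p, g in zip(pred, truth) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, truth) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, truth) if p == 0 and g == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(P, G):
    """Average the per-label F1 over the m columns of P and G."""
    p_columns = list(zip(*P))   # transpose: one tuple per label
    g_columns = list(zip(*G))
    per_label = [f1_binary(p_col, g_col) for p_col, g_col in zip(p_columns, g_columns)]
    return sum(per_label) / len(per_label)


# Toy data (assumed): n = 3 examples, m = 2 labels.
G = [[1, 0],
     [1, 1],
     [0, 1]]
P = [[1, 1],
     [0, 1],
     [0, 1]]

macro = macro_f1(P, G)
print(macro)
```

On this toy data the per-label scores are 2/3 and 4/5, so the macro average is 11/15.
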

Per instance F1 is similar but averages F1 over all n examples:

$$F1^{\text{Instance}}(P \mid G) = \frac{1}{n} \sum_{i=1}^{n} F1(P_{i:}, G_{i:}).$$
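Per instance F1 follows the same pattern along the other axis: one F1 per example (row), then a mean over the n examples. The sketch below makes the same assumptions as any count-based F1 (0/1 matrices, F1 = 2tp / (2tp + fp + fn), and 0 for an empty denominator as an assumed convention); the names are hypothetical.

```python
def f1_binary(pred, truth):
    """F1 for two equal-length 0/1 sequences; 0.0 when 2tp + fp + fn is 0 (assumed convention)."""
    tp = sum(1 for p, g in zip(pred, truth) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, truth) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, truth) if p == 0 and g == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def instance_f1(P, G):
    """Average the per-example F1 over the n rows of P and G."""
    per_example = [f1_binary(p_row, g_row) for p_row, g_row in zip(P, G)]
    return sum(per_example) / len(per_example)


# Toy data (assumed): n = 3 examples, m = 2 labels.
G = [[1, 0],
     [1, 1],
     [0, 1]]
P = [[1, 1],
     [0, 1],
     [0, 1]]

instance = instance_f1(P, G)
print(instance)
```

The three rows score 2/3, 2/3, and 1, so the per instance average is 7/9. On the same matrices, pooling all cells gives 0.75 and averaging per label gives 11/15, which shows that the three averaging schemes generally disagree.
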

Accuracy is the fraction of all instances that are predicted correctly:

$$\text{Acc} = \frac{tp + tn}{tp + tn + fp + fn}$$

Accuracy is adapted to the multilabel setting by summing tp and tn for all labels and then dividing by the total number of predictions:

$$\text{Acc}(P \mid G) = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \mathbb{1}(P_{ij} = G_{ij}).$$
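Multilabel accuracy reduces to counting the cells where prediction and ground truth agree. A minimal sketch, assuming P and G are n-by-m lists of 0/1 values; the function name and the toy data are hypothetical.

```python
def multilabel_accuracy(P, G):
    """Fraction of the n * m (example, label) cells where P and G agree."""
    matches = 0
    total = 0
    for p_row, g_row in zip(P, G):
        for p, g in zip(p_row, g_row):
            matches += 1 if p == g else 0   # a match is either a tp or a tn
            total += 1
    return matches / total


# Toy data (assumed): n = 3 examples, m = 2 labels.
G = [[1, 0],
     [1, 1],
     [0, 1]]
P = [[1, 1],
     [0, 1],
     [0, 1]]

acc = multilabel_accuracy(P, G)
print(acc)
```

Four of the six cells agree here (three true positives and one true negative), so the accuracy is 2/3.
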
