mRMR Feature Selection: An Efficient Algorithm under the Max-Dependency Mutual Information Criterion


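The paper "Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy," by Hanchuan Peng, Fuhui Long, and Chris Ding, appeared in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in 2005. It addresses the central role of feature selection in pattern classification systems, and in particular how to choose good features under the maximal statistical dependency criterion based on mutual information.

Because the Max-Dependency condition is hard to implement directly, the paper derives an equivalent form for first-order incremental selection, the minimal-redundancy-maximal-relevance (mRMR) criterion. mRMR keeps the relevance of the selected features to the class high while making the features as non-redundant with one another as possible, so that the chosen set contributes substantially to classification performance. To make selection efficient and inexpensive, the paper then proposes a two-stage feature selection algorithm that combines mRMR with more sophisticated feature selectors such as wrappers, preserving the quality of the feature set at a much lower computational cost.

The experiments use four data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues) and three classifiers: naive Bayes, support vector machine (SVM), and linear discriminant analysis (LDA). The comparisons indicate that the mRMR criterion is effective, particularly in improving classification accuracy and efficiency.

The work gives practitioners in machine learning and data mining a practical framework for choosing feature subsets, which can improve model performance and reduce the risk of overfitting. Efficient feature selection remains an active research topic in the era of big data, and mRMR is one of the classic methods worth understanding and applying.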

The computational complexity of this incremental search method is $O(|S| \cdot M)$.
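The incremental search whose cost is quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete features, plug-in mutual information estimates from counts, and the commonly used difference form of the mRMR score (relevance to the class minus mean mutual information with the already selected features), which is not shown in this excerpt. The names `mutual_information` and `mrmr_select` are hypothetical.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Plug-in estimate of I(a; b) in nats for two equal-length discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nab / n) * math.log(nab * n / (pa[x] * pb[y]))
        for (x, y), nab in pab.items()
    )

def mrmr_select(features, labels, k):
    """Greedily pick k feature indices (k <= number of features).

    Assumed score: relevance I(x_j; c) minus the mean of I(x_j; x_i)
    over the features x_i selected so far.
    """
    n_feat = len(features)
    relevance = [mutual_information(col, labels) for col in features]
    redundancy_sum = [0.0] * n_feat  # running sum of I(x_j; x_i) over selected x_i
    selected = [max(range(n_feat), key=lambda j: relevance[j])]
    while len(selected) < k:
        last = selected[-1]
        best_j, best_score = None, -math.inf
        for j in range(n_feat):
            if j in selected:
                continue
            # Only the newest selected feature adds a term to the running sum.
            redundancy_sum[j] += mutual_information(features[j], features[last])
            score = relevance[j] - redundancy_sum[j] / len(selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: the label has four classes encoded by two bits.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
f0 = [0, 0, 0, 0, 1, 1, 1, 1]  # high bit of the label
f1 = list(f0)                  # exact copy of f0, fully redundant with it
f2 = [0, 0, 1, 1, 0, 0, 1, 1]  # low bit of the label
f3 = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of the label
selected = mrmr_select([f0, f1, f2, f3], labels, 2)
print(selected)
```

Keeping a running redundancy sum means each round evaluates one feature-to-feature mutual information per remaining candidate, so the number of pairwise evaluations grows as the product of the number of selected features and the number of candidates. In the toy data the duplicate `f1` is as relevant as `f0` but is passed over in favor of the complementary `f2`.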

2.3 Optimal First-Order Incremental Selection

We prove in the following that the combination of Max-Relevance and Min-Redundancy criteria, i.e., the mRMR criterion, is equivalent to the Max-Dependency criterion if one feature is selected (added) at one time. We call this type of selection the "first-order" incremental search. We have the following theorem:

Theorem. For the first-order incremental search, mRMR is equivalent to Max-Dependency (2).

Proof. By definition of the first-order search, we assume that $S_{m-1}$, i.e., the set of $m-1$ features, has already been obtained. The task is to select the optimal $m$th feature $x_m$ from the set $\{X - S_{m-1}\}$.

The dependency $D$ in (2) and (3) is represented by mutual information, i.e., $D = I(S_m; c)$, where $S_m = \{S_{m-1}, x_m\}$ can be treated as a multivariate variable. Thus, by the definition of mutual information, we have:

$$
\begin{aligned}
I(S_m; c) &= H(c) + H(S_m) - H(S_m, c) \\
&= H(c) + H(S_{m-1}, x_m) - H(S_{m-1}, x_m, c),
\end{aligned}
\tag{8}
$$

where $H(\cdot)$ is the entropy of the respective multivariate (or univariate) variables.

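Now, we define the following quantity $J(S_m) = J(x_1, \ldots, x_m)$ for scalar variables $x_1, \ldots, x_m$,

$$
J(x_1, \ldots, x_m) = \int \cdots \int p(x_1, \ldots, x_m) \log \frac{p(x_1, \ldots, x_m)}{p(x_1) \cdots p(x_m)} \, dx_1 \cdots dx_m.
\tag{9}
$$

Similarly, we define $J(S_m, c) = J(x_1, \ldots, x_m, c)$ as

$$
J(x_1, \ldots, x_m, c) = \int \cdots \int p(x_1, \ldots, x_m, c) \log \frac{p(x_1, \ldots, x_m, c)}{p(x_1) \cdots p(x_m)\, p(c)} \, dx_1 \cdots dx_m \, dc.
\tag{10}
$$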

We can easily derive (11) and (12) from (9) and (10),

$$
H(S_{m-1}, x_m) = H(S_m) = \sum_{i=1}^{m} H(x_i) - J(S_m),
\tag{11}
$$

$$
H(S_{m-1}, x_m, c) = H(S_m, c) = H(c) + \sum_{i=1}^{m} H(x_i) - J(S_m, c).
\tag{12}
$$
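The step from (9) to (11) is stated without detail. One way to see it, written here as a sketch that only expands the logarithm in (9) and integrates out the unused variables in each marginal term:

```latex
\begin{aligned}
J(S_m)
&= \int \cdots \int p(x_1, \ldots, x_m) \log p(x_1, \ldots, x_m) \, dx_1 \cdots dx_m
 - \sum_{i=1}^{m} \int \cdots \int p(x_1, \ldots, x_m) \log p(x_i) \, dx_1 \cdots dx_m \\
&= -H(S_m) - \sum_{i=1}^{m} \int p(x_i) \log p(x_i) \, dx_i \\
&= -H(S_m) + \sum_{i=1}^{m} H(x_i).
\end{aligned}
```

Rearranging gives (11). Applying the same expansion to (10), with $c$ treated as one more variable, gives (12).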

By substituting them to the corresponding terms in (8), we have

$$
\begin{aligned}
I(S_m; c) &= J(S_m, c) - J(S_m) \\
&= J(S_{m-1}, x_m, c) - J(S_{m-1}, x_m).
\end{aligned}
\tag{13}
$$
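Identity (13) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses discrete variables (sums in place of the integrals), a randomly generated joint distribution over two binary features and a binary class, and hypothetical helper names `entropy`, `marginal`, and `J`. The quantity J is computed directly from its definition rather than from (11) or (12).

```python
import itertools
import math
import random

def entropy(pmf):
    """Shannon entropy (nats) of a dict mapping outcomes to probabilities."""
    return -sum(q * math.log(q) for q in pmf.values() if q > 0)

def marginal(joint, axes):
    """Marginalize a joint pmf (dict keyed by tuples) onto the given axes."""
    out = {}
    for outcome, q in joint.items():
        key = tuple(outcome[a] for a in axes)
        out[key] = out.get(key, 0.0) + q
    return out

def J(joint, axes):
    """Discrete analogue of (9)/(10): sum of p * log(p / product of marginals)."""
    pm = marginal(joint, axes)
    singles = [marginal(joint, (a,)) for a in axes]
    total = 0.0
    for outcome, q in pm.items():
        if q > 0:
            prod = math.prod(s[(v,)] for s, v in zip(singles, outcome))
            total += q * math.log(q / prod)
    return total

# Random joint pmf over (x1, x2, c); axes 0 and 1 are features, axis 2 is the class.
random.seed(0)
weights = [random.random() for _ in range(8)]
z = sum(weights)
joint = {o: w / z for o, w in zip(itertools.product((0, 1), repeat=3), weights)}

S, c = (0, 1), (2,)
# Left side of (13), computed through (8): I(S; c) = H(c) + H(S) - H(S, c)
mi = entropy(marginal(joint, c)) + entropy(marginal(joint, S)) - entropy(joint)
# Right side of (13): J(S, c) - J(S)
diff = J(joint, S + c) - J(joint, S)
print(abs(mi - diff) < 1e-12)
```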

Obviously, Max-Dependency is equivalent to simultaneously maximizing the first term and minimizing the second term.

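We can use Jensen's Inequality [16] to show that the second term, $J(S_{m-1}, x_m)$, is lower-bounded by 0. A related and slightly simpler proof is to consider the inequality $\log(z) \le z - 1$, with equality if and only if $z = 1$. We see that

$$
\begin{aligned}
-J(x_1, \ldots, x_m)
&= \int \cdots \int p(x_1, \ldots, x_m) \log \frac{p(x_1) \cdots p(x_m)}{p(x_1, \ldots, x_m)} \, dx_1 \cdots dx_m \\
&\le \int \cdots \int p(x_1, \ldots, x_m) \left[ \frac{p(x_1) \cdots p(x_m)}{p(x_1, \ldots, x_m)} - 1 \right] dx_1 \cdots dx_m \\
&= \int \cdots \int p(x_1) \cdots p(x_m) \, dx_1 \cdots dx_m - \int \cdots \int p(x_1, \ldots, x_m) \, dx_1 \cdots dx_m \\
&= 1 - 1 = 0.
\end{aligned}
\tag{14}
$$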

It is easy to verify that the minimum is attained when $p(x_1, \ldots, x_m) = p(x_1) \cdots p(x_m)$, i.e., all the variables are independent of each other. As all the $m-1$ features have been selected, this pair-wise independence condition means that the mutual information between $x_m$ and any selected feature $x_i$ $(i = 1, \ldots, m-1)$ is minimized. This is the Min-Redundancy criterion.
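For two discrete variables the lower bound is easy to see in numbers. The sketch below is only an illustration: the three 2x2 joint tables are made up, and `J2` is a hypothetical helper that evaluates the two-variable, discrete form of (9).

```python
import math

def J2(joint):
    """Discrete two-variable J for a joint pmf given as a nested list joint[a][b]."""
    p1 = [sum(row) for row in joint]        # marginal of the first variable
    p2 = [sum(col) for col in zip(*joint)]  # marginal of the second variable
    return sum(
        q * math.log(q / (p1[a] * p2[b]))
        for a, row in enumerate(joint)
        for b, q in enumerate(row)
        if q > 0
    )

independent = [[0.3 * 0.6, 0.3 * 0.4], [0.7 * 0.6, 0.7 * 0.4]]  # product of marginals
partly = [[0.4, 0.1], [0.1, 0.4]]                               # some dependence
coupled = [[0.5, 0.0], [0.0, 0.5]]                              # second variable copies the first

j_ind, j_part, j_cpl = J2(independent), J2(partly), J2(coupled)
print(j_ind, j_part, j_cpl)
```

The independent table gives a value of zero up to rounding, the partly dependent table a strictly positive value, and the fully coupled table the largest of the three, which is the behavior the Min-Redundancy argument relies on.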

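We can also derive the upper bound of the first term in (13), $J(S_{m-1}, c, x_m)$. For simplicity, let us first show the upper bound of the general form $J(y_1, \ldots, y_n)$, assuming there are $n$ variables $y_1, \ldots, y_n$. This can be seen as follows:

$$
\begin{aligned}
J(y_1, \ldots, y_n)
&= \int \cdots \int p(y_1, \ldots, y_n) \log \frac{p(y_1, \ldots, y_n)}{p(y_1) \cdots p(y_n)} \, dy_1 \cdots dy_n \\
&= \int \cdots \int p(y_1, \ldots, y_n) \log \frac{p(y_1 \mid y_2, \ldots, y_n)\, p(y_2 \mid y_3, \ldots, y_n) \cdots p(y_{n-1} \mid y_n)\, p(y_n)}{p(y_1) \cdots p(y_{n-1})\, p(y_n)} \, dy_1 \cdots dy_n \\
&= \sum_{i=1}^{n-1} H(y_i) - H(y_1 \mid y_2, \ldots, y_n) - H(y_2 \mid y_3, \ldots, y_n) - \cdots - H(y_{n-1} \mid y_n) \\
&\le \sum_{i=1}^{n-1} H(y_i).
\end{aligned}
\tag{15}
$$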

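Equation (15) can be easily extended as

$$
J(y_1, \ldots, y_n) \le \min \left\{ \sum_{i=2}^{n} H(y_i),\; \sum_{i=1, i \ne 2}^{n} H(y_i),\; \cdots,\; \sum_{i=1, i \ne n-1}^{n} H(y_i),\; \sum_{i=1}^{n-1} H(y_i) \right\}.
\tag{16}
$$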
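The bound in (16) can be illustrated numerically. The sketch below rests on assumptions: the variables are discrete (so conditional entropies are non-negative), J is evaluated through the discrete analogue of (11), the distributions are made up, and the helper names `entropy` and `J_and_bound` are hypothetical.

```python
import itertools
import math
import random

def entropy(probs):
    """Shannon entropy (nats) of an iterable of probabilities."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def J_and_bound(joint, n, levels):
    """Return J(y_1..y_n) and the right side of (16) for a discrete joint pmf.

    joint maps n-tuples with entries in range(levels) to probabilities.
    """
    H_single = []
    for i in range(n):
        marg = [0.0] * levels
        for outcome, q in joint.items():
            marg[outcome[i]] += q
        H_single.append(entropy(marg))
    # Discrete analogue of (11): J = sum_i H(y_i) - H(y_1, ..., y_n)
    J = sum(H_single) - entropy(joint.values())
    # Right side of (16): smallest entropy sum with one variable left out
    bound = min(sum(H_single) - h for h in H_single)
    return J, bound

n, levels = 4, 3

# Case 1: a random joint distribution, where the bound is loose.
random.seed(1)
outcomes = list(itertools.product(range(levels), repeat=n))
w = [random.random() for _ in outcomes]
z = sum(w)
J_rand, bound_rand = J_and_bound({o: wi / z for o, wi in zip(outcomes, w)}, n, levels)

# Case 2: all variables are copies of one uniform variable (maximal dependence).
copies = {(v,) * n: 1.0 / levels for v in range(levels)}
J_copy, bound_copy = J_and_bound(copies, n, levels)

print(J_rand <= bound_rand, abs(J_copy - bound_copy) < 1e-12)
```

In the second case every variable determines all the others, and J meets the bound, which matches the remark that the maximum is reached under maximal dependence.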

It is easy to verify that the maximum of $J(y_1, \ldots, y_n)$ or, similarly, of the first term in (13), $J(S_{m-1}, c, x_m)$, is attained when all variables are maximally dependent. When $S_{m-1}$ has been fixed, this indicates that $x_m$ and $c$ should have the maximal dependency. This is the Max-Relevance criterion.

Therefore, according to (13), as a combination of Max-Relevance and Min-Redundancy, mRMR is equivalent to Max-Dependency for first-order selection. □

Note that the quantity $J(\cdot)$ in (9) and (10) has also been called "mutual information" for multiple scalar variables [10]. We have the following observations:

1228 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 27, NO. 8, AUGUST 2005
