Spectral Clustering Improves Discriminant ECOC for Multi-class Problems: A New Method and Application Comparison


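The research paper "Discriminant error-correcting output codes based on spectral clustering" addresses an important topic in multi-class problems: improving classification performance with error-correcting output codes (ECOC). ECOC is a widely used framework that applies a coding strategy to turn a multi-class problem into binary or otherwise simpler decision problems. Optimizing the discrimination between classes to obtain the best performance, however, remains a challenge.

The core innovation of the paper is a new method that combines spectral clustering with the confusion matrix. Traditional ECOC relies on hard or soft partitions, whereas the authors map the partition of the class space to a cut of an undirected graph and use a spectral clustering algorithm to uncover the intrinsic structure and grouping of the data. The method takes the similarity between classes into account and also introduces a pre-classifier, which improves classification accuracy.

In the experiments, the authors apply their method to synthetic datasets, UCI machine learning datasets and a face recognition scenario, and compare it with classical ECOC and DECOC. The results show a clear advantage: computational complexity is reduced while classification accuracy is maintained or improved. This indicates that discriminant error-correcting coding from a spectral clustering perspective provides a more effective class partition and also helps to save resources and improve efficiency in practical applications.

The document is an author proof, and the summary also covers its proofreading and correction instructions: corrections can be submitted online, as PDF annotations, or by fax in a clearly legible form; the metadata should be verified so that author information and cited references are accurate; and all elements, such as figures, tables, equations and electronic supplementary material, should be checked for completeness so that the final version is error-free.

In summary, the paper offers a novel and efficient technical improvement to coding methods for multi-class problems, demonstrates the potential of spectral clustering within the ECOC framework, and supports its advantages in performance and efficiency with empirical results. It is of practical value to researchers and engineers working on large-scale multi-class classification tasks.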

UNCORRECTED PROOF

which was first proposed by Crammer and Singer [16]. Their paper raises three learning problems and focuses on the second: given a set of binary classifiers, find a good matrix. They prove that this problem is NP-complete, which underscores the difficulty of coding matrix design. The difficulty arises because no function maps a coding matrix to the empirical loss; in other words, it is hard to know what kind of coding matrix leads to a small empirical loss. To address this, Masulli and Valentini [17] experimentally analyze some of the main factors affecting the effectiveness of ECOC methods. Their analysis shows that all these factors contribute to the effectiveness of ECOC methods in a way that is not straightforward and very likely depends on the distribution and complexity of the data. Garcia-Pedrajas and Fyfe [18] propose an evolutionary approach to the design of output codes with a fitness function made up of five terms that are relevant for a coding matrix to achieve good performance. Their approach obtains better performance, but more detailed work is needed to understand clearly how these terms affect the performance of a coding matrix.

On the other hand, since the factors affecting the effectiveness of ECOC methods are diverse and interact with one another, we can simply focus on one aspect at a time to examine its impact on the performance of a coding matrix. Ali-Bagheri et al. [14, 19] propose an efficient way to improve independence among binary classifiers. Angel-Bautista et al. [15] use a novel genetic strategy to find a set of dichotomizers with better performance and then ensemble them into an optimal coding matrix. Moreover, Valentini [20] argues that the empirical loss depends on the complexity of the dichotomies induced by the selected decomposition method and on the accuracy of the dichotomizers. Kong and Dietterich [21] show that the dichotomizers must learn more separating surfaces between classes. Building on this, Pujol et al. [12] present a heuristic method (discriminant ECOC, DECOC) to obtain maximum class discrimination in the partitions. With maximum class discrimination in each partition, DECOC can efficiently reduce the complexity of the dichotomies and obtain more separating surfaces, which leads to a smaller empirical loss. Furthermore, with the aid of a binary tree, DECOC also yields compact codewords. DECOC is therefore a promising method for ECOC design.

However, finding the binary partitions with maximum class discrimination in DECOC is still a difficult problem, which limits the application of DECOC in practice. The reason is that finding the optimal binary partitions requires an exhaustive search among all possible partitions, which incurs an impractical computational cost. It is worth noting that Escalera's work [28] gives a simplified version of DECOC that applies a random permutation between two given partitions and cannot guarantee that the optimal binary partitions are obtained. This suggests that it is difficult to apply DECOC with the SFFS algorithm and the fast quadratic mutual information in practice, and that an alternative method is needed to improve the original DECOC. The motivation of this paper is to make DECOC easy to apply in practice by proposing an alternative and efficient approach to obtain the optimal binary partitions.
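
To give a sense of why an exhaustive search becomes impractical, the following minimal Python sketch counts and enumerates the candidate splits at a single tree node. It assumes that every division of a class set into two non-empty, unordered groups is a candidate; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations


def count_bipartitions(n_classes: int) -> int:
    """Number of ways to split n classes into two non-empty, unordered groups."""
    if n_classes < 2:
        return 0
    return 2 ** (n_classes - 1) - 1


def enumerate_bipartitions(classes):
    """Yield every unordered split (left, right) into two non-empty groups."""
    classes = list(classes)
    first, rest = classes[0], classes[1:]
    # Pinning the first class to the left side avoids listing mirror splits twice.
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            left = (first, *extra)
            right = tuple(c for c in rest if c not in extra)
            if right:  # skip the split whose right side is empty
                yield left, right


if __name__ == "__main__":
    for n in (4, 10, 26):
        print(n, count_bipartitions(n))
```

For 4 classes there are 7 candidate splits at the root; for 26 classes there are already 33,554,431, and each candidate would need its own discrimination score.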

Our purpose in this paper is to improve DECOC by using spectral clustering to obtain the optimal binary partitions. In our algorithm, each class is seen as a vertex in an undirected graph, and the weight of each edge joining two classes is measured by the confusion matrix of a pre-classifier. Here, the weight is related to the discrimination: a smaller weight means a larger discrimination. Finding a binary partition with maximum class discrimination thus becomes a mincut problem in spectral clustering. Fortunately, spectral clustering provides a way to solve the relaxed version of this problem. By avoiding the exhaustive search process, our algorithm can significantly reduce the computational complexity of DECOC. Furthermore, the normalized cut in spectral clustering also brings a new property (balanced columns [22]), which helps to avoid imbalance problems to a certain degree. The final DECOC design is obtained by implementing a recursive process with the aid of a binary tree.
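
The idea described above can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the excerpt does not give the exact edge-weight formula, so the confusion rates are symmetrised by averaging; the relaxed normalized cut is solved with the second eigenvector of the symmetric normalized Laplacian; and all function names are hypothetical.

```python
import numpy as np


def class_affinity(confusion):
    """Symmetric class-to-class weights from a pre-classifier's confusion matrix."""
    c = np.asarray(confusion, dtype=float)
    rates = c / c.sum(axis=1, keepdims=True)  # row-normalise counts to rates
    w = 0.5 * (rates + rates.T)               # assumption: symmetrise by averaging
    np.fill_diagonal(w, 0.0)                  # no self-loops
    return w


def spectral_bipartition(w):
    """Relaxed normalized cut: split vertices by the sign of the Fiedler vector."""
    d = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)             # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]         # second-smallest eigenvector, rescaled
    side = fiedler >= 0
    if side.all() or not side.any():          # degenerate case: peel off one class
        side = np.zeros(len(w), dtype=bool)
        side[np.argmax(fiedler)] = True
    return side


def spectral_decoc_matrix(confusion):
    """Ternary coding matrix with one column per internal node of the binary tree."""
    w = class_affinity(confusion)
    n = len(w)
    columns, stack = [], [np.arange(n)]
    while stack:
        group = stack.pop()
        if len(group) < 2:                    # a single class is a leaf
            continue
        side = spectral_bipartition(w[np.ix_(group, group)])
        col = np.zeros(n, dtype=int)          # 0 marks classes ignored at this node
        col[group[side]] = 1
        col[group[~side]] = -1
        columns.append(col)
        stack.extend([group[side], group[~side]])
    return np.column_stack(columns)


if __name__ == "__main__":
    # Toy confusion matrix: classes 0/1 and classes 2/3 are confused with each other.
    conf = [[80, 15, 3, 2],
            [12, 82, 2, 4],
            [3, 2, 85, 10],
            [1, 4, 14, 81]]
    print(spectral_decoc_matrix(conf))
```

For N classes the recursion produces N - 1 columns, one per internal node of the tree. In the toy example the first split separates classes {0, 1} from {2, 3}, the pair of groups that the pre-classifier rarely confuses with each other; which side receives +1 depends on the sign of the eigenvector and is arbitrary.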

The paper is organized as follows: Sect. 2 provides a brief description of the ECOC framework. Section 3 presents the Spectral DECOC approach and its computational complexity analysis. Section 4 shows the experimental results on several datasets from different environments. Finally, Sect. 5 concludes the paper.

2 ECOC


ECOC provides an efficient way to solve a complex multi-class classification problem by decomposing it into a series of binary classification problems. There are two decomposition frameworks: Binary ECOC and Ternary ECOC. In the Binary ECOC framework, there are two symbols, $\{+1, -1\}$, which label the positive and negative classes of a binary problem. In the Ternary ECOC framework, the symbol $0$ is added, which stands for a class that is ignored in a binary problem. Fig. 1 shows four classical ECOCs: (a) one-versus-all, (b) one-versus-one, (c) dense random, (d) sparse random. Each row of the ECOC matrix represents a codeword for class $C_i$ ($i = 1, 2, 3, 4$), and each column represents a partition of the data set for a binary classifier. The code
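
As an illustration of the two deterministic designs named above, the sketch below builds one-versus-all and one-versus-one coding matrices for four classes with NumPy. Fig. 1 is not reproduced in this excerpt, so the column ordering and the choice of which class in a pair receives +1 are assumptions, and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations


def one_versus_all(n_classes):
    """Binary ECOC: column j codes class j as +1 and every other class as -1."""
    return 2 * np.eye(n_classes, dtype=int) - 1


def one_versus_one(n_classes):
    """Ternary ECOC: one column per class pair; classes outside the pair get 0."""
    pairs = list(combinations(range(n_classes), 2))
    m = np.zeros((n_classes, len(pairs)), dtype=int)
    for col, (a, b) in enumerate(pairs):
        m[a, col] = 1    # assumption: the lower-indexed class is the positive one
        m[b, col] = -1
    return m


if __name__ == "__main__":
    print(one_versus_all(4))   # 4 rows (codewords) x 4 columns (binary problems)
    print(one_versus_one(4))   # 4 rows (codewords) x 6 columns (binary problems)
```

Each row is the codeword of one class and each column defines the binary problem given to one classifier, so one-versus-all needs N columns and one-versus-one needs N(N - 1)/2.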

Pattern Anal Applic
Journal: Large 10044; Article No.: 523; MS Code: PAAA-D-15-00264; Dispatch: 23-10-2015; Pages: 19
Author Proof
