be the optimal classifier using the nearest distance as the proximity measurement.
2. 2DCS: two-dimensional compressive sampling
In 1DCS, images are first recast as vectors and then projected to a lower-dimensional space, namely image $x \in \mathbb{R}^{M \times N}$ is represented by the vector $x_{1D} \in \mathbb{R}^{MN}$.
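For concreteness, the following is a minimal NumPy sketch of the 1DCS projection; the image size, the number of measurements $d$, and the dense Gaussian measurement matrix are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 64, 64                          # image size (hypothetical)
d = 1024                               # number of 1DCS measurements (hypothetical)

x = rng.random((M, N))                 # image as an M x N matrix
x_1d = x.reshape(M * N)                # recast the image as an MN-dimensional vector
Phi = rng.standard_normal((d, M * N))  # dense measurement matrix, d x MN
y_1d = Phi @ x_1d                      # 1DCS measurements, length d
```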
Alternatively, we propose that the matrix $x \in \mathbb{R}^{M \times N}$ can be projected by a column-wise approach using a matrix $\Phi_1 \in \mathbb{R}^{m \times M}$ ($m < M$) as follows [34]:
$$z = \Phi_1 x \tag{11}$$
After Eq. (11), the number of rows of $z$ is reduced to $m$. In the context of 2DCS, we call Eq. (11) the step of "row compression".
Similarly, the right multiplication of $z$ by $\Phi_2 \in \mathbb{R}^{N \times n}$ ($n < N$) leads to "column compression", yielding a matrix $y \in \mathbb{R}^{m \times n}$ as follows:
$$y = \Phi_1 x \Phi_2. \tag{12}$$
Due to its similarity to 1DCS, we call Eq. (12) 2DCS (two-dimensional compressive sampling). As a stepwise implementation of 1DCS, 2DCS reduces feature extraction to a series of smaller subtasks. Consequently, the computational complexity of 2DCS is significantly lower than that of 1DCS, which is a superlinear function of the input scale: projecting the $MN$-dimensional vector $x_{1D}$ onto $mn$ measurements with a dense matrix costs $O(mnMN)$ operations, whereas computing $\Phi_1 x \Phi_2$ costs only $O(mMN + mnN)$ operations.
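As an illustration, here is a minimal self-contained NumPy sketch of the two compression steps of Eqs. (11) and (12); the sizes and the random Gaussian matrices standing in for $\Phi_1$ and $\Phi_2$ are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 64, 64                        # image size (hypothetical)
m, n = 16, 16                        # compressed sizes, m < M and n < N (hypothetical)

x = rng.random((M, N))               # image, M x N
Phi1 = rng.standard_normal((m, M))   # row-compression matrix, m x M
Phi2 = rng.standard_normal((N, n))   # column-compression matrix, N x n

z = Phi1 @ x                         # row compression (Eq. (11)):    z is m x N
y = z @ Phi2                         # column compression (Eq. (12)): y is m x n
```

Compared with the 1DCS sketch above, the two small matrix products replace one multiplication by a dense $mn \times MN$ matrix.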
If the sparsity of $x$ is appropriately harnessed, the reconstruction of $x$ from $y$ is guaranteed.
The 2DCS reconstruction requires two reconstruction steps, namely column reconstruction and row reconstruction, as follows.
(S1) Column reconstruction
$$z_{row,i} = \operatorname*{arg\,min}_{z_{row,i} \in \mathbb{R}^{1 \times N}} \left\| \Psi(z_{row,i}) \right\|_1 \quad \text{subject to} \quad y_i = z_{row,i}\,\Phi_2, \qquad \forall i = 1, 2, \ldots, m \tag{13}$$
where $z_{row,i}$ is the $i$th recovered row of $z = \Phi_1 x$, $y_i$ is the $i$th row of $y$, and $\Psi(\cdot)$ is a sparsifying transformation, which transforms a target vector or matrix (not explicitly sparse) into a sparse one. For image data, $\Psi$ could be the TV (Total Variation) transform. If the target vector $x$ is already sparse itself, then $\Psi$ is the identity transformation.
After the above step, $z = \Phi_1 x$ is recovered.
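The constraint in Eq. (13) is a standard one-dimensional CS measurement model in disguise; transposing it makes this explicit (a simple identity, stated here only for clarity):
$$y_i = z_{row,i}\,\Phi_2 \iff y_i^{T} = \Phi_2^{T}\, z_{row,i}^{T},$$
so each row of $z$ is recovered from the $n$-dimensional measurement vector $y_i^{T}$ with measurement matrix $\Phi_2^{T} \in \mathbb{R}^{n \times N}$, exactly as in 1DCS. This is also why Algorithm 1 below operates on $Y = y^{T}$ with $D = \Phi_2^{T}$.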
(S2) Row reconstruction
$$x_j = \operatorname*{arg\,min}_{x_j \in \mathbb{R}^{M}} \left\| \Psi(x_j) \right\|_1 \quad \text{subject to} \quad z_{col,j} = \Phi_1 x_j, \qquad \forall j = 1, 2, \ldots, N \tag{14}$$
where $x_j$ is the recovered $j$th column of $x$ and $z_{col,j}$ is the $j$th column of $z$.
After the two steps, x is recovered.
To be more specific, given a (column) vector $u$ which is not explicitly sparse (e.g., $u$ is a vector from image data), and its measurements $b = Du$, the reconstruction of $u$ from $b$ and $D$ can be implemented via TV (Total Variation) minimization [36]. TV minimization is defined as follows:
$$u = \operatorname*{arg\,min}_{u} \sum_i \left\| D_i(u) \right\|_1 \quad \text{subject to} \quad Du = b \tag{15}$$
where $D_i(u)$ is the discrete gradient vector of $u$ at position $i$.
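To make the $\mathrm{TV}(D, b)$ subroutine used below concrete, here is a minimal sketch using the CVXPY modeling package as a stand-in for the dedicated TV solver of [36]; the function name tv_reconstruct is ours and does not come from the original paper.

```python
import cvxpy as cp
import numpy as np

def tv_reconstruct(D, b):
    """Solve Eq. (15): minimize sum_i |D_i(u)| subject to D u = b,
    treating D_i(u) as the 1-D forward difference u[i+1] - u[i]."""
    M = D.shape[1]
    u = cp.Variable(M)
    objective = cp.Minimize(cp.tv(u))   # total variation of a 1-D vector
    constraints = [D @ u == b]
    cp.Problem(objective, constraints).solve()
    return u.value
```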
Hereinafter, given a projection matrix $D$ and a vector $b$, we denote the solution of Eq. (15) by $\mathrm{TV}(D, b)$. Thus, we summarize our algorithm for 2DCS image reconstruction via TV minimization as Algorithm 1.
Algorithm 1. 2DCS Image Reconstruction via TV minimization
Input: Projection matrices $\Phi_1 \in \mathbb{R}^{m \times M}$, $\Phi_2 \in \mathbb{R}^{N \times n}$ and measurements $y \in \mathbb{R}^{m \times n}$.
Output: Reconstructed $x \in \mathbb{R}^{M \times N}$.
1: $Y \leftarrow y^{T}$; $D \leftarrow \Phi_2^{T}$;
2: for $i \leftarrow 1$ to $m$ do    ▷ Column reconstruction
3:   $b \leftarrow Y(i)$;    ▷ $Y(i)$ is the $i$th column of matrix $Y$
4:   $U(i) \leftarrow \mathrm{TV}(D, b)$;    ▷ $U(i)$ is the $i$th column of matrix $U$
5: end for
6: $z \leftarrow U^{T}$;    ▷ Column reconstruction completed
7: $D \leftarrow \Phi_1$;
8: for $i \leftarrow 1$ to $N$ do    ▷ Row reconstruction
9:   $b \leftarrow z(i)$;    ▷ $z(i)$ is the $i$th column of matrix $z$
10:  $x(i) \leftarrow \mathrm{TV}(D, b)$;    ▷ $x(i)$ is the $i$th column of matrix $x$
11: end for
12: return $x$;    ▷ Row reconstruction completed
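A minimal NumPy sketch of Algorithm 1 follows, assuming the tv_reconstruct helper from the previous sketch; the wrapper name reconstruct_2dcs and the intermediate variable names are ours, chosen to mirror the pseudocode.

```python
import numpy as np

def reconstruct_2dcs(Phi1, Phi2, y):
    """Recover x from y = Phi1 @ x @ Phi2 via the two TV stages of Algorithm 1."""
    m, M = Phi1.shape              # Phi1 is m x M
    N, n = Phi2.shape              # Phi2 is N x n
    # Column reconstruction (Eq. (13)): recover each row of z = Phi1 @ x.
    Y = y.T                        # Y[:, i] is the transposed ith row of y
    D = Phi2.T                     # since y_i^T = Phi2^T @ z_{row,i}^T
    U = np.column_stack([tv_reconstruct(D, Y[:, i]) for i in range(m)])
    z = U.T                        # recovered z, m x N
    # Row reconstruction (Eq. (14)): recover each column of x from z = Phi1 @ x.
    D = Phi1
    x = np.column_stack([tv_reconstruct(D, z[:, i]) for i in range(N)])
    return x
```

With the forward-sampling sketch above, reconstruct_2dcs(Phi1, Phi2, y) returns an $M \times N$ array that approximates $x$ when $x$ is sufficiently sparse under the TV transform.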
3. NCSC: Nearest Constrained Subspace Classifier
In this section, we extend NN, NFL and NS to a unified classifier called NCSC (Nearest Constrained Subspace Classifier), in which the employed constrained subspaces, with a tuned intrinsic dimension parameter, are better approximations to the data manifolds than those of NN, NFL and NS.
3.1. Manifold perspective and manifold approximation
From the geometric point of view, the vectors representing natural images of the same class generally reside on (or near) a low-dimensional geometric structure known as a manifold, embedded in the high-dimensional feature space [37–39]. If the data manifolds of all the classes can be learned, then it would be possible to design more effective classifiers. The concept of a manifold has long been a powerful analytical tool for understanding image classes, for example images of human faces or handwritten digits [40–42].
In the last decade, several well-known manifold learning algorithms have emerged, such as ISOMAP [37], LLE (Locally Linear Embedding) [38], Laplacian Eigenmaps [39], Hessian Eigenmaps (HLLE) [43], Maximum Variance Unfolding (MVU) [44] and Local Tangent Space Alignment (LTSA) [45]. However, they are not designed to solve the problem of classifying new images. Although some works attempt to deal with this problem [46–48], their algorithms are all unsupervised and designed for a single manifold rather than for multiple manifolds. These algorithms are therefore unsuitable for supervised multi-class classification, in which each class is modeled by a manifold.
As discussed in Section 1.3, NM (Nearest Manifold), with the nearest distance criterion, is believed to be optimal in terms of classification accuracy. However, since NM is unavailable in practice, we argue that approximation strategies should be exploited. Since the training data are points on the manifolds, the manifolds can be accurately approximated if there are enough well-distributed training data.
From this viewpoint, we argue that, to achieve an accurate manifold approximation, it is necessary to make the intrinsic dimension of the employed constrained subspace equal to the intrinsic dimension of the corresponding manifold $\mathcal{M}_i$ ($\forall i = 1, \ldots, K$, given $K$ classes). Otherwise the accuracy of the approximation to the manifold cannot be guaranteed. We call this criterion the dimension equality.