1934 W. Li et al. / Neurocomputing 275 (2018) 1932–1945
based on spatiotemporal shape variation. It captures the motion
information over time and represents normalized frame differ-
ences over a gait cycle. The dynamic texture descriptors and lo-
cal binary patterns from three orthogonal planes were used to de-
scribe the human gait in a spatiotemporal manner by Kellokumpu
et al. [31] . Kusakunniran et al. [32] used higher-order shape config-
uration based on a differential composition model for cross-speed
gait recognition. Jure Kova
ˇ
c
and Peter Peer [33] investigated the
influence of walking speed variation to different gait recognition
approaches and proposed normalization based on geometric trans-
formations to mitigate the influence in gait recognition. Mansur
et al. [34] proposed to model speed change using a cylindrical
manifold whose azimuth and height correspond to the phase and
the stride, respectively. Huang et al. [35] presented a scheme com-
posed of a speed-invariant gait template (SIGT) and a normalized
hypergraph classifier for cross-speed gait recognition. These methods
used the shape and dynamic information of frames for recognition.
However, extracting the spatiotemporal shape and motion informa-
tion frame by frame is time-consuming, and the extracted features
can be sensitive to noise.
Kusakunniran [36,37] proposed the histogram of
space-time interest point descriptors (HSD) as a gait feature. Cas-
tro et al. [38] proposed the pyramidal Fisher motion (PFM) de-
scriptors by combining densely sampled local features and Fisher
vectors for both single-view and multi-view gait recognition. Un-
like most appearance-based methods that rely on human silhou-
ettes obtained from the foreground-background segmentation, the
HSD-based and the PFM-based methods extract gait features di-
rectly from the raw gait videos.
Recently, deep learning techniques have been employed for gait
recognition [39–42]. Alotaibi and Mahmood [39] proposed a spe-
cialized deep convolutional neural network (CNN) for gait recog-
nition. Their CNN architecture consists of multiple convolutional
and sub-sampling layers, making the gait recognition scheme ro-
bust against certain types of variations. Yan et al. [40] proposed
to use convolutional neural networks (ConvNets) with a multi-task
learning (MTL) model to identify human gait and predict multi-
ple human attributes simultaneously. Castro et al. [41] proposed
to use CNN to learn high-level descriptors from low-level motion
features (i.e. optical flow components) for gait recognition. Zhang
et al. [42] proposed a Siamese neural network (SiaNet) with dis-
tance metric learning for gait recognition. In general, the perfor-
mance of these methods is highly dependent on the training sam-
ples. With sufficient training samples containing rich variances,
these methods can learn effective features automatically, leading
to good recognition performance.
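As a concrete illustration, the distance metric learning in a Siamese network is commonly driven by a contrastive loss of the following form (a standard formulation given here for clarity; the exact loss used in [42] may differ):

```latex
L(x_1, x_2, y) = y\, d^2 + (1 - y)\, \max(0,\, m - d)^2,
\qquad d = \lVert f(x_1) - f(x_2) \rVert_2 ,
```

where $f(\cdot)$ denotes the shared embedding network, $y = 1$ if the two gait samples $x_1$ and $x_2$ come from the same subject and $y = 0$ otherwise, and $m > 0$ is a margin hyperparameter that pushes embeddings of different subjects apart.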
2.2. Multi-View Gait Recognition
Compared with single-view gait recognition, the appear-
ance variation caused by multiple viewing angles poses even
greater challenges for robust gait recognition. Many algorithms
have been proposed to address the viewing angle variation
[11,12,27,43–53]. These methods can be divided into three
types: methods [11,43–45,47,49] based on view transformation
models (VTMs), methods [46,51,52] based on pairwise projection
by canonical correlation analysis (CCA) or nonlinear coupled map-
pings (NCMs) and the others [12,27,38,48,50,53] without learning
specific VTMs or specific pairwise subspaces.
Makihara et al. [11] used frequency-domain features and
view transformation models (VTMs) for multi-view gait recog-
nition. Kusakunniran et al. [43] exploited VTMs based
on optimized GEIs for further performance improvement. Zheng
et al. [44] considered VTMs using the partial least squares on
the GEI, which offered more robust performance against varia-
tions in viewing angle, clothing and object carrying. Muramatsu
et al. [45] proposed a VTM-based approach by using transfor-
mation consistency measures for cross-view recognition. More-
over, Muramatsu et al. [47] proposed multiple quality measures
for VTM-based cross-view gait recognition. The key idea is to as-
sociate the quality measures with how well the test subjects’
gait features are represented by a joint subspace spanned by the
training subjects’ gait features. Still further, Mura-
matsu et al. [49] proposed an arbitrary view transformation model
(AVTM) to match a pair of gait traits from an arbitrary view. These
VTM-based methods achieved high performance in multi-view
gait recognition. However, they all rest on the assumption that
the viewing angles of the gallery and the probe sets are known
a priori, which imposes a strong restriction on gait applications.
Besides, it is burdensome to learn a specific VTM for every pair of
views, and the recognition rate is highly dependent on the density
of view sampling.
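To make the VTM idea concrete, a common formulation (following the singular-value-decomposition construction in [11]; the notation here is illustrative) factorizes the matrix of training gait features across views and subjects, so that the feature of subject $m$ under view $\theta_j$ is approximated by a view-dependent projection of a view-independent subject vector:

```latex
g_{\theta_j}^{m} \approx P_{\theta_j} v^{m},
\qquad
g_{\theta_j}^{m} \approx P_{\theta_j} P_{\theta_i}^{+} \, g_{\theta_i}^{m},
```

where $P_{\theta}$ is the projection matrix learned for view $\theta$ and $P_{\theta_i}^{+}$ denotes its pseudo-inverse; the second relation transforms a gait feature observed under view $\theta_i$ into view $\theta_j$ so that gallery and probe can be matched in a common view.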
Kusakunniran et al. [46] carried out motion co-clustering to par-
tition the most related parts of gaits from different views into
the same group. Inside each group, a linear correlation between
gait information across views is further maximized through CCA.
Xing et al. [51] proposed a complete canonical correlation analy-
sis (C3A) method to deal with multi-view gait recognition. As re-
ported in these papers, methods of this type currently achieve the
highest recognition rates among all the multi-view gait recognition
methods. Ben et al. [52] proposed a nonlinear coupled map-
pings (NCMs) algorithm to match gaits across domains. The
relationships within the training data are modeled as nodes of
a graph in the kernel space, and a constraint is designed to
minimize the difference between the cross-domain gaits of the
same subject. However, these methods also assume that the
viewing angles of the gallery and the probe sets are known
a priori. Besides, they need to learn a projection subspace,
through CCA, C3A or NCMs, for every pair of views, and the
recognition rate is highly dependent on the density of view sam-
pling. All of these factors form a strong barrier to the practical use
of this type of method.
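For reference, the CCA criterion underlying these pairwise projections seeks, for gait features $x$ and $y$ observed under two views, projection directions that maximize the correlation of the projected features (a standard formulation, stated here for clarity):

```latex
(w_x^{*}, w_y^{*}) = \arg\max_{w_x,\, w_y}
\frac{w_x^{\top} \Sigma_{xy} w_y}
{\sqrt{w_x^{\top} \Sigma_{xx} w_x}\; \sqrt{w_y^{\top} \Sigma_{yy} w_y}} ,
```

where $\Sigma_{xy}$ is the cross-covariance of the two views’ features and $\Sigma_{xx}$, $\Sigma_{yy}$ are the within-view covariances; recognition is then performed in the common projected space, which is why a separate subspace must be learned for every pair of views.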
Yu et al. [12] proposed a framework for gait recognition perfor-
mance evaluation, and employed the GEI and the nearest neigh-
bor classifier for multi-view gait recognition. Dupuis et al. [27] and
Choudhury et al. [50] both adopted a two-step hierarchical recog-
nition procedure which, for a probe sample, first predicts its
viewing angle and then finds the match in the predicted
subset of the gallery. These algorithms do not require any prior
knowledge about the probe or the gallery samples, but their
recognition rates are dependent on the prediction accuracy and
the completeness of the gallery subsets. Makihara et al. [48] de-
scribed a method of multi-view discriminant analysis with tensor
representation (MvDATER) for multi-view gait recognition. How-
ever, there must be sufficient training samples for it to learn mul-
tiple view-specific projection matrices. Besides, the tensor repre-
sentation is sensitive to large viewing angle changes. Wu et al.
[53] conducted multi-view gait recognition via similarity learning
by deep CNN. They trained deep networks to recognize the most
discriminative changes of gait patterns by a small group of labeled
multi-view human walking videos.
Despite all the above-mentioned efforts, the achieved multi-
view gait recognition rates are still relatively low. This is largely
due to the fact that the viewing angle variation brings larger intra-
class variances than other types of variation. Furthermore, the
viewing angle variation on top of other variation types adds even
more complications to the problem.
2.3. Cooperative vs. Uncooperative Settings
Most algorithms described above are designed for a cooperative
experimental setting, where the covariate conditions are known as