KEY JOINTS SELECTION AND SPATIOTEMPORAL MINING FOR SKELETON-BASED
ACTION RECOGNITION
Zhikai Wang¹, Chongyang Zhang¹,²∗, Wu Luo¹, and Weiyao Lin¹
¹School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
²Shanghai Key Lab of Digital Media Processing and Transmission, Shanghai 200240, China
∗Corresponding email: sunny_zhang@sjtu.edu.cn
ABSTRACT
Trajectories and spatiotemporal attention models have been used successfully in skeleton-based action recognition. Most existing methods focus on temporal structure mining. However, only a few local joints and their position features (e.g., critical position changes of the hand, head, or leg) are responsible for the action label. In this work, we introduce a novel action recognition framework using Key Joints Selection and Spatiotemporal Mining, which identifies key joints and uses their position & velocity histograms together with trajectory features for action classification. First, histograms of human joint position and velocity are developed to enhance the spatiotemporal structure representation of existing trajectory-based methods. Second, key joints are selected according to their information gain, and their position & velocity histograms are weighted and combined with trajectory features to form one richer representation for final action classification. Experiments on two widely-tested benchmark datasets show that, by combining the strengths of richer features and key joint selection, our method achieves state-of-the-art or competitive performance compared with existing results that use sophisticated models such as deep learning, with advantages in recognition accuracy and robustness.
Index Terms— Action recognition, key joints, position &
velocity histograms, spatiotemporal mining, skeleton
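As a concrete illustration of the proposed features, the following is a minimal sketch of per-joint position & velocity histograms, assuming skeleton sequences stored as T x J x 3 arrays; the bin count and per-coordinate histogramming are hypothetical choices here, not the paper's exact normalization or bin settings.

import numpy as np

def position_velocity_histograms(joints, n_bins=8):
    """Per-joint position & velocity histograms for one skeleton sequence.

    joints: array of shape (T, J, 3) -- T frames, J joints, (x, y, z).
    Returns one concatenated histogram feature per joint, shape (J, 6*n_bins).
    """
    T, J, _ = joints.shape
    velocity = np.diff(joints, axis=0)             # frame-to-frame displacement, (T-1, J, 3)
    feats = []
    for j in range(J):
        hists = []
        for signal in (joints[:, j, :], velocity[:, j, :]):
            for d in range(3):                     # histogram each coordinate separately
                h, _ = np.histogram(signal[:, d], bins=n_bins)
                hists.append(h / max(h.sum(), 1))  # normalize to a distribution
        feats.append(np.concatenate(hists))
    return np.stack(feats)                         # (J, 6*n_bins)

The per-joint histograms sketched here would then be weighted and concatenated with trajectory features to form the richer representation described above.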
1. INTRODUCTION
Action recognition has attracted much attention due to its importance in many applications. Thanks to the development of commodity RGB-D cameras, skeleton-based action recognition has recently drawn considerable attention in the computer vision community [1, 2]. Although recent advances in deep convolutional networks (ConvNets) have brought improvements in action recognition [3], it remains challenging because such networks require a large number of labeled videos for training [4], while most available datasets, especially skeleton-based 3D action datasets, are relatively small. Thus, traditional handcrafted-feature-based methods are still useful for 3D action recognition.
In recent years, many learning-based methods have been proposed for skeleton-based action recognition. Three categories of approaches are often used: spatial modeling, temporal modeling, and spatiotemporal modeling. Modeling in the spatial domain is mainly driven by the fact that an action is usually characterized only by the interactions or combinations of a subset of skeleton joints [5]. In HBRNN [6], skeletons are decomposed into five parts and a hierarchical recurrent neural network is built to model the relationships among these parts. Similarly, in [7] a part-aware model is proposed to construct the relationships between body parts. In SMIJ [8], the most informative joints are selected simply based on measures such as the mean or variance of joint angle trajectories. In the temporal domain, temporal pyramid matching [9], dynamic time warping [10], and segmentation [11] are common methods for temporal modeling. In [12], short-term and long-term temporal models are combined to form a multi-model framework. Many spatiotemporal modeling efforts have also been proposed: in [13], the LSTM model is extended to the spatiotemporal domain to analyze skeletons; spatiotemporal vectors of locally max-pooled features are developed in [14]; and spatiotemporal attention parts are selected in [2].
Good features are crucial to reliable action recognition. Although the features developed in existing works have brought large improvements in many domains, most of the above methods pay more attention to temporal trajectory features while largely ignoring the spatial patterns of key local parts: the position and velocity distributions of the key body parts. Without these patterns, they have limited ability to precisely differentiate ambiguous fine-grained action classes, whose inter-class trajectory differences are subtle. For example, the actions of Horizontal-arm-wave and High-arm-wave have similar hand trajectories. Conversely, the hand height histograms of these two actions differ notably (Fig. 1), which can be used to distinguish them more easily.
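Because only a subset of joints carries such discriminative statistics, joints can be ranked by how much their histogram features reduce class uncertainty, which is the idea behind the information-gain selection mentioned in the abstract. The following is a minimal sketch, assuming each joint's histogram has first been discretized into codeword indices (a hypothetical preprocessing step, e.g. by clustering); it is an illustration of entropy-based information gain, not the paper's exact procedure.

import numpy as np

def entropy(labels):
    """Shannon entropy of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(codes, labels):
    """Information gain of class labels given one joint's feature codes.

    codes:  (N,) discretized feature code of each sample for this joint.
    labels: (N,) action class of each sample.
    """
    gain = entropy(labels)
    for c in np.unique(codes):
        mask = codes == c
        gain -= mask.mean() * entropy(labels[mask])  # subtract weighted conditional entropy
    return gain

def select_key_joints(joint_codes, labels, k=5):
    """Rank joints by information gain and keep the top k.

    joint_codes: (N, J) discretized feature code per sample and joint.
    """
    gains = np.array([information_gain(joint_codes[:, j], labels)
                      for j in range(joint_codes.shape[1])])
    return np.argsort(gains)[::-1][:k]

The selected joints' weighted histograms would then be composed with trajectory features for the final classifier.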
In another case, actions with inverse part activities, such as pull and push, are easily confused due to their similar trajectory and