Spatio-Temporal Graph Routing: A New Method for Skeleton-Based Action Recognition


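"This research paper explores spatio-temporal graph routing for skeleton-based action recognition. It aims to improve existing skeleton-based recognition techniques by adaptively learning the intrinsic high-order correlations among skeleton joints, thereby improving recognition accuracy and robustness. The authors propose a novel spatio-temporal graph routing (STGR) scheme, composed of a spatial graph router and a temporal graph router, that can capture the dynamic interactions of physically separated skeleton joints."

In computer vision and artificial intelligence, skeleton-based action recognition is a key research direction with wide application in security surveillance, human-computer interaction, and sports analysis. Conventional approaches rely on a skeleton structure with fixed physical connections, which often fails to capture the intrinsic high-order correlations among joints and so limits recognition performance.

The STGR scheme removes the restriction of a fixed skeleton structure: it learns and routes spatio-temporal information between skeleton nodes to uncover complex action patterns. The spatial graph router focuses on the relationships between joints along the spatial dimension, while the temporal graph router handles the dynamics across frames, capturing the continuity of an action. Combining the spatial and temporal dimensions gives a better account of how an action coheres and how the joints move.

Specifically, the spatial graph router discovers connectivity among joints through sub-group clustering along the spatial dimension, yielding a variable graph structure that can reflect non-physical links between joints. The temporal graph router measures the degree of correlation between joint trajectories over time, which helps capture how an action evolves.

The paper may also discuss how STGR performs in practice, including the experimental design, the choice of datasets (such as NTU RGB+D or Kinetics), accuracy comparisons with other methods, and possible limitations and directions for future research. Work of this kind both improves action recognition accuracy and offers a new perspective on the complexity of skeleton data.

In short, the paper "Spatio-Temporal Graph Routing for Skeleton-Based Action Recognition" proposes a new framework that uses an adaptive graph routing strategy to improve the understanding and recognition of complex human actions.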

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Spatio-Temporal Graph Routing for Skeleton-Based Action Recognition

Bin Li, Xi Li²∗, Zhongfei Zhang, Fei Wu

College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, China

College of Computer Science and Technology, Zhejiang University, Hangzhou, China

{bin li, xilizju, zhongfei}@zju.edu.cn wufei@cs.zju.edu.cn

Abstract

Owing to its representational effectiveness, skeleton-based human action recognition has received considerable research attention and has a wide range of real applications. In this area, many existing methods rely on a fixed physical-connectivity skeleton structure for recognition, which cannot adequately capture the intrinsic high-order correlations among skeleton joints. In this paper, we propose a novel spatio-temporal graph routing (STGR) scheme for skeleton-based action recognition, which adaptively learns the intrinsic high-order connectivity relationships for physically-apart skeleton joints. Specifically, the scheme is composed of two components: a spatial graph router (SGR) and a temporal graph router (TGR). The SGR aims to discover the connectivity relationships among the joints based on sub-group clustering along the spatial dimension, while the TGR explores the structural information by measuring the correlation degrees between temporal joint node trajectories. The proposed scheme is naturally and seamlessly incorporated into the framework of graph convolutional networks (GCNs) to produce a set of skeleton-joint-connectivity graphs, which are further fed into the classification networks. Moreover, an insightful analysis of the receptive field of graph nodes is provided to explain the necessity of our method. Experimental results on two benchmark datasets (NTU-RGB+D and Kinetics) demonstrate its effectiveness against the state of the art.
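To make the two routing ideas in the abstract more concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The abstract does not specify the clustering procedure or the correlation measure, so plain k-means on single-frame joint coordinates stands in for the sub-group clustering, and cosine similarity of mean-centred joint trajectories stands in for the correlation degree. The function names, `n_groups`, and `threshold` are hypothetical choices made for illustration.

```python
import numpy as np


def spatial_subgroup_adjacency(frame, n_groups=3, n_iters=10, seed=0):
    """Connect joints that fall in the same spatial cluster (plain k-means).

    frame: array of shape (V, C), one C-dimensional coordinate per joint.
    Returns a binary (V, V) adjacency matrix that includes self-loops.
    Assumption: k-means is only a stand-in for the paper's clustering.
    """
    frame = np.asarray(frame, dtype=float)
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating the centres never touches `frame`.
    centers = frame[rng.choice(len(frame), size=n_groups, replace=False)]
    for _ in range(n_iters):
        # Assign each joint to its nearest cluster centre.
        dists = np.linalg.norm(frame[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Move each centre to the mean of its members (skip empty clusters).
        for k in range(n_groups):
            if np.any(labels == k):
                centers[k] = frame[labels == k].mean(axis=0)
    return (labels[:, None] == labels[None, :]).astype(float)


def temporal_correlation_adjacency(seq, threshold=0.5):
    """Connect joints whose trajectories are strongly correlated over time.

    seq: array of shape (T, V, C).
    Assumption: cosine similarity of mean-centred trajectories is used as
    the correlation measure; the paper's own measure may differ.
    """
    seq = np.asarray(seq, dtype=float)
    num_joints = seq.shape[1]
    # Remove each joint's mean position so only its motion is compared.
    centered = seq - seq.mean(axis=0, keepdims=True)
    traj = centered.transpose(1, 0, 2).reshape(num_joints, -1)  # (V, T*C)
    norms = np.linalg.norm(traj, axis=1, keepdims=True)
    unit = traj / np.maximum(norms, 1e-8)  # static joints become zero vectors
    corr = unit @ unit.T
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)
    return adj


# Toy clip with random coordinates: 30 frames, 25 joints, 3D positions.
rng = np.random.default_rng(0)
seq = rng.normal(size=(30, 25, 3))
a_spatial = spatial_subgroup_adjacency(seq[0], n_groups=5)
a_temporal = temporal_correlation_adjacency(seq, threshold=0.3)
print(a_spatial.shape, a_temporal.shape)
```

Either matrix can link joints that share no bone in the physical skeleton, which is the property the abstract attributes to routing. In a GCN-based pipeline such matrices would typically be used alongside the physical adjacency rather than replacing it.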

Introduction

As a challenging problem in computer vision, skeleton-based human action recognition takes 3D human body coordinates as input and outputs an action class, and has attracted increasing attention recently (Wang et al. 2018b). Typically, human body skeletons characterize the geometric body configuration as a rigid body, and their dynamics capture motion patterns in a continuous way. This dynamic geometric structure expresses relations among the joints not only spatially but also temporally. A graph representation is therefore the natural way to express the intrinsic human structure, and it is crucial to automatically represent joints on the given graph. The recent success of Spatial Temporal Graph Convolution Networks (ST-GCN) (Yan, Xiong, and Lin 2018) has justified the effectiveness of a graph aggregation scheme with the physical human skeleton, compared with existing approaches such as pseudo images (Wang et al. 2018a; Xie et al. 2018) and variants of LSTM (Shahroudy et al. 2016; Song et al. 2017; Liu et al. 2017).

Figure 1: Illustration of three routing ways: (a) fixed routing by physical connections; (b) spatial routing by considering local clustering; (c) temporal routing by modeling the correlation degrees of node trajectories. [Figure labels: neighbour, sub-group, correlated; nodes shown at Layer L and Layer L + 1.]

∗ Corresponding author: Xi Li

© 2019, Association for the Advancement of Artificial Intelligence
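As a rough illustration of graph aggregation over a fixed physical skeleton, the sketch below applies one graph-convolution step with the symmetric normalization commonly used with GCNs. It is a simplified, single-frame sketch and not the ST-GCN implementation. The five-joint chain is a hypothetical toy skeleton, not the NTU-RGB+D or Kinetics layout, and the function names are illustrative.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Build D^{-1/2} (A + I) D^{-1/2} from an undirected edge list."""
    adj = np.eye(num_joints)  # self-loops keep each joint's own feature
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def graph_conv(features, adj_norm, weight):
    """One aggregation step: mix neighbour features, then a linear map + ReLU.

    features: (V, C_in), adj_norm: (V, V), weight: (C_in, C_out).
    """
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Hypothetical toy skeleton: a chain of five joints, 0-1-2-3-4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adj_norm = normalized_adjacency(edges, num_joints=5)

# Put a unit feature on joint 0 and watch how far it spreads per layer.
x = np.zeros((5, 1))
x[0, 0] = 1.0
w = np.eye(1)
h1 = graph_conv(x, adj_norm, w)
h2 = graph_conv(h1, adj_norm, w)
print(np.flatnonzero(h1[:, 0]), np.flatnonzero(h2[:, 0]))
```

With a fixed adjacency, each layer extends a joint's reach by only one hop along the bones, so joints that are far apart on the skeleton exchange information only after many layers.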

In general, the graph-based method applies a fixed human skeleton to the graph convolution operation and iteratively aggregates the hidden feature with neighbourhood features. However, it is challenging to capture the changeable human structure in complex scenes. This raises three problems for further improvement: 1) The skeleton itself is changeable and depends on the specific dataset, e.g., 25 joints in NTU-RGB+D (Shahroudy et al. 2016) but 18 joints in Kinetics (Kay et al. 2017), resulting in confusion about the real human skeleton; 2) The joint connections are highly unbalanced. While torso joints become over-smoothed, limb joints may still be under-smoothed, which makes feature sharing between two limb joints extremely difficult; 3) A global graph structure is applied to each sample, raising the question “one size
