A New View- and Time-Invariant Action Recognition Method Based on Depth Videos
This article discusses the research area of view- and time-invariant action recognition based on depth videos. In recent years, hand-crafted feature methods have made relatively limited progress in human action recognition (HAR) on conventional RGB videos. However, the emergence of low-cost depth cameras provides additional information for action recognition tasks. Compared with RGB videos, depth video sequences are less sensitive to lighting changes and more discriminative in vision tasks such as segmentation and activity recognition.

In response, the authors propose an effective and straightforward action recognition method for depth videos that focuses on skeleton joint information. First, by analyzing the joints in the depth video, they compute three feature vectors that capture the angle and position information between joints. These feature vectors contain key information about the human pose and effectively reflect the dynamic characteristics of an action.

Next, the three groups of feature vectors are fed into three separate support vector machine (SVM) classifiers. As a nonlinear classification model, the SVM performs well on high-dimensional data and small sample sets, and can effectively distinguish different action categories from complex skeletal motion patterns. This approach exploits the unique advantages of depth video, reduces the dependence on lighting conditions, and improves the robustness of action recognition.

Finally, by combining the multiple feature vectors and SVM classifiers, the method achieves view-invariant and time-invariant action recognition, meaning that the same action type can be recognized accurately even under different viewpoints or at different temporal scales. This technique has significant value and potential for many practical applications, such as smart homes, surveillance systems, and virtual reality interaction.

In summary, the core contribution of this article is a novel action recognition framework for depth videos that combines depth-based skeleton features with SVM classifiers to achieve recognition that is robust to illumination and viewpoint changes, offering a new possibility for improving performance beyond RGB video action recognition.
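To make the feature construction above concrete, here is a minimal sketch of how angle- and position-based joint features of this kind might be computed for a single skeleton frame. The joint indices, the choice of joint triples for the angles, and the use of the torso length for normalization are illustrative assumptions; they are not specified in the excerpt.

```python
import numpy as np

# Hypothetical joint indices for a Kinect-style 20-joint skeleton (layout assumed).
HIP_CENTER, SHOULDER_CENTER = 0, 2
SHOULDER_R, ELBOW_R, HAND_R = 8, 9, 11
HIP_R, KNEE_R, FOOT_R = 16, 17, 19

def hip_center_based_vector(joints):
    """Joint positions relative to the hip center, scale-normalized by the
    hip-center-to-shoulder-center distance, flattened into one vector."""
    rel = joints - joints[HIP_CENTER]                      # (num_joints, 3)
    torso = np.linalg.norm(joints[SHOULDER_CENTER] - joints[HIP_CENTER])
    return (rel / (torso + 1e-8)).ravel()

def joint_angle(a, b, c):
    """Angle (radians) at joint b formed by the segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def angle_vector(joints, triples):
    """Angles for a chosen set of joint triples."""
    return np.array([joint_angle(joints[a], joints[b], joints[c])
                     for a, b, c in triples])

def pairwise_relative_positions(joints):
    """Differences between every unordered pair of joints (a PRPV-style feature)."""
    diff = joints[:, None, :] - joints[None, :, :]         # (J, J, 3)
    iu, ju = np.triu_indices(len(joints), k=1)
    return diff[iu, ju].ravel()

# Example: one frame of 20 joints with random 3D coordinates.
frame = np.random.rand(20, 3)
triples = [(SHOULDER_R, ELBOW_R, HAND_R), (HIP_R, KNEE_R, FOOT_R)]
hcbv = hip_center_based_vector(frame)      # position feature, length 60
av = angle_vector(frame, triples)          # angle feature, one value per triple
prpv = pairwise_relative_positions(frame)  # pairwise positions, length 190 * 3
```

Because all three vectors are built from relative positions and angles rather than absolute camera coordinates, descriptors of this form change little when the viewpoint changes, which is the property the summary attributes to the method.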
资源详情
资源推荐
An effective view and time-invariant action recognition method based on depth videos

Zhi Liu¹, Xin Feng¹, Yingli Tian²

¹ College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400050, China
² Department of Electrical Engineering, The City College of New York, New York, NY 10031, USA

liuzhi@cqut.edu.cn, xfeng@cqut.edu.cn, ytian@ccny.cuny.edu
Abstract—Little progress has been achieved in hand-crafted feature based human action recognition (HAR) for RGB videos in recent years. The emergence of low-price depth cameras provides more information for action recognition. Compared to RGB videos, depth video sequences are less sensitive to light changes and more discriminative in many vision tasks such as segmentation and activity recognition. In this paper, we propose an effective and straightforward HAR method using the skeleton joint information of the depth sequence. First, we calculate three feature vectors which capture angle and position information between joints. Then, the obtained vectors are used as the inputs of three separate support vector machine (SVM) classifiers. Finally, action recognition is conducted by fusing the SVM classification results. Our features are view-invariant because the extracted vectors contain only angle and normalized position information based on joint coordinates. By normalizing action videos with different temporal lengths to a fixed size using interpolation, the extracted features have the same dimension for different videos and still keep the principal movement patterns, which makes the proposed method time-invariant. Experimental results demonstrate that our method achieves comparable results on the UTKinect-Action3D dataset, and is more efficient and simpler than state-of-the-art methods.
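The temporal normalization described in the abstract can be illustrated with a short sketch: per-coordinate linear interpolation resamples a variable-length sequence of per-frame features to a fixed number of frames. The target length of 30 frames and the (frames × dimensions) layout are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

def resample_sequence(seq, target_len=30):
    """Linearly interpolate a (T, D) per-frame feature sequence to target_len frames,
    so that clips of different durations yield fixed-size descriptors."""
    T, D = seq.shape
    src = np.linspace(0.0, 1.0, T)
    dst = np.linspace(0.0, 1.0, target_len)
    out = np.empty((target_len, D))
    for d in range(D):                       # interpolate each coordinate independently
        out[:, d] = np.interp(dst, src, seq[:, d])
    return out

# Example: a 47-frame clip of 60-D per-frame features becomes 30 frames.
video_features = np.random.rand(47, 60)
fixed = resample_sequence(video_features, target_len=30)   # shape (30, 60)
descriptor = fixed.ravel()                                  # fixed-dimension vector
```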
I. INTRODUCTION
HAR plays an important role in many applications such as video surveillance, human-computer interaction, video retrieval, etc. In the past several years, progress on various visual recognition tasks has been based mostly on hand-crafted features including the scale-invariant feature transform (SIFT) [1], histograms of oriented gradients (HOG) [2], and the motion history image (MHI) [3]. However, most canonical visual recognition algorithms just build ensemble systems and employ minor variants of successful methods; it is generally acknowledged that progress has been slow in recent years [4]. Fortunately, the low-cost depth camera prompts researchers to reconsider problems of image processing and computer vision [5]. Different from RGB cameras, which capture color and texture information, depth cameras record depth information along with geometric and skeleton joint information. In addition, depth cameras are insensitive to light changes and more discriminative than color and texture features in many problems such as segmentation and activity recognition.

In this paper, we propose an effective and straightforward HAR method that uses only skeleton joint information. The proposed method extracts angle and normalized position information from skeleton joint coordinates to form feature vectors, which makes it view-invariant. By normalizing action videos with different lengths to a fixed size using interpolation, the extracted features have the same dimension for different videos and keep the principal movement patterns, which makes the proposed method time-invariant. Experimental results demonstrate that our method achieves comparable results on the UTKinect-Action3D dataset but is more efficient and simpler than the state-of-the-art methods. The key contributions of this work are summarized as follows:
1) We propose an effective and simple method for action recognition that uses only skeleton joint information from depth video sequences. Experimental results demonstrate that the proposed method is time- and view-invariant.
2) Two hand-crafted joint feature vectors, the hip-center-based vector (HCBV) and the angle vector (AV), are proposed, and the pairwise relative position vector (PRPV) [6] is improved.
3) By fusing the classification results from the three hand-crafted features, the proposed method achieves recognition accuracy comparable to the state-of-the-art methods while being more efficient and simpler (see the fusion sketch after this list).
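As a rough illustration of contribution 3, the sketch below trains one SVM per feature type and fuses their outputs. The use of scikit-learn, an RBF kernel, and probability averaging as the fusion rule are assumptions made for the example; the excerpt does not state the classifier settings or the exact fusion scheme.

```python
import numpy as np
from sklearn.svm import SVC

def train_per_feature_svms(feature_sets, labels):
    """Train one SVM per hand-crafted feature type (e.g. HCBV, AV, PRPV)."""
    return [SVC(kernel="rbf", probability=True).fit(X, labels) for X in feature_sets]

def fuse_predict(classifiers, feature_sets):
    """Average the per-classifier class probabilities and pick the best class."""
    probs = np.mean([clf.predict_proba(X) for clf, X in zip(classifiers, feature_sets)],
                    axis=0)
    return classifiers[0].classes_[np.argmax(probs, axis=1)]

# Example with random data: 3 feature types, 40 training clips, 5 action classes.
rng = np.random.default_rng(0)
y_train = rng.integers(0, 5, size=40)
train_feats = [rng.random((40, d)) for d in (60, 20, 570)]  # HCBV / AV / PRPV dims (assumed)
test_feats = [rng.random((8, d)) for d in (60, 20, 570)]

svms = train_per_feature_svms(train_feats, y_train)
predictions = fuse_predict(svms, test_feats)
```

Other fusion rules, such as majority voting over the three per-classifier labels, would fit the same pipeline; the choice here is only one plausible option.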
The remainder of this paper is organized as follows: Section II reviews the related work. Section III presents the details of the three hand-crafted features. The experimental results and discussions are presented in Section IV. Finally, Section V concludes the paper.