Human Action Recognition With Trajectory Based
Covariance Descriptor In Unconstrained Videos
Hanli Wang∗, Yun Yi, Jun Wu
Department of Computer Science and Technology, Tongji University, Shanghai, China
Key Laboratory of Embedded System and Service Computing, Ministry of Education,
Tongji University, Shanghai, China
{hanliwang,13yunyi,wujun}@tongji.edu.cn
ABSTRACT
Human action recognition from realistic videos plays a key role in multimedia event detection and understanding. In this paper, a novel Trajectory Based Covariance (TBC) descriptor is proposed, which is formulated along dense trajectories. To map the descriptor matrix into a vector space and trim out data redundancy, the TBC descriptor matrix is projected to Euclidean space by the Logarithm Principal Components Analysis (LogPCA). Our method is tested on the challenging Hollywood2 and TV Human Interaction datasets. Experimental results show that the proposed TBC descriptor outperforms three baseline descriptors (i.e., histogram of oriented gradient, histogram of optical flow and motion boundary histogram), and our method achieves better recognition performance than a number of state-of-the-art approaches.
Categories and Subject Descriptors
I.2.10 [Artificial Intelligence]: Vision and Scene Understanding
General Terms
Algorithms, Experimentation, Performance
Keywords
TBC Descriptor; Motion Trajectory; LogPCA; Covariance
∗ H. Wang is the corresponding author. This work was supported in part by the National Natural Science Foundation of China under Grant 61472281, the “Shu Guang” project of Shanghai Municipal Education Commission and Shanghai Education Development Foundation under Grant 12SG23, the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. GZ2015005), and the Fundamental Research Funds for the Central Universities under Grant 0800219270.
MM’15, October 26–30, 2015, Brisbane, Australia. © 2015 ACM. ISBN 978-1-4503-3459-4/15/10 $15.00. DOI: http://dx.doi.org/10.1145/2733373.2806310
1. INTRODUCTION
The past few years have witnessed the great success of social networks and multimedia technologies, leading to the generation of vast amounts of Internet videos. To organize these videos well and provide value-added services to users, it is increasingly important to automatically understand human activities from videos. The success of many applications (e.g., intelligent visual surveillance, human-computer interaction, video retrieval and smart cameras) is conditioned on the accuracy of human action recognition. A number of research studies have focused on this challenging topic, such as [1, 2, 3], to name a few.
A typical human action recognition algorithm generally consists of two main processing steps. The first is feature extraction, in which the human action is described by feature vectors. The second is detection, in which the feature vectors are utilized for event classification. This paper focuses on the first step by describing human actions with dense trajectories and Riemannian manifolds.
The major contributions of this work are summarized as follows. First, a novel Trajectory Based Covariance (TBC) descriptor is proposed to describe human actions. Unlike other covariance descriptors, the TBC descriptor is formulated along the dense trajectories, which enhances its ability to describe human actions. Second, the TBC descriptor is projected to Euclidean space by the Logarithm Principal Components Analysis (LogPCA) to further improve its describing ability; a minimal illustration of this mapping is sketched at the end of this section.
The rest of this paper is organized as follows. The proposed TBC descriptor for human action recognition is introduced in Section 2. The experimental setup and results are presented in Section 3. Finally, Section 4 concludes this paper.
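To make the LogPCA projection mentioned above concrete, the following minimal sketch (our own Python illustration; the function names, the ridge term and the use of upper-triangle vectorization are our assumptions, not details taken from the paper) maps symmetric positive definite covariance descriptors into Euclidean space by taking the matrix logarithm and then applying PCA to the vectorized result.

    import numpy as np
    from scipy.linalg import logm
    from sklearn.decomposition import PCA

    def log_map(cov, eps=1e-6):
        # Matrix logarithm of an SPD covariance matrix; the small ridge eps
        # is added only for numerical stability (our choice).
        cov = cov + eps * np.eye(cov.shape[0])
        return logm(cov).real

    def log_pca(cov_list, n_components=64):
        # Hypothetical LogPCA: vectorize the upper triangle of each log-mapped
        # covariance, then reduce dimensionality with ordinary PCA.
        d = cov_list[0].shape[0]
        iu = np.triu_indices(d)
        X = np.stack([log_map(c)[iu] for c in cov_list])
        pca = PCA(n_components=min(n_components, X.shape[0], X.shape[1]))
        return pca, pca.fit_transform(X)

The matrix logarithm flattens the Riemannian manifold of covariance matrices onto the vector space of symmetric matrices, so standard Euclidean tools such as PCA can then be applied; the PCA dimensionality used by the authors is not assumed here.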
2. TBC DESCRIPTOR
2.1 Descriptor Formulation
As stated in [3], the dense trajectory based method with camera motion estimation is able to achieve excellent human action recognition performance on challenging datasets. Inspired by this, the proposed TBC descriptor is designed along dense trajectories. Given a video, it is first divided into $N$ trajectories $T = \{T_1, T_2, \cdots, T_N\}$, and a trajectory can be defined as

$$T_n^{\tau} = \{R_1(W, H), R_2(W, H), \cdots, R_L(W, H)\}, \quad (1)$$
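As a concrete illustration of formulating a covariance descriptor along one trajectory, the sketch below (our own Python illustration, not the authors' code) computes the sample covariance over per-point feature vectors; it assumes a d-dimensional feature vector, e.g., gradient and optical flow statistics from the region $R_l(W, H)$, is already available at each of the $L$ trajectory points.

    import numpy as np

    def trajectory_covariance(features):
        # features: array of shape (L, d), one d-dimensional feature vector per
        # trajectory point; how these per-point features are extracted from the
        # regions R_l(W, H) is assumed and not shown here.
        f = np.asarray(features, dtype=np.float64)
        centered = f - f.mean(axis=0, keepdims=True)
        # Sample covariance over the L points of the trajectory:
        # a d x d symmetric positive semi-definite matrix.
        return centered.T @ centered / max(f.shape[0] - 1, 1)

    # Hypothetical usage: 15 points per trajectory, 12-dimensional per-point features.
    cov = trajectory_covariance(np.random.rand(15, 12))

A covariance matrix built this way can then be mapped to Euclidean space with the LogPCA step sketched in Section 1.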