TaiChi: A Fine-Grained Action Recognition Dataset
Shan Sun, Feng Wang∗, Qi Liang, Liang He
Shanghai Key Laboratory of Multidimensional Information Processing
Dept. of Computer Science and Technology, East China Normal University
52141201004@stu.ecnu.edu.cn, fwang@cs.ecnu.edu.cn, 51151201039@stu.ecnu.edu.cn, lhe@cs.ecnu.edu.cn
ABSTRACT
In this paper, we introduce TaiChi, a fine-grained action dataset. It consists of unconstrained user-uploaded web videos containing camera motion and partial occlusions, which pose new challenges to fine-grained action recognition compared to the existing datasets. In this dataset, 2,772 samples of 58 fine-grained action classes are manually annotated. Additionally, we provide baseline action recognition results using the state-of-the-art Improved Dense Trajectory features and Fisher Vector representation, achieving a Mean Average Precision (MAP) of 51.39%.
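The baseline above is evaluated with Mean Average Precision (MAP). As a rough illustration of the metric only (this is not the authors' evaluation code, and the function names are ours), MAP averages, over all action classes, the average precision of a confidence-ranked retrieval list:

```python
# Illustrative sketch of Mean Average Precision (MAP).
# Names and structure here are assumptions for exposition,
# not the evaluation code used in the paper.

def average_precision(ranked_relevance):
    """AP for one class: ranked_relevance is a list of 0/1 relevance
    flags, ordered by classifier confidence (most confident first)."""
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0

def mean_average_precision(per_class_rankings):
    """MAP: the AP averaged over all action classes."""
    aps = [average_precision(r) for r in per_class_rankings]
    return sum(aps) / len(aps)
```

For example, with two classes whose ranked results are `[1, 0, 1]` and `[0, 1]`, the per-class APs are 5/6 and 1/2, giving a MAP of 2/3.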
KEYWORDS
Fine-grained action recognition dataset; Tai Chi; benchmark
dataset
ACM Reference format:
Shan Sun, Feng Wang∗, Qi Liang, Liang He. 2017. TaiChi: A Fine-Grained Action Recognition Dataset. In Proceedings of ICMR ’17, June 6–9, 2017, Bucharest, Romania, 5 pages.
DOI: http://dx.doi.org/10.1145/3078971.3079039
1 INTRODUCTION
With the explosive growth of videos on the Internet, numer-
ous works have been devoted to automatic understanding of
the video content. Among them, human action recognition
attracts a lot of research attention since it is widely used in
various applications such as video surveillance, indexing, and
event recounting. Human action recognition is faced with a
number of challenges such as complex human actions, large
intra-class variability, background motion, and occlusions. A
lot of approaches have been proposed to tackle these issues.
Most existing works focus on classifying coarse-grained actions with relatively large inter-class variations, for instance, distinguishing football from basketball. Meanwhile, fine-grained human action recognition, which aims to distinguish between actions with low inter-class variability, is rarely studied. Compared
∗ Corresponding author.
Permission to make digital or hard copies of all or part of this work
for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage
and that copies bear this notice and the full citation on the first page.
Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee. Request permissions
from permissions@acm.org.
ICMR ’17, June 6–9, 2017, Bucharest, Romania
© 2017 ACM. ACM ISBN 978-1-4503-4701-3/17/06...$15.00
DOI: http://dx.doi.org/10.1145/3078971.3079039
to the traditional action recognition, fine-grained actions usually have small spatial and temporal scales. In many cases,
a fine-grained action is a part of a higher-level action, and
shares the same context with other fine-grained actions. For
instance, throwing and slam dunk are two fine-grained actions
in action basketball, and they share the same background,
actors, and objects. The traditional coarse-grained action recognition distinguishes basketball from other actions, but rarely distinguishes throwing from slam dunk. However, the
distinction of highly similar actions is necessary for many
applications. For instance, someone who wants to learn basketball from online videos would like to search for clips containing the action throwing or slam dunk in order to practice specific skills, rather than clips simply labelled as basketball.
Fine-grained action recognition presents more detailed
understanding of the video content. However, it has not been
extensively studied. One reason for the lack of research on
fine-grained action recognition is the absence of benchmark
datasets. Most existing datasets such as the KTH [1], the Weizmann [2], the Hollywood [4, 5], the UCF databases [3, 6, 7, 9] and the CCV [10] are without fine-grained labels. With
these datasets, we can distinguish basketball from football,
but cannot learn how to distinguish throwing from slam dunk. The MPII database [13] released in 2013 is finely
labelled. However, all the videos are captured in a fixed kitchen with a stationary camera and mainly focus on hand actions, which cannot meet the requirements of realistic applications. The FGA-240 [14] dataset is another
] dataset is another
fine-grained action dataset with a very large scale. However,
only the videos containing actions are released, while the
start and the end frames of each action are not annotated.
In this paper, we introduce and release a new fine-grained action dataset called TaiChi, which is composed of videos of Tai Chi sports. Tai Chi is a traditional Chinese art, practiced as a kind of sport composed of slow, soft, and continuously flowing movements. There are currently lots
continuously flowing movements. There are currently lots
of Tai Chi genres in different styles, and [
16
] provides a
glance of Tai Chi styles. The Tai Chi sport is composed of
a number of moves, and each move consists of a few basic
actions of different body parts. Different Tai Chi genres
develop different Tai Chi moves, but they share the same
basic Tai Chi actions. These basic actions can be combined in different permutations to form numerous Tai Chi moves. We take
the basic Tai Chi actions as the fine-grained actions in our
dataset which are very similar to each other. Specifically, the
TaiChi dataset contains 58 fine-grained Tai Chi actions of
different body parts such as hand, arm, leg, and foot. The