Tackling Detection Errors: The RMPE Framework Improves Multi-Person Pose Estimation Accuracy


This paper addresses multi-person pose estimation in the wild. State-of-the-art human detectors perform well, but small errors in localization and recognition are unavoidable. These errors can cause a single-person pose estimator (SPPE) to fail, especially for methods that depend solely on human detection results. The authors propose a regional multi-person pose estimation (RMPE) framework that supports pose estimation when the human bounding boxes are inaccurate. The framework has three components:

1. **Symmetric Spatial Transformer Network (SSTN)**: attached to the SPPE, it extracts a high-quality single-person region from an inaccurate bounding box.
2. **Parametric Pose Non-Maximum Suppression (NMS)**: it eliminates redundant poses by comparing pose similarity with a pose distance metric whose parameters are optimized from data.
3. **Pose-Guided Proposals Generator (PGPG)**: it augments the training samples by learning the output distribution of a human detector for different poses and simulating the generation of human bounding boxes.

Together, these components let RMPE handle inaccurate bounding boxes and redundant detections. On the MPII (multi-person) dataset the method achieves 76.7 mAP. The paper thus targets a practical weakness of two-step pipelines: pose estimation has to work even when the human detection results are slightly off.

arXiv:1612.00137v4 [cs.CV] 2 Sep 2017

RMPE: Regional Multi-Person Pose Estimation

Hao-Shu Fang¹∗, Shuqin Xie, Yu-Wing Tai, Cewu Lu¹§

Shanghai Jiao Tong University, China

Tencent YouTu

fhaoshu@gmail.com qweasdshu@sjtu.edu.cn yuwingtai@tencent.com lucewu@sjtu.edu.cn

Abstract

Multi-person pose estimation in the wild is challenging. Although state-of-the-art human detectors have demonstrated good performance, small errors in localization and recognition are inevitable. These errors can cause failures for a single-person pose estimator (SPPE), especially for methods that solely depend on human detection results. In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes. Our framework consists of three components: Symmetric Spatial Transformer Network (SSTN), Parametric Pose Non-Maximum-Suppression (NMS), and Pose-Guided Proposals Generator (PGPG). Our method is able to handle inaccurate bounding boxes and redundant detections, allowing it to achieve 76.7 mAP on the MPII (multi person) dataset [3]. Our model and source codes are made publicly available.†

1. Introduction

Human pose estimation is a fundamental challenge for computer vision. In practice, recognizing the pose of multiple persons in the wild is a lot more challenging than recognizing the pose of a single person in an image [30, 31, 21, 23, 38]. Recent attempts approach this problem by using either a two-step framework [28, 12] or a part-based framework [7, 27, 17]. The two-step framework first detects human bounding boxes and then estimates the pose within each box independently. The part-based framework first detects body parts independently and then assembles the detected body parts to form multiple human poses. Both frameworks have their advantages and disadvantages. In the two-step framework, the accuracy of pose estimation highly depends on the quality of the detected bounding boxes. In the part-based framework, the assembled human poses are ambiguous when two or more persons are too close together. Also, the part-based framework loses the capability to recognize body parts from a global pose view, because it only uses second-order dependencies between body parts.

∗ Part of this work was done when Hao-Shu Fang was a student intern at Tencent.

§ The corresponding author is Cewu Lu.

† https://cvsjtu.wordpress.com/rmpe-regional-multi-person-pose-estimation/
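The two-step framework described above can be sketched in a few lines. This is only an illustration of the control flow, not the authors' code: `detect_humans` and `estimate_single_pose` are hypothetical callables standing in for any human detector and any SPPE, and the `Box` and `Pose` type aliases are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels (assumed layout)
Pose = List[Tuple[float, float]]         # one (x, y) location per joint (assumed layout)


def two_step_pose_estimation(
    image: object,
    detect_humans: Callable[[object], List[Box]],         # hypothetical detector
    estimate_single_pose: Callable[[object, Box], Pose],  # hypothetical SPPE
) -> List[Pose]:
    """Run a detector, then a single-person estimator once per detected box."""
    poses = []
    for box in detect_humans(image):
        # Each box is handled independently, so a shifted box or a duplicate
        # box goes straight into the output as a wrong or duplicate pose.
        poses.append(estimate_single_pose(image, box))
    return poses
```

Because the loop emits exactly one pose per box, the quality of the final poses is bounded by the quality of the boxes, which is the dependency the text points out.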

Our approach follows the two-step framework. We aim to detect accurate human poses even when given inaccurate bounding boxes. To illustrate the problems of previous approaches, we applied the state-of-the-art object detector Faster-RCNN [29] and the SPPE Stacked Hourglass model [23]. Figure 1 and Figure 2 show two major problems: the localization error problem and the redundant detection problem. In fact, SPPE is rather vulnerable to bounding box errors. Even for the cases when the bounding boxes are considered as correct with IoU > 0.5, the detected human poses can still be wrong. Since SPPE produces a pose for each given bounding box, redundant detections result in redundant poses.
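To make the IoU > 0.5 criterion concrete, the following minimal sketch computes intersection over union for two axis-aligned boxes. The `(x1, y1, x2, y2)` box layout and the example coordinates are assumptions chosen for illustration; they are not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 100.0, 200.0)
detected = (30.0, 0.0, 130.0, 200.0)   # same size, shifted right by 30 px
print(iou(ground_truth, detected))      # about 0.54, so it counts as "correct"
```

In this made-up case the detection passes the 0.5 threshold even though it misses 30% of the person's width, which gives a sense of how a box that counts as correct can still hand the SPPE a badly framed input.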

To address the above problems, a regional multi-person pose estimation (RMPE) framework is proposed. Our framework improves the performance of SPPE-based human pose estimation algorithms. We have designed a new symmetric spatial transformer network (SSTN) which is attached to the SPPE to extract a high-quality single person region from an inaccurate bounding box. A novel parallel SPPE branch is introduced to optimize this network. To address the problem of redundant detection, a parametric pose NMS is introduced. Our parametric pose NMS eliminates redundant poses by using a novel pose distance metric to compare pose similarity. A data-driven approach is applied to optimize the pose distance parameters. Lastly, we propose a novel pose-guided human proposal generator (PGPG) to augment training samples. By learning the output distribution of a human detector for different poses, we can simulate the generation of human bounding boxes, producing a large sample of training data.
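The general idea of suppressing redundant poses with a pose distance can be sketched as a greedy loop. This is a simplified stand-in, not the paper's parametric pose NMS: `mean_joint_distance` is a placeholder metric chosen for illustration (this section does not define the paper's metric), and `threshold` is a hand-set hypothetical parameter, whereas the paper optimizes its distance parameters from data.

```python
import math


def mean_joint_distance(p, q):
    """Placeholder metric (an assumption): mean Euclidean distance between
    corresponding joints of two poses with the same joint ordering."""
    return sum(math.dist(a, b) for a, b in zip(p, q)) / len(p)


def pose_nms(poses, scores, threshold, distance=mean_joint_distance):
    """Greedy suppression: visit poses from highest to lowest score and keep
    a pose only if it is farther than `threshold` from every kept pose.
    Returns the indices of the kept poses."""
    order = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(distance(poses[i], poses[j]) > threshold for j in kept):
            kept.append(i)
    return kept


poses = [
    [(0.0, 0.0), (0.0, 10.0)],    # person A
    [(1.0, 0.0), (1.0, 10.0)],    # near-duplicate of person A
    [(50.0, 0.0), (50.0, 10.0)],  # person B
]
print(pose_nms(poses, scores=[0.9, 0.8, 0.7], threshold=5.0))  # [0, 2]
```

Passing the metric in as an argument keeps the suppression loop separate from the choice of distance, so a learned or parametric metric could be swapped in without changing the loop.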

Our RMPE framework is general and is applicable to different human detectors and single person pose estimators. We applied our framework on the MPII (multi-person) dataset [3], where it outperforms the state-of-the-art methods and achieves 76.7 mAP. We have also conducted ablation studies to validate the effectiveness of each pro-
