Human Detection Using Depth Information by Kinect
Lu Xia, Chia-Chih Chen and J. K. Aggarwal
The University of Texas at Austin
Department of Electrical and Computer Engineering
{xialu|ccchen|aggarwaljk}@mail.utexas.edu
Abstract
Conventional human detection is mostly performed on images
taken by visible-light cameras. These methods imitate the
detection process that humans use: they rely on gradient-based
features, such as histograms of oriented gradients (HOG), or
extract interest points in the image, such as the scale-invariant
feature transform (SIFT). In this paper, we present a novel
human detection method that uses depth information captured
by the Kinect for Xbox 360. We propose a model-based
approach, which detects humans using a 2-D head contour
model and a 3-D head surface model. We also propose a
segmentation scheme that separates the human from his/her
surroundings and extracts the whole contour of the figure based
on the detection point, and we explore a tracking algorithm
based on our detection results. The methods are tested on a
database captured with the Kinect in our lab and yield superior
results.
1. Introduction
Detecting humans in images or videos is a challenging
problem due to variations in pose, clothing, and lighting
conditions, as well as the complexity of backgrounds. There has
been much research on human detection in the past few years,
and various methods have been proposed [1, 2, 6, 13].
Most of this research is based on images taken by
visible-light cameras, which is a natural approach since it
mimics what human eyes perceive. Some methods involve
statistical training on local features, e.g., gradient-based
features such as HOG [1] and EOH [8], while others extract
interest points in the image, such as the scale-invariant feature
transform (SIFT) [9].
Although many reports have shown that these methods can
provide highly accurate human detection results, RGB-image-based
methods have difficulty perceiving the shapes of human subjects
with articulated poses or against cluttered backgrounds, which
results in a drop in accuracy or an increase in computational cost.
Depth information is an important cue when humans
recognize objects, because an object may not have
consistent color and texture but must occupy a connected
region in space. Range images have been used for object
recognition and modeling over the past few decades
[12, 14], and they have several advantages over 2D
intensity images: they are robust to changes in color and
illumination, and they are a simple representation of 3D
information. However, earlier range sensors were expensive
and, because they relied on lasers, difficult to use in human
environments. Microsoft's recently launched Kinect is
inexpensive and easy to use; since it does not share the
drawbacks of laser scanners, it can be used in human
environments and facilitates research in human detection,
tracking, and activity analysis.
In recent years, a body of research has addressed the
problems of human body-part detection, pose estimation, and
tracking from 3D data. Earlier research used stereo cameras
to estimate human poses or perform human tracking [3, 4,
15]. In the past few years, part of this research has focused
on time-of-flight (TOF) range cameras, and many
algorithms have been proposed to address pose estimation
and motion capture from range images [5, 7, 11, 16].
Ganapathi et al. [5] present a filtering algorithm to
track human poses using a stream of depth images captured
by a TOF camera. Jain et al. [7] present a model-based
approach for estimating human poses by fusing depth and
RGB color data. Recently, there have also been several works
on human and body-part detection using TOF cameras. Plagemann
et al. [10] use a novel interest point detector to detect and
identify body parts in depth images. Ikemura et al. [6] propose
a window-based human detection method using relational depth
similarity features computed from depth information.
In this paper, we present a novel model-based method for
human detection from depth images. Our method detects
people in indoor environments using depth information
obtained by the Kinect. People are detected with a two-stage
head detection process, which combines a 2D edge detector
and a 3D shape detector to exploit both the edge information
and the relational depth-change information in the depth
image. We also propose a segmentation method that separates
the figure from the background objects attached to it and
extracts the overall contour of the subject accurately.
The method is evaluated on a 3D dataset captured in our lab
using the Kinect for Xbox 360 and achieves excellent
results.
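As a rough illustration of this two-stage idea, the sketch below (Python with NumPy) marks large depth discontinuities as candidate head locations and then verifies each candidate against an approximately hemispherical depth surface. The thresholds, the assumed focal length, the crude candidate scan, and the function names are illustrative assumptions only; they are not the actual 2D head contour and 3D head surface models developed in this paper.

import numpy as np

HEAD_RADIUS_M = 0.10   # assumed average head radius in meters
EDGE_THRESH_M = 0.25   # assumed depth-discontinuity threshold in meters
FOCAL_PX = 570.0       # assumed Kinect-like focal length in pixels

def depth_edges(depth):
    # Stage 1 helper: mark large depth discontinuities as edge pixels.
    gy, gx = np.gradient(depth)
    return np.hypot(gx, gy) > EDGE_THRESH_M

def candidate_heads(depth, step=8):
    # Stage 1 (placeholder): coarsely scan the edge map and keep locations
    # surrounded by enough depth edges to suggest a head-sized contour.
    edges = depth_edges(depth)
    h, w = depth.shape
    candidates = []
    for y in range(step, h - step, step):
        for x in range(step, w - step, step):
            window = edges[y - step:y + step, x - step:x + step]
            if depth[y, x] > 0 and window.mean() > 0.1:
                candidates.append((y, x))
    return candidates

def verify_head_3d(depth, y, x):
    # Stage 2 (placeholder): check that the local depth surface is roughly
    # hemispherical, using a sphere of HEAD_RADIUS_M centered behind the
    # candidate point as the reference surface.
    h, w = depth.shape
    z0 = depth[y, x]
    r_px = max(int(HEAD_RADIUS_M * FOCAL_PX / max(z0, 1e-3)), 3)
    if y - r_px < 0 or x - r_px < 0 or y + r_px >= h or x + r_px >= w:
        return False
    ys, xs = np.ogrid[-r_px:r_px + 1, -r_px:r_px + 1]
    rho2 = (ys ** 2 + xs ** 2) / float(r_px ** 2)
    mask = rho2 <= 1.0
    expected = z0 + HEAD_RADIUS_M * (1.0 - np.sqrt(np.clip(1.0 - rho2, 0.0, 1.0)))
    patch = depth[y - r_px:y + r_px + 1, x - r_px:x + r_px + 1]
    valid = mask & (patch > 0)
    if not valid.any():
        return False
    return np.median(np.abs(patch - expected)[valid]) < 0.05   # 5 cm tolerance

def detect_heads(depth):
    # Run the two stages in sequence on a metric depth image (meters).
    return [(y, x) for (y, x) in candidate_heads(depth)
            if verify_head_3d(depth, y, x)]

In the actual method, the candidate and verification stages above are replaced by matching against the 2-D head contour model and the 3-D head surface model, respectively.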