Predicting Pedestrian Counts in Crowded Scenes: Applying Rich, High-Dimensional Features


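"This article examines methods for predicting pedestrian counts in crowded scenes, particularly in traffic surveillance images and videos. For dense crowds, conventional techniques that track individual pedestrians have limited effectiveness, so researchers have turned to statistical learning algorithms that infer pedestrian counts directly from visual features of images or scenes. The article describes a system that makes pedestrian counting considerably more practical by using rich, high-dimensional features. Although these features lead to regression problems in a high-dimensional space, the authors apply dimensionality-reduction learning techniques to handle this while still predicting pedestrian counts with high accuracy. Experiments validate the approach."

The paper addresses a key problem in intelligent transportation systems: estimating the number of pedestrians from surveillance images and videos. In densely crowded scenes, tracking individuals is difficult, so conventional methods perform poorly. The authors therefore propose a statistical learning approach that does not rely on tracking individuals and instead predicts the total count directly from visual features of the image or scene.

The key techniques are a rich feature set and learning in a high-dimensional space. A more comprehensive feature set captures more of the visual information relevant to counting, but it also makes the regression problem high-dimensional. To handle this, the paper applies dimensionality reduction; its index terms name kernel dimension reduction (KDR). Reducing the dimensionality while preserving the model's predictive power lowers the risk of overfitting and improves accuracy.

The index terms also list ensemble learning and Gaussian processes, both widely used in statistical modeling and prediction. Ensemble learning combines multiple weak learners into a stronger one, which improves stability and generalization. A Gaussian process is a nonparametric probabilistic model for regression and classification that is well suited to uncertain and noisy data.

The index term "statistical landscape features (SLFs)" suggests that the paper may also examine how to extract scene features from a statistical perspective, for example pedestrian density, crowd behavior patterns, and background context, all of which matter for accurate counting.

In summary, the paper studies statistical learning methods for pedestrian counting in crowded scenes with rich, high-dimensional features, and uses dimensionality reduction and ensemble learning to improve prediction accuracy and efficiency. The work has both theoretical and practical value for monitoring pedestrian flow in intelligent transportation systems.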

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 12, NO. 4, DECEMBER 2011 1037

Predicting Pedestrian Counts in Crowded Scenes With Rich and High-Dimensional Features

Junping Zhang, Member, IEEE, Ben Tan, Fei Sha, and Li He

Abstract—Estimating the number of pedestrians in surveillance images and videos has important applications in intelligent transportation systems. This problem is particularly challenging when the scenes are densely crowded, in which case techniques for tracking a single pedestrian have limited effectiveness. Alternative approaches employ statistical learning algorithms to infer pedestrian counts directly from visual features computed on images or scenes. In this paper, we describe a system for predicting pedestrian counts that significantly extends the utility of those ideas. Our approach incorporates a richer set of features for statistical modeling. While these features give rise to regression problems in a high-dimensional space, we leverage learning techniques to reduce dimensionality while still attaining high accuracy for predicting the number of pedestrians. Empirical results have validated our strategy. Specifically, our system outperforms state-of-the-art methods on standard benchmark tasks by a large margin.

Index Terms—Ensemble learning, Gaussian processes, kernel dimension reduction (KDR), pedestrian counting, statistical landscape features (SLFs).

Manuscript received March 19, 2010; revised December 3, 2010 and February 23, 2011; accepted March 10, 2011. Date of publication April 19, 2011; date of current version December 5, 2011. This work was supported in part by the 973 Program under Project 2006CB705506 and Project 2010CB327900; by the National Science Foundation of China under Grant 60975044; by Fudan University Key Laboratory Senior Visiting Scholarship; and by the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, under Contract RCS2008007. The Associate Editor for this paper was S. Tang.

J. Zhang is with Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China, and also with the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China (e-mail: jpzhang@fudan.edu.cn).

B. Tan is with Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China (e-mail: tanben@yeah.net).

F. Sha is with the Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089 USA (e-mail: feisha@usc.edu).

L. He is with Yahoo! Labs China, Beijing 100083, China (e-mail: sigmastudio@gmail.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TITS.2011.2132759

I. INTRODUCTION

Estimating the number of pedestrians has many applications in intelligent transportation systems. Pedestrian counts have been used to optimize the design of traffic infrastructures and manage practices for transportation and pedestrian safety [1], [2]. In emergency response systems, counting pedestrians with high accuracy provides timely and valuable feedback to guide mass evacuation [3]. Of particular interest is to automatically infer pedestrian counts from surveillance images and videos. Such interest has been instigated by widespread deployments of surveillance video cameras in public areas [4].

Automatic inference of pedestrian counts from surveillance images is a challenging task for image processing and computer vision. Broadly speaking, two types of approaches have been investigated. The first type depends on reliable tracking of individual pedestrians [5], [6]. These methods are well suited for images with a small number of pedestrians (and moving objects). When the density of the pedestrian crowd increases, the performance of tracking-based methods starts to deteriorate. This is often attributed to significant occlusion in the scenes, as well as large variations in pedestrian appearance, including height, clothing, accessories, etc. With these complicating factors, detecting individual pedestrians quickly becomes impractical.

A more scalable approach is to directly estimate the counts without identifying individuals in complicated scenes. Intuitively, one views the task of estimating pedestrian counts (or crowd density) as a regression problem. In the regression model, the inputs (i.e., the covariates) are visual features computed on images, and the output (i.e., the response variable) is the pedestrian count or the crowd density. Parameters of these regression models can be estimated from training data, i.e., images annotated with a known number of pedestrians.
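
The regression view described above can be illustrated with a minimal sketch. It is not the system described in this paper: the feature vectors are synthetic, and the ridge-regularized linear model, the helper names `fit_count_regressor` and `predict_counts`, and every numeric value are assumptions chosen only to show how parameters are estimated from annotated images and then used for prediction.

```python
import numpy as np


def fit_count_regressor(features, counts, ridge=1e-3):
    """Fit a ridge-regularized linear map from image features to counts."""
    # Append a constant column so the model can learn an offset.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    # Closed-form ridge solution: (X^T X + ridge * I)^-1 X^T y.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ counts)


def predict_counts(weights, features):
    """Predict pedestrian counts for new feature vectors."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ weights


rng = np.random.default_rng(0)

# Hypothetical training set: 200 annotated images, 5 features per image.
train_features = rng.uniform(0.0, 1.0, size=(200, 5))
assumed_weights = np.array([12.0, 3.0, 0.0, 7.5, 1.0])  # synthetic ground truth
train_counts = train_features @ assumed_weights + 2.0 + rng.normal(0.0, 0.1, 200)

weights = fit_count_regressor(train_features, train_counts)

# Evaluate on held-out synthetic images.
test_features = rng.uniform(0.0, 1.0, size=(50, 5))
test_counts = test_features @ assumed_weights + 2.0
predicted = predict_counts(weights, test_features)
mean_abs_error = float(np.mean(np.abs(predicted - test_counts)))
print(f"mean absolute error: {mean_abs_error:.3f}")
```

Any regressor with a fit and predict step could stand in for the linear model here; the point is only that covariates are per-image features and the response is the annotated count.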

Davies et al. examined this kind of approach with geometrical features such as areas (the number of pixels occupied) and perimeters (the number of pixels in the edges) [7]. They used a linear regression model between features and pedestrian counts. Since object sizes depend on view angles and distances between imaging planes of cameras and pedestrians, Ma et al. [8] and Chan et al. [9] studied these issues and proposed methods to normalize the effect of imaging differences. There have also been experiments with other types of features. For instance, the Minkowski fractal dimension of edges, which describes the irregularity of edges, was shown to correlate with denseness of pedestrians in the images [10].
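
As a rough illustration of the features mentioned above, the sketch below computes area and perimeter from a binary foreground mask and estimates the Minkowski dimension of an edge map by box counting. These are assumptions made for illustration, not the procedures of [7] or [10]: the 4-neighbour definition of an edge pixel, the box sizes, and the function names `area_and_perimeter` and `box_counting_dimension` are all hypothetical choices.

```python
import numpy as np


def area_and_perimeter(mask):
    """Return (area, perimeter) of a binary foreground mask, in pixels."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior if its four 4-neighbours are all foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    edge = mask & ~interior  # foreground pixels touching the background
    return int(mask.sum()), int(edge.sum())


def box_counting_dimension(edge_mask, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the Minkowski (box-counting) dimension of a non-empty edge map."""
    edge_mask = np.asarray(edge_mask, dtype=bool)
    rows, cols = edge_mask.shape
    counts = []
    for size in box_sizes:
        # Zero-pad so the image tiles evenly into size x size boxes.
        h = -(-rows // size) * size
        w = -(-cols // size) * size
        padded = np.zeros((h, w), dtype=bool)
        padded[:rows, :cols] = edge_mask
        boxes = padded.reshape(h // size, size, w // size, size)
        # Count the boxes that contain at least one edge pixel.
        counts.append(int(boxes.any(axis=(1, 3)).sum()))
    # The slope of log N(s) against log(1/s) is the dimension estimate.
    log_inv_size = np.log(1.0 / np.asarray(box_sizes, dtype=float))
    slope, _ = np.polyfit(log_inv_size, np.log(counts), 1)
    return float(slope)


# Toy foreground blob: a filled 4 x 6 rectangle in a 10 x 10 image.
demo_mask = np.zeros((10, 10), dtype=bool)
demo_mask[2:6, 1:7] = True
demo_area, demo_perimeter = area_and_perimeter(demo_mask)
print(demo_area, demo_perimeter)
```

A straight line yields a dimension estimate near 1 and a filled region near 2, so more irregular, space-filling edge maps score higher.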

Dong et al. [11] built a lookup table that maps silhouettes to pedestrian counts and the pedestrians' configuration in the crowd, so that each silhouette corresponds to a pair of pedestrian count and configuration. Each silhouette is represented by sampling some points along its external boundary and then transforming these points to the frequency domain using a discrete Fourier transform. While this method can be fast and accurate, it only works well when the pedestrian density is small enough that each connected region contains only a few pedestrians, as suggested by the empirical studies reported in [11].
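
The boundary representation described above resembles a classical Fourier descriptor, which can be sketched as follows. This is an illustrative assumption and not the exact procedure of [11]: the number of samples, the number of retained coefficients, the normalization, and the name `fourier_descriptor` are hypothetical choices.

```python
import numpy as np


def fourier_descriptor(boundary_xy, num_samples=64, num_coeffs=8):
    """Describe a closed boundary by its low-frequency DFT magnitudes.

    boundary_xy: array of shape (n, 2) holding ordered (x, y) boundary
    points of a non-degenerate shape (more than one distinct point).
    """
    boundary_xy = np.asarray(boundary_xy, dtype=float)
    # Sample points along the boundary at evenly spaced indices.
    idx = np.linspace(0, len(boundary_xy), num_samples, endpoint=False).astype(int)
    points = boundary_xy[idx]
    # Encode each (x, y) point as the complex number x + iy.
    z = points[:, 0] + 1j * points[:, 1]
    spectrum = np.fft.fft(z)
    spectrum[0] = 0.0                   # drop the mean: translation invariance
    magnitudes = np.abs(spectrum)       # drop the phase: rotation/start-point invariance
    magnitudes = magnitudes / magnitudes.max()  # normalize: scale invariance
    # Keep the lowest positive and negative frequencies.
    return np.concatenate([magnitudes[1:num_coeffs + 1], magnitudes[-num_coeffs:]])


# Toy silhouette boundary: a circle of radius 2 centred at (3, -1).
theta = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
circle = np.stack([3.0 + 2.0 * np.cos(theta), -1.0 + 2.0 * np.sin(theta)], axis=1)
circle_descriptor = fourier_descriptor(circle)
print(circle_descriptor.shape)
```

Because the descriptor is a short fixed-length vector, silhouettes of different sizes can be compared directly, which is what makes a table lookup of this kind feasible.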

In this paper, we extend these approaches and describe our system of pedestrian counting for crowded scenes. In particular, we have experimented with a rich set of features that were
