Multi-branch Dilated Convolution Network: A New Method for Static Crowd Scene Analysis


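This document discusses the paper "Static Crowd Scene Analysis via Deep Network with Multi-branch Dilated Convolution Blocks" by Haoran Liu et al. of the College of Computer and Information Engineering, Jiangxi Normal University. The work addresses crowd counting and high-quality density map estimation from a single static image, an important computer vision task, particularly in public safety surveillance and smart city applications. The authors design a deep network architecture named MDBNet (Multi-branch Dilated Convolution Network).

MDBNet follows a one-stage object detection framework, and its core component is the multi-branch dilated convolution block. The design targets the challenges of dense crowd analysis in static scenes: dilated convolution enlarges the model's receptive field while preserving a relatively high resolution, so that fine details of the crowd can be captured accurately. The dilated convolution blocks let the network handle crowds of different scales and densities and avoid the loss of pixel-level detail that ordinary convolution may suffer. With several branches working in parallel, the network extracts different context information at different feature levels, which improves its ability to judge crowd distribution in complex scenes. In addition, pre-trained convolutional layers serve as the base module, so the network starts with good feature learning ability.

The paper compares and analyzes the performance of MDBNet against other methods of the same kind, including but not limited to traditional single-branch networks and multi-scale convolution and pooling strategies. The experimental results show that MDBNet performs better on static crowd counting and density map prediction, which demonstrates the effectiveness and adaptability of the multi-branch dilated convolution block.

In summary, the paper contributes a method for static crowd scene analysis that combines deep learning with multi-branch dilated convolution, which matters for improving the accuracy and efficiency of dense crowd detection. The result offers a new solution for future static crowd analysis tasks and may encourage further research into more efficient computer vision models.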
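To make the receptive-field claim concrete, the arithmetic can be sketched as below. The effective kernel size formula is the standard one for dilated convolution; the function names and the example dilation rates are illustrative assumptions, not values taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 conv layers.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


if __name__ == "__main__":
    # Assumed example rates; the excerpt does not state the paper's rates.
    for d in (1, 2, 4):
        print(f"3x3 kernel, dilation {d}: spans {effective_kernel_size(3, d)} pixels")
    print("stacked:", stacked_receptive_field([(3, 1), (3, 2), (3, 4)]))
```

Because every layer here has stride 1 and no pooling, the output keeps the input resolution while the receptive field grows, which is the trade-off the summary describes.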

Static Crowd Scene Analysis via Deep Network with Multi-branch Dilated Convolution Blocks

Haoran Liu

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Aiwen Jiang*

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Corresponding Author: jiangaiwen@jxnu.edu.cn

Qiaosi Yi

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Xiaolin Deng

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Jianyi Wan

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Mingwen Wang

College of Computer and Information Engineering

Jiangxi Normal University

Nanchang, China

Abstract—In this paper, we propose MDBNet, a static crowd scene analysis network built on multi-branch dilated convolution blocks. It focuses on the joint task of estimating the crowd count and a high-quality density map from a single static image. The proposed MDBNet follows a one-stage object detection framework and consists of two parts: pre-trained convolutional layers as the front end for high-level feature extraction, and cascaded multi-branch dilated convolution blocks as the back end for aggregating context information over different ranges. Pixel-wise objectness probabilities are predicted and regressed to generate the density map. The proposed MDBNet is an easy-to-train model with strong learning ability. We have tested it on two public datasets (the ShanghaiTech dataset and the UCF_CC_50 dataset). On almost all evaluation criteria, the proposed method achieves superior performance. In particular, on structure quality criteria, including our newly introduced spatial adjusted mutual information measurement, MDBNet reports new state-of-the-art performance. The source code will be released upon publication of our work.
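The back end described in the abstract can be illustrated with a minimal sketch, assuming a single-channel input, one shared 3x3 kernel, dilation rates (1, 2, 3), and fusion by averaging. None of these choices come from the abstract, which does not give the block's rates, channel counts, or fusion rule; the function names are hypothetical.

```python
def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D cross-correlation with zero padding and stride 1.

    `image` is a list of rows; `kernel` is a square list of rows with odd side.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < height and 0 <= xx < width:
                        total += kernel[i][j] * image[yy][xx]
            output[y][x] = total
    return output


def multi_branch_block(image, kernel, dilations=(1, 2, 3)):
    """Run parallel dilated branches and average them (assumed fusion rule)."""
    branches = [dilated_conv2d(image, kernel, d) for d in dilations]
    height, width = len(image), len(image[0])
    return [
        [sum(b[y][x] for b in branches) / len(branches) for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # A single bright pixel shows how far each branch reaches.
    image = [[0.0] * 7 for _ in range(7)]
    image[3][3] = 1.0
    ones = [[1.0] * 3 for _ in range(3)]
    for row in multi_branch_block(image, ones, dilations=(1, 2)):
        print(" ".join(f"{value:.1f}" for value in row))
```

Each branch sees context at a different range while the output keeps the input size, so the block can be cascaded without losing resolution. A trained network would use learned, per-branch, multi-channel weights rather than the fixed shared kernel used here.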

I. INTRODUCTION

Stampedes, which occur frequently at large events around the world, have caused serious disasters. For example, many people were killed or injured in the fatal Shanghai Bund stampede during the 2015 New Year celebrations. If the population density of the scene at the time could have been accurately estimated and corresponding security measures arranged in advance, such incidents might have been effectively reduced or avoided. Therefore, accurate knowledge of the crowd size and crowd distribution in a public space is essential. With the ubiquitous installation of surveillance cameras in cities and urban areas, crowd scene analysis from images or videos has become an important practical and research topic in the computer vision community.

Since crowds are not regular across various scenes, as typically shown in Fig. 1, it is not enough to calculate the population size alone. Distribution maps can thus help us obtain more accurate and comprehensive information. The principle of crowd counting is straightforward: the count equals density times area, so the integral of a crowd density map gives the overall crowd count. In recent works, crowd estimation has developed from simple crowd counting, which outputs the number of people in the target scene, to the presentation of a density map that explicitly shows the visual patterns of the target crowd distribution. In this paper, we focus on the joint task of estimating the crowd count and a high-quality density map from a single static image.
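The relation between a density map and the crowd count can be sketched as follows, assuming the common practice of placing one normalized Gaussian at each annotated head position. The kernel size, sigma, and function names are illustrative assumptions rather than the settings used for MDBNet.

```python
import math


def gaussian_kernel(size: int, sigma: float):
    """Square Gaussian kernel normalized so that its entries sum to 1."""
    half = size // 2
    raw = [
        [math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
         for j in range(size)]
        for i in range(size)
    ]
    total = sum(map(sum, raw))
    return [[value / total for value in row] for row in raw]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Add one unit-mass Gaussian per annotated head position (row, col)."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    density = [[0.0] * width for _ in range(height)]
    for row, col in heads:
        for i in range(size):
            for j in range(size):
                y, x = row + i - half, col + j - half
                if 0 <= y < height and 0 <= x < width:
                    density[y][x] += kernel[i][j]
    return density


def crowd_count(density) -> float:
    """Discrete integral of the density map: the sum over all pixels."""
    return sum(map(sum, density))


if __name__ == "__main__":
    heads = [(5, 5), (10, 12), (8, 3)]  # hypothetical annotations
    print(round(crowd_count(density_map(20, 20, heads)), 6))
```

Each person contributes a total mass of one, so summing the map recovers the number of annotated heads. In this sketch a kernel that overlaps the image border is clipped and loses some mass, so the sum falls slightly below the true count for heads near the edge.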

Fig. 1. Images in the first row are three samples from the ShanghaiTech Part B dataset. Heat maps in the second row show the respective density maps.

Besides public safety, crowd analysis has wide applications in traffic monitoring, flow monitoring, city planning, etc.

IJCNN 2019 - International Joint Conference on Neural Networks, Budapest Hungary, 14-19 July 2019

978-1-7281-2009-6/$31.00 ©2019 IEEE

