Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction∗
Junbo Zhang (1), Yu Zheng (1,2,3,4)†, Dekang Qi (2,1)
(1) Microsoft Research, Beijing, China
(2) School of Information Science and Technology, Southwest Jiaotong University, Chengdu, China
(3) School of Computer Science and Technology, Xidian University, China
(4) Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
{junbo.zhang, yuzheng}@microsoft.com, dekangqi@outlook.com
Abstract
Forecasting the flow of crowds is of great importance to traffic
management and public safety, and very challenging as it is
affected by many complex factors, such as inter-region traffic,
events, and weather. We propose a deep-learning-based
approach, called ST-ResNet, to collectively forecast the inflow
and outflow of crowds in each and every region of a
city. We design an end-to-end structure of ST-ResNet based
on unique properties of spatio-temporal data. More specifically,
we employ the residual neural network framework to
model the temporal closeness, period, and trend properties
of crowd traffic. For each property, we design a branch of
residual convolutional units, each of which models the spatial
properties of crowd traffic. ST-ResNet learns to dynamically
aggregate the output of the three residual neural networks
based on data, assigning different weights to different
branches and regions. The aggregation is further combined
with external factors, such as weather and day of the week,
to predict the final traffic of crowds in each and every region.
Experiments on two types of crowd flows in Beijing and New
York City (NYC) demonstrate that the proposed ST-ResNet
outperforms six well-known methods.
Introduction
Predicting crowd flows in a city is of great importance to
traffic management and public safety (Zheng et al. 2014).
For instance, massive crowds of people streamed into a strip
region at the 2015 New Year’s Eve celebrations in Shanghai,
resulting in a catastrophic stampede that killed 36 people. In
mid-July of 2016, hundreds of “Pokemon Go” players ran
through New York City’s Central Park in hopes of catching
a particularly rare digital monster, leading to a dangerous
stampede there. If one can predict the crowd flow in a region,
such tragedies can be mitigated or prevented by utilizing
emergency mechanisms, such as conducting traffic control,
sending out warnings, or evacuating people, in advance.
In this paper, we predict two types of crowd flows (Zhang
et al. 2016): inflow and outflow, as shown in Figure 1(a).
Inflow is the total traffic of crowds entering a region from
other places during a given time interval. Outflow denotes
the total traffic of crowds leaving a region for other places
during a given time interval. Both flows track the transition
of crowds between regions. Knowing them is very beneficial
for risk assessment and traffic management. Inflow/outflow
can be measured by the number of pedestrians, the number
of cars driven on nearby roads, the number of people traveling
on public transportation systems (e.g., metro, bus), or all of
them together if data is available. Figure 1(b) presents an
example. We can use mobile phone signals to measure the
number of pedestrians, showing that the inflow and outflow
of $r_2$ are 3 and 1, respectively. Similarly, using the GPS
trajectories of vehicles, the two flows of $r_2$ are 0 and 3, respectively.
∗ This research was supported by NSFC (Nos. 61672399, U1401258), and the 973 Program (No. 2015CB352400).
† Corresponding author. This work was done when the third author was an intern at Microsoft Research.
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
(a) Inflow and outflow (b) Measurement of flows
Figure 1: Crowd flows in a region
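The inflow and outflow definitions above can be illustrated with a minimal sketch. It assumes each moving object (a pedestrian or a vehicle) has already been mapped to a sequence of region identifiers within one time interval; the function name `count_flows` and the sample trajectories are hypothetical and are not taken from the paper or from Figure 1(b).

```python
from collections import defaultdict

def count_flows(trajectories):
    """Count inflow and outflow per region for a single time interval.

    `trajectories` is a list of region-id sequences, one per moving object,
    in visiting order (an assumed input format for illustration).
    """
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for regions in trajectories:
        # Each consecutive pair (src, dst) is one transition between regions.
        for src, dst in zip(regions, regions[1:]):
            if src != dst:  # staying inside a region is not a flow
                outflow[src] += 1
                inflow[dst] += 1
    return dict(inflow), dict(outflow)

# Hypothetical data: three objects enter r2 and one leaves it.
sample = [["r1", "r2"], ["r1", "r2"], ["r3", "r2"], ["r2", "r4"]]
inflow, outflow = count_flows(sample)
print(inflow.get("r2", 0), outflow.get("r2", 0))
```

Counting transitions rather than positions means that the outflow of one region is always the inflow of another, which matches the statement that both flows track the transition of crowds between regions.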
Simultaneously forecasting the inflow and outflow of
crowds in each region of a city, however, is very challenging
because it is affected by the following three complex factors:
1. Spatial dependencies. The inflow of region $r_2$ (shown in
Figure 1(a)) is affected by outflows of nearby regions (like
$r_1$) as well as distant regions. Likewise, the outflow of $r_2$
would affect inflows of other regions (e.g., $r_3$). The inflow
of region $r_2$ would affect its own outflow as well.
2. Temporal dependencies. The flow of crowds in a region
is affected by earlier time intervals, both near and far. For
instance, traffic congestion occurring at 8 am will affect
traffic at 9 am. In addition, traffic conditions during morning
rush hours may be similar on consecutive workdays,
repeating every 24 hours. Furthermore, morning rush hours
may gradually happen later as winter comes: when the
temperature gradually drops and the sun rises later in the
day, people get up later and later.
3. External influence. Some external factors, such as weather
conditions and events, may change the flow of crowds
tremendously in different regions of a city.
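The three kinds of temporal dependency in item 2 (the previous hour, the same time on previous days, and slow seasonal drift) can be made concrete with a small sketch that picks which past intervals to look at when predicting interval t. This is only an illustration under stated assumptions: the function name `dependent_intervals`, the default counts, and the use of a one-week spacing for the slow drift are hypothetical choices, not details given in this section.

```python
def dependent_intervals(t, intervals_per_day, n_close=3, n_period=1, n_trend=1):
    """Return indices of past intervals grouped by temporal property.

    t                 -- index of the interval to predict
    intervals_per_day -- number of intervals in 24 hours (e.g. 48 for 30 min)
    The one-week spacing used for "trend" is an assumption for illustration.
    """
    day = intervals_per_day
    week = 7 * day
    groups = {
        # Immediately preceding intervals (8 am affects 9 am).
        "closeness": [t - i for i in range(1, n_close + 1)],
        # Same time of day on previous days (repeats every 24 hours).
        "period": [t - i * day for i in range(1, n_period + 1)],
        # Same time in previous weeks, to capture slow seasonal drift.
        "trend": [t - i * week for i in range(1, n_trend + 1)],
    }
    # Drop indices that fall before the start of the recorded data.
    return {name: [i for i in idx if i >= 0] for name, idx in groups.items()}

print(dependent_intervals(400, intervals_per_day=48))
```

Grouping the history this way keeps the input short while still covering short-range, daily, and longer-range effects, instead of feeding every past interval to a model.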