Spatio-temporal Graph Convolutional Neural Network: A Deep Learning
Framework for Traffic Forecasting
Bing Yu∗1, Haoteng Yin∗2,3, Zhanxing Zhu†3,4
1 School of Mathematical Sciences, Peking University, Beijing, China
2 Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
3 Center for Data Science, Peking University, Beijing, China
4 Beijing Institute of Big Data Research (BIBDR), Beijing, China
{byu, htyin, zhanxing.zhu}@pku.edu.cn
Abstract
The goal of traffic forecasting is to predict the future vital
indicators (such as speed, volume and density) of the local
traffic network within a reasonable response time. Due to the
dynamics and complexity of traffic network flow, typical
simulation experiments and classic statistical methods cannot
satisfy the requirements of medium- and long-term forecasting.
In this work, we propose a novel deep learning framework,
Spatio-Temporal Graph Convolutional Neural Network
(ST-GCNN), to tackle this spatio-temporal sequence forecasting
task. Instead of applying recurrent models to sequence
learning, we build our model entirely on convolutional neural
networks (CNNs) with gated linear units (GLU) and highway
networks. The proposed architecture fully employs the graph
structure of the road network and enables faster training.
Experiments show that our ST-GCNN network captures
comprehensive spatio-temporal correlations throughout complex
traffic networks and consistently outperforms state-of-the-art
baseline algorithms on several real-world traffic datasets.
Introduction
Traffic forecasting is one of the most challenging problems
in Intelligent Transportation Systems (ITS). Accurate and
timely forecasting of multi-scale traffic conditions is of
paramount importance for road users, management agencies
and the private sector. Widely used transportation services
provided by ITS, such as dynamic traffic control, route planning
and navigation services, also rely on a high-quality assessment
of future traffic network conditions at a reasonable cost.
Indicators such as speed, volume and density gathered by
various sensors reflect the general status of road traffic
conditions. Thus, those measurements are typically chosen as
the targets of traffic prediction. Based on the length of the
prediction, traffic forecasting can be divided into three scales:
short-term (5 ∼ 30 min), medium-term (30 ∼ 60 min) and
long-term (over an hour). Most prevalent approaches
perform well over short forecasting intervals. However,
because of the inherent uncertainty and complexity of traffic
flow, those methods are unsatisfactory for long-term
time-series prediction.
Previous studies on traffic prediction can be roughly
divided into two categories, namely, traditional simulation
approaches and data-driven methods. For the simulation
approaches, predicting traffic flow requires comprehensive and
meticulous systemic modeling based on physical theories and
prior knowledge (Vlahogianni 2015). Even so, such simulation
systems and tools still require massive computational power
and skillful parameter tuning to reach a steady state.
Nowadays, with the rapid development of real-time traffic data
collection methods and forms, researchers are shifting their
attention to data-driven methods that exploit the enormous
historical traffic records gathered by advanced ITS.
∗ Equal contributions.
† Corresponding author.
Classic statistical models and machine learning models
are two major representative categories of data-driven
methods. In time-series analysis, autoregressive integrated
moving average (ARIMA) is one of the most consolidated
approaches. It has been applied in various fields and was
first introduced into traffic forecasting as early as the 1970s
(Ahmed and Cook 1979). The ARIMA model can be applied
to non-stationary data, which requires an integrated
(differencing) term to make the time series stationary. Many
variants of the ARIMA model have been proposed to improve
pattern capturing and prediction accuracy, such as seasonal
ARIMA (SARIMA) (Williams and Hoel 2003) and ARIMA
with the Kalman filter (Lippi, Bertini, and Frasconi 2013).
However, the models mentioned above rely heavily on the
stationarity assumption of the time series and ignore the
spatial correlation across the traffic network. Therefore,
time-series models have limited ability to represent highly
dynamic and variable traffic flow.
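As an illustration of the integrated term in ARIMA, the following minimal sketch shows how differencing removes a linear trend from a series and how the original values can be recovered afterwards. The function names `difference` and `undifference` and the sample readings are hypothetical and chosen only for illustration; they are not taken from the cited works or from any particular library.

```python
def difference(series, d=1):
    """Return the d-th order difference of a sequence (the "I" step in ARIMA)."""
    out = [float(x) for x in series]
    for _ in range(d):
        # Each pass replaces the series by its consecutive differences.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference(diffed, first_value):
    """Invert first-order differencing, given the first original value."""
    out = [float(first_value)]
    for step in diffed:
        out.append(out[-1] + step)
    return out


# Hypothetical readings with a steady downward trend (not real traffic data).
series = [60.0, 58.0, 56.0, 54.0, 52.0, 50.0]

first_diff = difference(series, d=1)            # constant steps: trend removed
restored = undifference(first_diff, series[0])  # cumulative sum recovers series
```

In this toy case a single differencing pass (d = 1) already yields a constant series. Real traffic measurements may need a different order, or seasonal differencing as in SARIMA.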
Recently, machine learning methods have shown promising
development in traffic studies. Higher prediction accuracy
can be achieved by these non-parametric methods, including
the k-nearest neighbors algorithm (KNN), support vector
machine (SVM), and neural network (NN) models (also
referred to as deep learning models).
Deep Learning Approaches. Nowadays, deep learning
techniques, deep architectures in particular, have drawn
considerable academic and industrial attention. Deep
learning methods have been widely and successfully employed
in various tasks such as classification, pattern recognition
and object detection. In traffic prediction research, the deep
belief network (DBN) has been shown to be capable of
capturing the stochastic features and characteristics of traf-
arXiv:1709.04875v2 [cs.LG] 25 Sep 2017