Machine Vision and Applications (2017) 28:793–802
DOI 10.1007/s00138-017-0846-2
SPECIAL ISSUE PAPER
Vehicle classification for large-scale traffic surveillance videos
using Convolutional Neural Networks
Li Zhuo 1,2 · Liying Jiang 1 · Ziqi Zhu 1 · Jiafeng Li 1 · Jing Zhang 1 · Haixia Long 1
Received: 30 September 2016 / Revised: 7 March 2017 / Accepted: 6 May 2017 / Published online: 26 May 2017
© Springer-Verlag Berlin Heidelberg 2017
Abstract Vehicle classification plays an important role in
intelligent transport systems. However, conventional vehicle
classification methods are not robust to variations in
illumination, weather, and noise, so their classification
accuracy cannot meet the requirements of practical applications.
This paper therefore proposes a new vehicle classification
method using Convolutional Neural Networks, which consists of
two steps: pre-training and fine-tuning. In pre-training,
GoogLeNet is pre-trained on the ILSVRC-2012 dataset to obtain
an initial model with the corresponding connection weights. In
fine-tuning, the initial model is further fine-tuned on
VehicleDataset, a dataset of 13,700 images constructed in this
paper, to obtain the final classification model. All images in
VehicleDataset are extracted from real highway surveillance
videos and include variations in illumination, noise,
resolution, camera angle, and weather. The vehicles are divided
into six categories: bus, car, motorcycle, minibus, truck, and
van. The performance evaluation is carried out on
VehicleDataset. The experimental results show that the proposed
method avoids the complicated process of manually extracting
features and achieves an average classification accuracy of up
to 98.26%, which is 3.42% higher than conventional methods
using “Feature + Classifier”.
Keywords Vehicle classification · CNN · GoogLeNet ·
VehicleDataset · Pre-training · Fine-tuning
✉ Liying Jiang, jly102@sina.com
1 Signal and Information Processing Lab, Beijing University of Technology, Beijing, China
2 Collaborative Innovation Center of Electric Vehicles in Beijing, Beijing, China
1 Introduction
Vehicle classification is an essential part of intelligent
transport systems (ITS) [1] and can be widely applied in
traffic control, flow analysis, traffic composition, etc. In
the last few decades, vehicle classification methods have been
actively explored, including methods based on induction loop
coils [2], lasers [3], and videos [4]. With the increasing
popularity of intelligent traffic monitoring systems,
vision-based traffic monitoring has been used to support
traffic management. However, the stability of this approach is
vulnerable to environmental variations such as low
illumination, changes in camera angle, weather, and noise.
Vision-based vehicle classification can be viewed as a typical
pattern recognition problem, which can be divided into feature
extraction and classification. Visual features of vehicles are
first extracted and used to train a classifier; the resulting
classification model is then used to classify vehicles.
Based on this idea, much research has been carried out, and
various feature extraction methods and classifiers have been
proposed. Specifically, the features can be divided into global
features (color, texture, contour, etc.) and local features
(SIFT, LBPH, SURF, etc.), and the support vector machine (SVM)
is one of the most commonly used classifiers. For this
approach, selecting suitable features to represent the
properties of the vehicles is the most important step. Many
studies have shown that this approach can achieve good
performance in a specific application, but prior knowledge of
the specific task is essential. For example, LBP has proved
effective in face recognition, while its performance in some
other applications is not as good. Similarly, Gabor features
have been widely used in digit recognition, while their
effectiveness in some other tasks is limited. For pedestrian
detection, HOG (histogram of oriented gradients) can achieve
good performance.
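To make the "Feature + Classifier" idea concrete, the following is a minimal sketch, not the pipeline evaluated in this paper. It computes a basic 8-neighbour LBP histogram as the hand-crafted feature and, to stay dependency-free, substitutes a nearest-centroid rule for the SVM mentioned above. The function names, the toy images, and the class labels "smooth" and "textured" are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Offsets of the 8 neighbours, clockwise starting from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of interior pixel (r, c): a bit is 1 where neighbour >= centre."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    rows, cols = len(img), len(img[0])
    counts = Counter(lbp_code(img, r, c)
                     for r in range(1, rows - 1)
                     for c in range(1, cols - 1))
    total = (rows - 2) * (cols - 2)
    return [counts.get(k, 0) / total for k in range(256)]


def fit_centroids(features, labels):
    """Mean feature vector per class (an assumed stand-in for training an SVM)."""
    n = Counter(labels)
    sums = {}
    for feat, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, v in enumerate(feat):
            acc[i] += v
    return {label: [v / n[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, feature):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], feature))


def flat(n, value=128):
    """Toy n x n image with constant intensity."""
    return [[value] * n for _ in range(n)]


def checkerboard(n):
    """Toy n x n image alternating between 0 and 255."""
    return [[255 * ((r + c) % 2) for c in range(n)] for r in range(n)]


# Step 1: extract features. Step 2: train the classifier. Step 3: classify.
train_images = [flat(4), checkerboard(4)]
train_labels = ["smooth", "textured"]  # hypothetical class names
centroids = fit_centroids([lbp_histogram(im) for im in train_images], train_labels)
prediction = predict(centroids, lbp_histogram(checkerboard(5)))
```

The sketch also shows why feature choice dominates this family of methods: the classifier only ever sees the histogram, so any variation that the feature fails to capture, or captures too sensitively, directly limits accuracy.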