2018-01-0612 Published 03 Apr 2018
© 2018 SAE International; General Motors LLC.
Studies on Drivers’ Driving Styles Based on Inverse
Reinforcement Learning
Yuande Jiang and Weiwen Deng Jilin University
Jinsong Wang General Motors LLC
Bing Zhu Jilin University
Citation: Jiang, Y., Deng, W., Wang, J., and Zhu, B., “Studies on Drivers’ Driving Styles Based on Inverse Reinforcement Learning,”
SAE Technical Paper 2018-01-0612, 2018, doi:10.4271/2018-01-0612.
Abstract
Although advanced driver assistance systems (ADAS)
have been widely introduced in the automotive industry
to enhance driving safety and comfort, and to reduce
drivers’ driving burden, they do not in general reflect different
drivers’ driving styles, nor are they customized to individual
personalities. Such customization can be important to a comfortable
and enjoyable driving experience, and to improved market acceptance.
However, it is challenging to understand and further identify
drivers’ driving styles due to the large size and great variation
of the driving population. Previous research has mainly
adopted physical approaches in modeling drivers’ driving
behavior, which, however, are often very limited, if not
impossible, in capturing human drivers’ driving character-
istics. This paper proposes a reinforcement learning based
approach, in which the driving styles are formulated through
drivers’ learning processes from interaction with surrounding
environment. Based on the reinforcement learning theory,
driving action can be treated as maximizing a reward
function. Instead of calibrating the unknown reward function
to satisfy driver’s desired response, we try to recover it from
the human driving data, utilizing maximum likelihood
inverse reinforcement learning (MLIRL). An IRL-based longi-
tudinal driving assistance system is also proposed in this
paper. Firstly, a large amount of real-world driving data is
collected from a test vehicle, and the data is split into two sets
for training and testing purposes, respectively. Then, the
longitudinal acceleration in human driving activity is modeled
as a Boltzmann distribution. The reward function is
represented as a linear combination of kernelized basis
functions. The driving style parameter vector is estimated
using MLIRL based on the training set. Finally, a learning-
based longitudinal driving assistance algorithm is developed
and evaluated on the testing set. The results demonstrate that
the proposed method can satisfactorily reflect human drivers’
driving behavior.
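The pipeline outlined above, a Boltzmann distribution over candidate longitudinal actions, a reward that is linear in kernelized basis functions, and a maximum likelihood gradient step on the style parameter vector, can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the Gaussian RBF kernel, the feature centers, and the learning rate are assumptions made for the example.

```python
import numpy as np

def gaussian_basis(state, centers, width=1.0):
    # Kernelized basis features: phi_i(s) = exp(-||s - c_i||^2 / (2 w^2)).
    d = np.linalg.norm(state - centers, axis=1)
    return np.exp(-d**2 / (2 * width**2))

def reward(state, theta, centers):
    # Reward as a linear combination of basis functions: R(s) = theta^T phi(s).
    return theta @ gaussian_basis(state, centers)

def boltzmann_policy(next_states, theta, centers, beta=1.0):
    # P(a | s) proportional to exp(beta * R(s'_a)) over candidate accelerations,
    # where s'_a is the state reached by action a; shifted by max for stability.
    r = np.array([reward(s, theta, centers) for s in next_states])
    z = np.exp(beta * (r - r.max()))
    return z / z.sum()

def mlirl_grad_step(theta, demos, centers, lr=0.1, beta=1.0):
    # One gradient-ascent step on the log-likelihood of the demonstrated actions.
    # Each demo is (candidate next states, index of the observed action).
    grad = np.zeros_like(theta)
    for next_states, chosen in demos:
        p = boltzmann_policy(next_states, theta, centers, beta)
        phis = np.array([gaussian_basis(s, centers) for s in next_states])
        # d log P(a|s) / d theta = beta * (phi(s'_a) - E_p[phi])
        grad += beta * (phis[chosen] - p @ phis)
    return theta + lr * grad / len(demos)
```

Repeating the gradient step over the training demonstrations raises the likelihood of the observed accelerations, so the recovered parameter vector encodes the driver’s style without hand-calibrating the reward.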
Introduction
In past decades, great progress has been achieved in the
development of various technology areas, such as sensing,
computer technology, embedded systems and digital control
technology, which has further driven the advancement
of intelligent driving systems. Advanced driver assis-
tance systems (ADAS) are examples which have gained wide
applications to improve driving safety and comfort. Some
highly automated driving technologies are further on the way
to market. Autonomous driving technologies also become
frontier areas in academia and industry research. Along with
the advanced development in autonomous driving, the study
of the interaction between human (or human driver) and
machine (or intelligent systems) is becoming
increasingly prominent.
Currently, some research activities have been conducted
to take driver’s preferences and driving characteristics into
account to improve the performance of intelligent driving
systems. According to dierent purposes of use, these methods
can be classied into two categories: personalized assistance
system design, and estimation of the likely behavior of human
driven vehicle for autonomous vehicle. In ADAS, driver’s char-
acteristics are the most complicated factors which have great
impacts on the acceptance of these systems. In the design of
personalized driver assistance systems, driver’s characteristics
are considered mainly by using model-based approach and
learning-based approach. In model-based approaches, driver
behavior is modeled with a fixed structure. For example, the
car-following process is treated as a linear regression function
of several typical features in [1, 2, 3, 4], and different driving
characteristics are represented by the model parameters. Some
nonlinear models are proposed in [5, 6, 7], and a probability
weighted autoregressive exogenous (PWARX) model, an
extension of the normal linear model, is introduced in [8, 9]. In
addition, some studies assume that individual driver charac-
teristics are the result of trade-offs among several control objec-
tives [10, 11]. It can be found that the model-based approach
is implemented based on an underlying assumption that
human driving process can be modeled physically to some
extent. However, due to its inherent complexity and