Model-free optimal controller design for continuous-time nonlinear
systems by adaptive dynamic programming based on
a precompensator
Jilie Zhang a,b, Huaguang Zhang a,*, Zhenwei Liu a, Yingchun Wang a
a School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110819, PR China
b School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, PR China
article info
Article history:
Received 22 October 2013
Received in revised form 29 June 2014
Accepted 31 August 2014
Available online 20 February 2015
This paper was recommended for publication by Dr. Q.-G. Wang
Keywords:
Model-free controller
Optimal control
Precompensator
Adaptive dynamic programming
abstract
In this paper, we consider the problem of designing a controller for continuous-time nonlinear systems
whose governing equations are unknown. Using only measurements, we present two new online
schemes, based on adaptive dynamic programming (ADP), that synthesize a controller without building
or assuming a model of the system. To circumvent the need for prior knowledge of the system, a
precompensator is introduced to construct an augmented system. The corresponding
Hamilton–Jacobi–Bellman (HJB) equation is solved by adaptive dynamic programming, which combines
the least-squares technique, a neural network approximator and the policy iteration (PI) algorithm. The
main idea of our method is to sample the state, state derivative and input, and to use these samples to
update the weights of the neural network by the least-squares technique. The update process is
implemented in the framework of PI. Finally, several examples are given to illustrate the effectiveness
of our schemes.
© 2014 ISA. Published by Elsevier Ltd. All rights reserved.
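The loop summarized in the abstract (sample the state, state derivative and input; fit the value-function weights by least squares; improve the policy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the precompensator is assumed to be a pure integrator du/dt = v, the plant is a hypothetical scalar linear system used only to generate measurements, and a three-term quadratic basis stands in for the neural network approximator. All function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plant dx/dt = a*x + b*u. These values are used ONLY to
# simulate measurements; the learner below never reads a or b.
a, b = -1.0, 1.0
Q = np.eye(2)   # weight on the augmented state z = [x, u]
R = 1.0         # weight on the new input v


def measure_derivative(z, v):
    """Stand-in for a sensor: returns dz/dt for the augmented system.
    Assumed precompensator: du/dt = v (pure integrator)."""
    x, u = z
    return np.array([a * x + b * u, v])


def grad_phi(z):
    """Jacobian of the basis phi(z) = [x^2, x*u, u^2] (assumed basis)."""
    x, u = z
    return np.array([[2 * x, 0.0], [u, x], [0.0, 2 * u]])


def improved_policy(w, z):
    """v = -(1/2) R^{-1} G^T grad V(z). The augmented input matrix
    G = [0, 1]^T is known by construction, so no plant model is needed."""
    return -0.5 / R * (grad_phi(z)[:, 1] @ w)


def evaluate(policy_fn, n_samples=50):
    """Least-squares policy evaluation from sampled (z, dz/dt, v):
    solve  w^T grad_phi(z) dz/dt = -(z^T Q z + R v^2)  for w."""
    rows, targets = [], []
    for _ in range(n_samples):
        z = rng.uniform(-1.0, 1.0, size=2)
        v = policy_fn(z)
        zdot = measure_derivative(z, v)
        rows.append(grad_phi(z) @ zdot)
        targets.append(-(z @ Q @ z + R * v * v))
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return w


def policy_iteration(n_iters=10):
    # Initial policy v = -u, assumed admissible (stabilizing) for this plant.
    policy_fn = lambda z: -z[1]
    for _ in range(n_iters):
        w = evaluate(policy_fn)
        policy_fn = lambda z, w=w: improved_policy(w, z)
    return w


w = policy_iteration()
# Value function V(z) = z^T P z recovered from the fitted weights.
P = np.array([[w[0], w[1] / 2], [w[1] / 2, w[2]]])
```

In this linear-quadratic special case with noise-free samples, the iteration reduces to Kleinman's algorithm, so the recovered P can be checked against the algebraic Riccati equation of the augmented system. A nonlinear plant would need a richer basis or a neural network in place of `grad_phi`.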
1. Introduction
Optimal control problems are ubiquitous in the real world, as in
[1–4], and are among the most important problems in the control
community. Recently, with the rise of adaptive dynamic
programming (ADP), which combines adaptive control [5–7] with
dynamic programming, ADP has been applied in a variety of areas
to control systems whose model is known, as in [8–16]. However,
these control schemes are suited to systems whose dynamics can
be characterized precisely. When the plant's dynamics are poorly
modeled, the controllers cannot provide satisfactory responses.
This motivates the development of a control procedure that does
not require a model of the underlying system. Therefore, some
model-free or partially model-free schemes [17–22] have been
studied in recent years.
[17] designs a model-free controller, but it is not optimal. The
controllers designed in [18] and [19] address the case in which the
internal dynamics of the system are unknown, but the control
matrix is required; they are therefore referred to as partially
model-free controllers. The controllers designed in [20,21] and [22]
address systems without any a priori knowledge of the system.
Although these methods are effective for designing model-free
control, they have some restrictions and disadvantages. For
instance, [20] and [21] design model-free control for linear
discrete-time and continuous-time systems, respectively, but they
are restricted to linear systems and are difficult to apply to
nonlinear problems. [22] designs the model-free control by
identifying the system parameters, but the identification process is
known to respond slowly to parameter variations in the plant,
which reduces the response speed of the designed optimal control.
These restrictions and disadvantages motivate our research on
model-free control for continuous-time nonlinear systems.
Owing to the emergence of computers, most continuous-time
systems can be handled in a discrete version. Here, a
precompensator [23] is employed to eliminate the dependence on
prior knowledge of the system, as in [24,25]. Based on these ideas,
we design fully model-free control for continuous-time nonlinear
systems by two schemes, rather than by identifying the system.
One is to solve the corresponding Hamilton–Jacobi–Bellman (HJB)
equation for the augmented system with a precompensator, in a
discrete version, by adaptive dynamic programming, which
consists of the least-squares technique, a neural network
approximator (such as [26]) and policy iteration (PI)
ISA Transactions
http://dx.doi.org/10.1016/j.isatra.2014.08.018
0019-0578/© 2014 ISA. Published by Elsevier Ltd. All rights reserved.
* Corresponding author.
E-mail addresses: jilie0226@163.com (J. Zhang), hgzhang@ieee.org (H. Zhang),
jzlzw@sina.com (Z. Liu), drwangyc@gmail.com (Y. Wang).
ISA Transactions 57 (2015) 63–70