Adaptive dynamic programming-based stabilization 2091
able actuator saturation, this paper presents the
ADP-based control methods for nonlinear systems
subject to unknown actuator saturation. Thus, the
developed control method requires no a priori
knowledge of actuator saturation.
3. The optimal control is derived using only a critic
NN, rather than a dual- or triple-NN-based
architecture. Thus, it reduces the computational
burden of traditional adaptive critic designs [43,
49].
The rest of this paper is organized as follows. In Sect. 2, the problem statement is provided. In Sect. 3, the ADP-based online nominal optimal control is developed for nominal nonlinear systems; an NN-based saturation compensator is then developed to eliminate the negative effect of unknown actuator saturation, and the stability analysis is presented. In Sect. 4, two numerical examples are employed to verify the effectiveness of the proposed method. Finally, the conclusion is drawn in Sect. 5.
2 Problem statement
The considered nominal continuous-time nonlinear
systems can be described as
$$\dot{x} = f(x) + g(x)u, \tag{1}$$
where $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$ are the system state and control input vectors, respectively. $f(\cdot)$ and $g(\cdot)$ are assumed to be locally Lipschitz and differentiable in their arguments, with $f(0) = 0$, such that the solution $x(t)$ of nonlinear system (1) is unique for any given initial state $x(0) = x_0$. Nonlinear system (1) is assumed to be stabilizable, in the sense that there exists a continuous control $u$ which stabilizes the system asymptotically.
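To make the form of system (1) concrete, the following is a minimal simulation sketch. The drift `f`, input matrix `g`, feedback `u_stab`, initial state, and step size are all hypothetical choices made for illustration and do not come from the paper; they are picked only so that $f(0) = 0$ holds and a continuous stabilizing control exists. Forward Euler integration is used for simplicity.

```python
import numpy as np

def f(x):
    # Hypothetical drift term with f(0) = 0 (illustrative, not from the paper).
    return np.array([x[1], x[0] - x[1] ** 3])

def g(x):
    # Hypothetical constant input matrix, n = 2 states and m = 1 input.
    return np.array([[0.0], [1.0]])

def u_stab(x):
    # Hypothetical continuous stabilizing feedback (assumed, not the optimal control).
    return np.array([-2.0 * x[0] - 2.0 * x[1]])

def simulate(x0, control, t_final=20.0, dt=0.01):
    # Forward-Euler integration of x_dot = f(x) + g(x) u.
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(int(round(t_final / dt))):
        x = x + dt * (f(x) + g(x) @ control(x))
        traj.append(x.copy())
    return np.array(traj)

traj = simulate([1.0, -1.0], u_stab)
print("final state norm:", np.linalg.norm(traj[-1]))
```

With this particular feedback, $V = (x_1^2 + x_2^2)/2$ satisfies $\dot{V} = -2x_2^2 - x_2^4 \le 0$ along the closed loop, so the simulated state decays toward the origin.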
To better meet practical control requirements, we are concerned with the stabilization problem for continuous-time nonlinear systems subject to unknown actuator saturation, described as
$$\dot{x} = f(x) + g(x)\tau, \tag{2}$$
where $\tau = [\tau_1, \tau_2, \ldots, \tau_m]^T \in \mathbb{R}^m$ is the saturated actuator output vector, which is the control input actually applied to (2). Each component is confined between its lower and upper limits, i.e.,
$$\tau_i = \mathrm{sat}(u_i) = \begin{cases} u_{i\max}, & u_i > u_{i\max}, \\ u_i, & u_{i\min} \le u_i \le u_{i\max}, \\ u_{i\min}, & u_i < u_{i\min}, \end{cases} \tag{3}$$
where $i = 1, 2, \ldots, m$, and $u_{i\max}$ and $u_{i\min}$ are the unknown upper and lower bounds, respectively. That is to say, actuator saturation occurs if the commanded input $u_i$ falls outside the interval $[u_{i\min}, u_{i\max}]$, in which case the commanded control input cannot be fully applied to the device.
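The saturation map in (3) can be sketched in a few lines. This is an illustrative model of the actuator side only: the numerical limits below are hypothetical values chosen for the example, and in the setting of this paper they are unknown to the controller. The helper name `sat` mirrors the notation of (3).

```python
import numpy as np

def sat(u, u_min, u_max):
    # Component-wise saturation: each u_i is limited to [u_min_i, u_max_i].
    u = np.asarray(u, dtype=float)
    return np.clip(u, u_min, u_max)

# Hypothetical limits for m = 3 actuators (illustrative values only).
u_min = np.array([-1.0, -1.0, -2.0])
u_max = np.array([2.0, 1.0, 2.0])

u = np.array([3.0, -0.2, -5.0])   # commanded input
tau = sat(u, u_min, u_max)        # input actually applied: [2.0, -0.2, -2.0]
saturated = tau != u              # channels where a limit is active

print(tau, saturated)
```

Note that the limits need not be symmetric about zero, which matches the separate $u_{i\min}$ and $u_{i\max}$ in (3).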
The main purpose of this paper is to propose an NN compensation-based ADP stabilization scheme for nonlinear systems subject to unknown actuator saturation, and to ensure that all signals of the closed-loop nonlinear system (2) are uniformly ultimately bounded (UUB).
3 Online approximate optimal controller design
and stability analysis
This section is divided into three parts. The online
learning nominal optimal control scheme is presented
in the first part for nominal system (1). Then, in the sec-
ond part, a feed-forward NN compensator is developed
to tackle the unknown actuator saturation for nonlinear
system (2). In the third part, the UUB stability of the
closed-loop nonlinear system is analyzed.
3.1 Online nominal optimal control
For nominal nonlinear system (1), a feedback control $u_n(x) \in \Psi(\Omega)$ will be derived to tackle its control problem such that the closed-loop nonlinear system is stable. The objective of this optimal control problem is to find the stabilizing nominal control $u_n(x)$ to minimize the infinite-horizon cost function which is given by
$$V(x_0) = \int_0^{\infty} U(x(s), u_n(s))\,ds, \tag{4}$$
where $U(x, u_n) = x^T Q x + u_n^T R u_n$ is the utility function, $U(x, u_n) \ge 0$ for all $x$ and $u_n$ with $U(0, 0) = 0$, and $Q \in \mathbb{R}^{n \times n}$ and $R \in \mathbb{R}^{m \times m}$ are positive definite matrices. If the associated infinite-horizon cost function (4) is continuously differentiable, the infinitesimal