Model-Free Optimal Controller Design by Adaptive Dynamic Programming Based on a Precompensator


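This paper presents a model-free optimal controller design method for continuous-time nonlinear systems, using adaptive dynamic programming based on a precompensator. The method produces an optimal controller for a continuous-time nonlinear system without requiring a model of that system.

**Precompensator**
The precompensator is a key concept in the paper: it removes the need for prior knowledge of the system. It transforms the system's input signal into a form suited to the controller, so that no system model is required.

**Adaptive dynamic programming**
Adaptive dynamic programming (ADP) is an online learning algorithm. It learns and updates the control policy online, which lets the controller adapt to changes in the system.

**Hamilton–Jacobi–Bellman equation**
The Hamilton–Jacobi–Bellman (HJB) equation is a standard equation of optimal control theory that characterizes the optimal control problem. Solving the HJB equation yields the optimal control policy for the system.

**Neural network approximator**
A neural network approximator is a common function approximation method. Here it is used to approximate the solution of the HJB equation, and it can be trained and updated online.

**Policy iteration (PI) algorithm**
Policy iteration (PI) is used to update the control policy. Combined with the neural network approximator, it yields the optimal control policy for the system.

**Experimental results**
The authors present several experimental results to demonstrate the effectiveness of the method. The results show that the method successfully produces an optimal controller for continuous-time nonlinear systems.

**Conclusion**
By combining a precompensator with ADP, the method generates an optimal controller for continuous-time nonlinear systems without a system model. It has broad application prospects and can be applied to the control of a variety of complex systems.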
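For orientation, the HJB equation mentioned in the summary can be written out in its textbook form. The setting below (input-affine dynamics, a state cost $Q(x)$, a quadratic input penalty with weight $R$) is an assumption chosen for illustration; the excerpt does not state the paper's exact cost function or notation.

```latex
% Assumed setting (not stated in the excerpt): input-affine dynamics, quadratic input penalty
\dot{x} = f(x) + g(x)\,u, \qquad
V(x_0) = \int_0^{\infty} \bigl( Q(x) + u^{\top} R\, u \bigr)\, dt

% HJB equation for the optimal value function V^*
0 = \min_{u} \Bigl[ Q(x) + u^{\top} R\, u
      + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr]

% Minimizing control
u^{*}(x) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x)
```

In this form the optimal control depends explicitly on the input matrix $g(x)$. This is consistent with the summary's point that some extra device, here the precompensator, is needed before the controller can be computed without a model.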

Model-free optimal controller design for continuous-time nonlinear systems by adaptive dynamic programming based on a precompensator

Jilie Zhang (a,b), Huaguang Zhang, Zhenwei Liu, Yingchun Wang

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110819, PR China

School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, PR China

article info

Article history:

Received 22 October 2013

Received in revised form 29 June 2014

Accepted 31 August 2014

Available online 20 February 2015

This paper was recommended for publication by Dr. Q.-G. Wang

Keywords:

Model-free controller

Optimal control

Precompensator

Adaptive dynamic programming

abstract

In this paper, we consider the problem of developing a controller for continuous-time nonlinear systems whose governing equations are unknown. Using measurements only, we present two new online implementation schemes, based on adaptive dynamic programming (ADP), that synthesize a controller without building or assuming a model of the system. To circumvent the need for prior knowledge of the system, a precompensator is introduced to construct an augmented system. The corresponding Hamilton–Jacobi–Bellman (HJB) equation is solved by adaptive dynamic programming, which combines the least-squares technique, a neural network approximator and the policy iteration (PI) algorithm. The main idea of our method is to sample the state, state derivative and input, and to use these samples to update the weights of the neural network by the least-squares technique. The update process is implemented in the framework of PI. Finally, several examples are given to illustrate the effectiveness of our schemes.
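The loop described in the abstract (sample state, state derivative and input; fit critic weights by least squares; improve the policy) can be sketched on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the precompensator is taken to be a pure integrator, du/dt = v; the plant is a scalar linear system whose coefficients `A_TRUE` and `B_TRUE` are hypothetical and are used only to simulate measurements; the critic is a three-term quadratic basis rather than a neural network; and states are sampled at random instead of along a trajectory.

```python
import numpy as np

# Hypothetical plant dx/dt = a*x + b*u. These values only generate
# "measurements"; the learning code below never reads them.
A_TRUE, B_TRUE = -1.0, 1.0

Q = np.eye(2)  # assumed state weight on the augmented state z = [x, u]
R = 1.0        # assumed weight on the new input v


def measure_derivative(z, v):
    """Simulated sensor: derivative of the augmented state z = [x, u].

    The second component is the assumed integrator precompensator du/dt = v.
    """
    x, u = z
    return np.array([A_TRUE * x + B_TRUE * u, v])


def evaluate_policy(K, rng, n_samples=50):
    """Least-squares fit of critic weights for the policy v = -K z.

    Critic: V(z) = w0*x^2 + w1*x*u + w2*u^2. Each sample contributes one
    linear equation  grad V(z) . dz/dt = -(z'Qz + R v^2).
    """
    rows, targets = [], []
    for _ in range(n_samples):
        z = rng.uniform(-1.0, 1.0, size=2)
        v = -K @ z
        x, u = z
        xd, ud = measure_derivative(z, v)
        rows.append([2 * x * xd, u * xd + x * ud, 2 * u * ud])
        targets.append(-(z @ Q @ z + R * v**2))
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return w


def improve_policy(w):
    """v = -(1/2) R^-1 G' grad V with the KNOWN input matrix G = [0, 1]'."""
    return np.array([w[1], 2 * w[2]]) / (2 * R)


def policy_iteration(K0, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    K = np.asarray(K0, dtype=float)
    for _ in range(n_iter):
        w = evaluate_policy(K, rng)
        K = improve_policy(w)
    return w, K


# K0 = [0, 1] stabilizes the augmented system for this stable toy plant.
w, K = policy_iteration([0.0, 1.0])
P = np.array([[w[0], w[1] / 2], [w[1] / 2, w[2]]])  # V(z) = z' P z
```

Under the integrator assumption, the augmented system's input matrix is the constant `[0, 1]'`, so the policy improvement step needs no knowledge of the plant, and the measured state derivative stands in for the unknown dynamics in the evaluation step. For a linear plant this reduces to a data-driven form of Kleinman's iteration, which makes the result easy to check against the algebraic Riccati equation.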

1. Introduction

The optimal control problem is ubiquitous in the real world (see, e.g., [1–4]) and is one of the most important problems in the control community. With the recent rise of adaptive dynamic programming (ADP), which combines adaptive control [5–7] with dynamic programming, ADP-based methods have been developed in a variety of areas for controlling systems whose model is available, such as [8–16]. However, these control schemes are suited only to systems whose dynamics can be characterized precisely. When the plant's dynamics are poorly modeled, the controllers cannot provide satisfactory responses. This motivates the development of a control procedure that does not require a model of the underlying system. Therefore, some model-free or partially model-free schemes [17–22] have been studied in recent years.

The controller designed in [17] is model-free, but it is not optimal. The controllers designed in [18] and [19] address the case where the internal dynamics of the system are unknown but the control matrix is required; they are therefore referred to as partially model-free controllers. In contrast, the controllers designed in [20,21] and [22] require no a priori knowledge of the system. Although these methods are effective for designing model-free control, they have some restrictions and disadvantages. For instance, [20] and [21] design model-free control for linear discrete-time and linear continuous-time systems, respectively, so they are restricted to linear systems and are difficult to apply to nonlinear problems. The scheme in [22] designs the model-free control by identifying the system parameters, but the identification process is known to respond slowly to parameter variations in the plant, which reduces the response speed of the designed optimal control.

These restrictions and disadvantages motivate our research on model-free control for continuous-time nonlinear systems. Owing to the emergence of computers, most continuous-time systems can be addressed in a discrete version. Here, a precompensator [23] is employed to eliminate the dependence on prior knowledge of the system, as in [24,25]. Based on these ideas, we design a fully model-free control for continuous-time nonlinear systems by two schemes, rather than by identifying the system. One is to solve the corresponding Hamilton–Jacobi–Bellman (HJB) equation for the augmented system with a precompensator, in discrete version, by adaptive dynamic programming, which consists of the least-squares technique, a neural network approximator (such as [26]) and policy iteration (PI)


journal homepage: www.elsevier.com/locate/isatrans

ISA Transactions

http://dx.doi.org/10.1016/j.isatra.2014.08.018

Corresponding author.

E-mail addresses: jilie0226@163.com (J. Zhang), hgzhang@ieee.org (H. Zhang), jzlzw@sina.com (Z. Liu), drwangyc@gmail.com (Y. Wang).

ISA Transactions 57 (2015) 63–70
