$$J(\eta) = \begin{pmatrix} J_1(\eta) & 0_{3\times 3} \\ 0_{3\times 3} & J_2(\eta) \end{pmatrix} \qquad (2)$$
$$J_1(\eta) = \begin{pmatrix} \cos\psi\cos\theta & a_{11} & a_{12} \\ \sin\psi\cos\theta & a_{21} & a_{22} \\ -\sin\theta & \cos\theta\sin\phi & \cos\theta\cos\phi \end{pmatrix} \qquad (3)$$
$$J_2(\eta) = \begin{pmatrix} 1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi\sec\theta & \cos\phi\sec\theta \end{pmatrix} \qquad (4)$$
where $a_{11} = \cos\psi\sin\theta\sin\phi - \sin\psi\cos\phi$; $a_{12} = \cos\psi\sin\theta\cos\phi + \sin\psi\sin\phi$; $a_{21} = \sin\psi\sin\theta\sin\phi + \cos\psi\cos\phi$; $a_{22} = \sin\psi\sin\theta\cos\phi - \cos\psi\sin\phi$.
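For concreteness, the following Python sketch assembles the kinematic transformation of Equations (2)-(4) from the Euler angles in $\eta$. The function names and the assumed ordering of $\eta$ (position first, then $\phi, \theta, \psi$) are illustrative conventions, not taken from the paper.

```python
import numpy as np

def J1(phi, theta, psi):
    """Rotation matrix of Equation (3), with the a_ij entries defined above."""
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cpsi, spsi = np.cos(psi), np.sin(psi)
    return np.array([
        [cpsi * cth, cpsi * sth * sphi - spsi * cphi, cpsi * sth * cphi + spsi * sphi],
        [spsi * cth, spsi * sth * sphi + cpsi * cphi, spsi * sth * cphi - cpsi * sphi],
        [-sth,       cth * sphi,                      cth * cphi],
    ])

def J2(phi, theta):
    """Angular-velocity transformation of Equation (4); singular at theta = +/-pi/2."""
    cphi, sphi = np.cos(phi), np.sin(phi)
    tth, sec = np.tan(theta), 1.0 / np.cos(theta)
    return np.array([
        [1.0, sphi * tth, cphi * tth],
        [0.0, cphi,       -sphi],
        [0.0, sphi * sec, cphi * sec],
    ])

def J(eta):
    """Block-diagonal J(eta) of Equation (2); eta[3:6] = (phi, theta, psi) assumed."""
    phi, theta, psi = eta[3:6]
    out = np.zeros((6, 6))
    out[:3, :3] = J1(phi, theta, psi)
    out[3:, 3:] = J2(phi, theta)
    return out
```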
The dynamic model is established via Newton's laws:

$$[M]\dot{\xi} + [C(\xi)]\xi + [D(\xi)]\xi + g(\eta) = \tau \qquad (5)$$
where $[M] \in \mathbb{R}^{6\times 6}$ is the inertia matrix, whose inverse is $[M]^{-1}$; $[C(\xi)] \in \mathbb{R}^{6\times 6}$ is the Coriolis and centripetal matrix; $[D(\xi)] \in \mathbb{R}^{6\times 6}$ is the damping matrix; $g(\eta) \in \mathbb{R}^{6}$ is the gravity and buoyancy forces vector; and $\tau \in \mathbb{R}^{6}$ is the generalized thrust force vector.
Equation (5) can be transformed as follows:

$$\dot{\xi} = [M]^{-1}\left(-[C(\xi)]\xi - [D(\xi)]\xi - g(\eta) + \tau\right) \qquad (6)$$
2.2. Problem formulation
We suppose that the sampling time is very short. According to Equation (6), the discrete-time dynamic system is described as follows:
$$\xi(k+1) = f(\xi(k)) + \iota(\xi(k))u(\xi(k)) \qquad (7)$$
where $u(\xi(k))$ is the system control input. For the optimal tracking control problem, the control objective is to find an optimal control $u^{*}(\xi(k))$ that makes the state of Equation (7) track the desired trajectory $\xi_d(k)$. For simplicity, $u(\xi(k))$ is written as $u(k)$.
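As a minimal sketch of this discretization, the step below applies forward Euler with sampling time $T$ to Equation (6); `M_inv`, `C`, `D`, and `g_eta` are placeholder arrays and callables standing in for the model matrices, and the resulting map has exactly the form of Equation (7) with $\iota(\xi) = T[M]^{-1}$.

```python
import numpy as np

def discrete_step(xi, u, T, M_inv, C, D, g_eta):
    """One forward-Euler step of Equation (6), yielding the form of Equation (7):
    f(xi)    = xi + T * M_inv @ (-C(xi) @ xi - D(xi) @ xi - g_eta)
    iota(xi) = T * M_inv   (state-independent here; kept general in the text).
    """
    xi_dot = M_inv @ (-C(xi) @ xi - D(xi) @ xi - g_eta + u)
    return xi + T * xi_dot
```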
The tracking error is defined as:

$$e(k) = \xi(k) - \xi_d(k) \qquad (8)$$
The control input error is described as:

$$\nu(k) = u(k) - u_d(k) \qquad (9)$$
$$u_d(k) = \iota^{-1}(\xi_d(k))\left(\xi_d(k+1) - f(\xi_d(k)) - [M]^{-1}g\right) \qquad (10)$$
where $u_d(k)$ is the desired control input, introduced for analytical purposes. Substituting Equations (8), (9), and (10) into Equation (7), the tracking-error system is obtained as follows:
$$\begin{aligned} e(k+1) ={}& f(e(k) + \xi_d(k)) + \iota(e(k) + \xi_d(k))\,\iota^{-1}(\xi_d(k))\left(\xi_d(k+1) - f(\xi_d(k))\right) \\ &- \xi_d(k+1) + \iota(e(k) + \xi_d(k))\,\nu(k) \end{aligned} \qquad (11)$$
where $u_d(k)$ and $u_d(k+1)$ are the desired control vectors.
Equation (11) can be represented as:

$$e(k+1) = F(e(k), \nu(k)) \qquad (12)$$

where $e(k)$ is the state vector and $\nu(k)$ is the control vector.
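A sketch of how this error system can be stepped numerically, assuming $f$ and $\iota$ are callables for the discrete model of Equation (7) and that $\iota(\xi_d(k))$ is invertible. Rather than expanding Equation (11), it applies the equivalent substitution $u(k) = \nu(k) + u_d(k)$ directly, with $u_d(k)$ computed per Equation (10) (including its $[M]^{-1}g$ term).

```python
import numpy as np

def desired_control(xi_d_k, xi_d_next, f, iota, M_inv, g_eta):
    """Desired control u_d(k) of Equation (10)."""
    return np.linalg.solve(iota(xi_d_k), xi_d_next - f(xi_d_k) - M_inv @ g_eta)

def error_step(e_k, nu_k, xi_d_k, xi_d_next, f, iota, M_inv, g_eta):
    """Error dynamics e(k+1) = F(e(k), nu(k)) of Equations (11)-(12),
    obtained as xi(k+1) - xi_d(k+1) with u(k) = nu(k) + u_d(k)."""
    u_k = nu_k + desired_control(xi_d_k, xi_d_next, f, iota, M_inv, g_eta)
    xi_k = e_k + xi_d_k
    return f(xi_k) + iota(xi_k) @ u_k - xi_d_next
```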
According to Bellman's optimality principle, the optimal performance index function is defined as:

$$J^{*}(e(k)) = \min_{\nu(k)} \sum_{i=k}^{\infty} \gamma^{\,i-k}\, U(e(i), \nu(i)) \qquad (13)$$
where $U(e(k), \nu(k)) = e^{T}(k)Qe(k) + \nu^{T}(k)R\nu(k)$ is the utility function, $\gamma$ is the discount factor with $0 < \gamma \le 1$, and $Q$ and $R$ are symmetric positive-definite matrices.
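A small sketch of evaluating this index numerically: the infinite sum in Equation (13) is truncated at a hypothetical horizon `N` (not from the paper), which is a reasonable approximation when $\gamma < 1$.

```python
import numpy as np

def utility(e, nu, Q, R):
    """Quadratic utility U(e, nu) = e^T Q e + nu^T R nu."""
    return e @ Q @ e + nu @ R @ nu

def discounted_cost(e0, policy, F, Q, R, gamma, N=500):
    """Truncated evaluation of Equation (13) under a fixed policy nu = policy(e);
    F(e, nu) is the error dynamics of Equation (12)."""
    J, e = 0.0, e0
    for i in range(N):
        nu = policy(e)
        J += (gamma ** i) * utility(e, nu, Q, R)
        e = F(e, nu)
    return J
```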
In other words, $J^{*}(e(k))$ satisfies the discrete-time HJB equation. Therefore, the optimal control law can be expressed as:

$$\nu^{*}(k) = \arg\min_{\nu(k)} \left\{ U(e(k), \nu(k)) + \gamma J^{*}(e(k+1)) \right\} \qquad (14)$$
Bellman's principle yields a backwards-in-time procedure for solving the optimal control problem, because the optimal policy at time $k+1$ must already be known in Equation (14) in order to determine the optimal policy at time $k$. Consequently, it is often computationally untenable to run true dynamic programming, owing to the backward numerical process required for its solution. To overcome this difficulty, approximating the performance index function is proposed.
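This difficulty is what the policy-iteration scheme derived in the next section addresses: instead of a backward sweep, it alternates forward-in-time policy evaluation with greedy improvement on Equation (14). A schematic sketch, with `evaluate` and `improve` as placeholder routines (not from the paper):

```python
def policy_iteration(J_hat, nu_hat, evaluate, improve, n_iter=50):
    """Alternate policy evaluation and policy improvement.

    evaluate(nu, J): fit J(e) to U(e, nu(e)) + gamma * J(F(e, nu(e)))   (Bellman eq.)
    improve(J):      return nu(e) = argmin_nu {U(e, nu) + gamma * J(F(e, nu))}  (Eq. 14)
    """
    for _ in range(n_iter):
        J_hat = evaluate(nu_hat, J_hat)
        nu_hat = improve(J_hat)
    return J_hat, nu_hat
```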
3. Trajectory-tracking control based on ADP algorithm
3.1. Derivation of the policy iteration ADP algorithm
In this section, the policy iteration ADP algorithm is presented, in which the value function and the control law are updated iteratively. First, we start with an initial admissible control law $\hat{\nu}_0(k)$ and let $\hat{J}_0(e(k+1))$ satisfy the HJB equation: