Dynamic Programming Principle and Reflected BSDEs in Stochastic Recursive Optimal Control


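This paper studies a class of stochastic recursive optimal control problems in which the cost functional is defined through the solution of a reflected backward stochastic differential equation (RBSDE). For this class of problems, the authors Wu Zhen and Yu Zhiyong establish a dynamic programming principle and show how the value function relates to a Hamilton-Jacobi-Bellman (HJB) equation.

In stochastic control theory, the dynamic programming principle states that an optimal control problem on [t, T] can be split at an intermediate time: one optimizes over a short horizon and uses the value function at the end of that horizon as the new terminal reward. The paper considers the case with an obstacle constraint, which makes the optimization harder because the cost process must be optimized while staying on the admissible side of the obstacle.

RBSDEs play a key role in such problems. They capture the randomness of the environment, and the reflection mechanism keeps the solution from crossing the obstacle: an additional increasing process pushes the solution back whenever it reaches the barrier, which is where the name "reflected" comes from. Constraints of this type model physical limits or market rules, for example a guaranteed floor on a value process.

The HJB equation is the partial differential equation that characterizes the value function of an optimal control problem. It is named after Hamilton and Jacobi, whose equation arises in classical mechanics, and Bellman, who derived it for control problems through dynamic programming; in the stochastic case it becomes a second-order equation. When the solution is sufficiently regular, the optimizer in the Hamiltonian yields a candidate optimal feedback control.

The main contributions are as follows. The paper first establishes the dynamic programming principle for this class of stochastic recursive optimal control problems, and then proves that the value function is the unique viscosity solution of the obstacle problem for the corresponding HJB equation. Viscosity solutions are a notion of weak solution for nonlinear partial differential equations; they are needed here because the value function is in general not smooth enough to satisfy the HJB equation in the classical sense.

Overall, this first-release paper connects stochastic recursive optimal control with RBSDEs, the dynamic programming principle, and HJB equations, and provides a theoretical framework for this class of stochastic optimization problems. The results contribute to optimization theory and supply mathematical tools for applications such as financial engineering and economic decision-making.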
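For orientation, obstacle problems for HJB equations of this kind are commonly written in the form below. This is a sketch under assumptions, not a formula quoted from the excerpt: it assumes that the value function $u$ is a supremum over controls and that the obstacle $h$ acts from below, and it borrows the coefficient names $b$, $\sigma$, $g$, $h$, $\Phi$ that appear in the excerpt that follows.

```latex
\begin{cases}
\min\Big\{ u(t,x) - h(t,x),\;
  -\partial_t u(t,x) - \sup_{v \in U} \big[ \mathcal{L}^v u(t,x)
  + g\big(t, x, u(t,x), \nabla u(t,x)\,\sigma(t,x,v), v\big) \big] \Big\} = 0,
  & (t,x) \in [0,T) \times \mathbb{R}^n, \\[4pt]
u(T,x) = \Phi(x), & x \in \mathbb{R}^n,
\end{cases}
\qquad
\mathcal{L}^v u = \tfrac{1}{2} \operatorname{tr}\big( \sigma\sigma^{\top}(t,x,v)\, D^2 u \big)
  + \langle b(t,x,v), \nabla u \rangle .
```

The first argument of the minimum enforces $u \ge h$; the second is the usual HJB operator, which must vanish wherever $u$ lies strictly above the obstacle.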

An element of $\mathcal{U}$ is called an admissible control. Here $U$ is a compact subset of $\mathbb{R}^k$. Compactness is a restriction; however, this restriction is often satisfied in practical applications.

For a given admissible control, we consider the following control system

$$
\begin{cases}
dX_s^{t,\zeta;v} = b(s, X_s^{t,\zeta;v}, v_s)\,ds + \sigma(s, X_s^{t,\zeta;v}, v_s)\,dW_s, \quad s \in [t, T],\\
X_t^{t,\zeta;v} = \zeta,
\end{cases}
\tag{3.1}
$$

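where $t \ge 0$ is regarded as the initial time, and $\zeta \in L^2(\Omega, \mathcal{F}_t, P; \mathbb{R}^n)$ as the initial state. The mappings

$$
b : [0, T] \times \mathbb{R}^n \times U \to \mathbb{R}^n, \qquad \sigma : [0, T] \times \mathbb{R}^n \times U \to \mathbb{R}^{n \times d}
$$

satisfy the following conditions: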

(H3.1) b and σ are continuous in t;

(H3.2) for some $L > 0$, and all $x, x' \in \mathbb{R}^n$, $v, v' \in U$, a.s.

$$
|b(t, x, v) - b(t, x', v')| + |\sigma(t, x, v) - \sigma(t, x', v')| \le L\big(|x - x'| + |v - v'|\big).
$$

Obviously, under the above assumptions, for any $v(\cdot) \in \mathcal{U}$, control system (3.1) has a unique strong solution $\{X_s^{t,\zeta;v},\ 0 \le t \le s \le T\}$, and we also have the following estimates.

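Proposition 3.1 For all $t \in [0, T]$, $\zeta, \zeta' \in L^2(\Omega, \mathcal{F}_t, P; \mathbb{R}^n)$, $v(\cdot), v'(\cdot) \in \mathcal{U}$,

$$
E^{\mathcal{F}_t}\Big( \sup_{t \le s \le T} \big|X_s^{t,\zeta;v}\big|^2 \Big) \le C\big(1 + |\zeta|^2\big); \tag{3.2}
$$

$$
E^{\mathcal{F}_t}\Big( \sup_{t \le s \le T} \big|X_s^{t,\zeta;v} - X_s^{t,\zeta';v'}\big|^2 \Big) \le C|\zeta - \zeta'|^2 + C\,E^{\mathcal{F}_t} \int_t^T |v_r - v'_r|^2\,dr, \tag{3.3}
$$

where the constant $C$ depends on $L$, $T$ and the compact set $U$.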
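The control system (3.1) and the continuous dependence on the initial state expressed by (3.3) can be explored numerically. The following is a minimal Euler-Maruyama sketch, not the authors' method. The coefficients `b`, `sigma`, the feedback rule `control`, and the choice U = [-1, 1]^n are hypothetical examples chosen to satisfy a Lipschitz condition like (H3.2); a feedback control is only one special kind of admissible control.

```python
import numpy as np

def euler_maruyama(b, sigma, control, t, zeta, T, n_steps, rng):
    """Simulate one Euler-Maruyama path of
    dX = b(s, X, v) ds + sigma(s, X, v) dW on [t, T] with X_t = zeta."""
    dt = (T - t) / n_steps
    times = t + dt * np.arange(n_steps + 1)
    x0 = np.atleast_1d(np.asarray(zeta, dtype=float))
    path = np.empty((n_steps + 1, x0.size))
    path[0] = x0
    # Dimension d of the Brownian motion, read off the shape of sigma (n x d).
    d = sigma(t, x0, control(t, x0)).shape[1]
    for i in range(n_steps):
        s, x = times[i], path[i]
        v = control(s, x)
        dW = rng.normal(0.0, np.sqrt(dt), size=d)
        path[i + 1] = x + b(s, x, v) * dt + sigma(s, x, v) @ dW
    return times, path

# Hypothetical coefficients (assumptions for illustration only).
def b(s, x, v):
    return -x + v                      # Lipschitz in (x, v)

def sigma(s, x, v):
    return 0.3 * np.eye(x.size)        # constant diffusion, n = d

def control(s, x):
    return np.clip(-x, -1.0, 1.0)      # values in U = [-1, 1]^n

# Two initial states driven by the same Brownian increments (same seed).
zeta1, zeta2 = np.array([1.0, 0.0]), np.array([1.1, 0.0])
_, p1 = euler_maruyama(b, sigma, control, 0.0, zeta1, 1.0, 100, np.random.default_rng(1))
_, p2 = euler_maruyama(b, sigma, control, 0.0, zeta2, 1.0, 100, np.random.default_rng(1))
sup_sq_gap = np.max(np.sum((p1 - p2) ** 2, axis=1))
print("sup |X - X'|^2 =", sup_sq_gap, " |zeta - zeta'|^2 =", np.sum((zeta1 - zeta2) ** 2))
```

Reusing the seed couples the two paths through the same noise, so the printed gap isolates the effect of the initial state, which is the quantity bounded by the first term on the right-hand side of (3.3).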

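Proposition 3.2 For all $t \in [0, T]$, $x \in \mathbb{R}^n$, $v(\cdot) \in \mathcal{U}$, $\delta \in [0, T - t]$,

$$
E^{\mathcal{F}_t}\Big( \sup_{t \le s \le t + \delta} \big|X_s^{t,x;v} - x\big|^2 \Big) \le C\delta, \tag{3.4}
$$

where the constant $C$ depends on $x$, $L$ and the compact set $U$.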
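The linear-in-δ scaling in estimate (3.4) can be checked by Monte Carlo. The sketch below is an illustration under stated assumptions, not part of the paper: it uses a hypothetical scalar system dX = (-X + v) ds + 0.5 dW with the constant control v = 1, a deterministic initial state x (so the expectation is an ordinary one), and an Euler-Maruyama discretization.

```python
import numpy as np

def sup_sq_deviation(x, delta, n_paths=5000, n_steps=200, seed=0):
    """Monte Carlo estimate of E[ sup_{t<=s<=t+delta} |X_s - x|^2 ] for the
    hypothetical SDE dX = (-X + 1) ds + 0.5 dW started at X_t = x."""
    rng = np.random.default_rng(seed)
    dt = delta / n_steps
    X = np.full(n_paths, float(x))
    running_sup = np.zeros(n_paths)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        X = X + (-X + 1.0) * dt + 0.5 * dW           # Euler-Maruyama step
        running_sup = np.maximum(running_sup, (X - x) ** 2)
    return running_sup.mean()

# If the estimate scales like C * delta, the ratio est / delta stays bounded.
ratios = {}
for delta in (0.01, 0.05, 0.2):
    est = sup_sq_deviation(x=0.0, delta=delta)
    ratios[delta] = est / delta
    print(f"delta={delta:<5} estimate={est:.5f} ratio={ratios[delta]:.3f}")
```

A ratio that stays of order one as δ shrinks is consistent with (3.4); it does not prove the bound, and the observed constant is specific to the example coefficients.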

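Now for any given admissible control $v(\cdot) \in \mathcal{U}$, we consider the following reflected BSDE

$$
Y_s^{t,\zeta;v} = \Phi\big(X_T^{t,\zeta;v}\big) + \int_s^T g\big(r, X_r^{t,\zeta;v}, Y_r^{t,\zeta;v}, Z_r^{t,\zeta;v}, v_r\big)\,dr + K_T^{t,\zeta;v} - K_s^{t,\zeta;v} - \int_s^T Z_r^{t,\zeta;v}\,dW_r, \quad t \le s \le T,
\tag{3.5}
$$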

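where

$$
\Phi = \Phi(x) : \mathbb{R}^n \to \mathbb{R}, \qquad h = h(t, x) : [0, T] \times \mathbb{R}^n \to \mathbb{R},
$$

$$
g = g(t, x, y, z, v) : [0, T] \times \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^d \times U \to \mathbb{R}
$$

satisfy the following conditions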

http://www.paper.edu.cn
