1344 J. Sirignano, K. Spiliopoulos / Journal of Computational Physics 375 (2018) 1339–1364
\[
\tilde{G}_1(\theta_n, s_n) := \tilde{G}_{1,a}(\theta_n, s_n) + \tilde{G}_{1,b}(\theta_n, s_n), \tag{3.2}
\]
\[
\begin{aligned}
\tilde{G}_{1,a}(\theta_n, s_n) := {}& \bigg( \frac{\partial f}{\partial t}(t_n, x_n; \theta_n) + L_1 f(t_n, x_n; \theta_n) + \frac{1}{2\Delta} \sum_{i=1}^{d} \Big[ \frac{\partial f}{\partial x_i}\big(t_n, x_n + \sigma(x_n) W_{\Delta}; \theta_n\big) - \frac{\partial f}{\partial x_i}(t_n, x_n; \theta_n) \Big] \sigma_i(x_n) W_{\Delta}^i \bigg) \\
& \times \nabla_{\theta} \bigg( \frac{\partial f}{\partial t}(t_n, x_n; \theta_n) + L_1 f(t_n, x_n; \theta_n) + \frac{1}{2\Delta} \sum_{i=1}^{d} \Big[ \frac{\partial f}{\partial x_i}\big(t_n, x_n + \sigma(x_n) \tilde{W}_{\Delta}; \theta_n\big) - \frac{\partial f}{\partial x_i}(t_n, x_n; \theta_n) \Big] \sigma_i(x_n) \tilde{W}_{\Delta}^i \bigg),
\end{aligned}
\]
\[
\begin{aligned}
\tilde{G}_{1,b}(\theta_n, s_n) := {}& \bigg( \frac{\partial f}{\partial t}(t_n, x_n; \theta_n) + L_1 f(t_n, x_n; \theta_n) - \frac{1}{2\Delta} \sum_{i=1}^{d} \Big[ \frac{\partial f}{\partial x_i}\big(t_n, x_n - \sigma(x_n) W_{\Delta}; \theta_n\big) - \frac{\partial f}{\partial x_i}(t_n, x_n; \theta_n) \Big] \sigma_i(x_n) W_{\Delta}^i \bigg) \\
& \times \nabla_{\theta} \bigg( \frac{\partial f}{\partial t}(t_n, x_n; \theta_n) + L_1 f(t_n, x_n; \theta_n) - \frac{1}{2\Delta} \sum_{i=1}^{d} \Big[ \frac{\partial f}{\partial x_i}\big(t_n, x_n - \sigma(x_n) \tilde{W}_{\Delta}; \theta_n\big) - \frac{\partial f}{\partial x_i}(t_n, x_n; \theta_n) \Big] \sigma_i(x_n) \tilde{W}_{\Delta}^i \bigg).
\end{aligned}
\]
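The estimator above relies on the identity that, for W_Δ ~ N(0, Δ), the quantity (1/(2Δ)) E[(∂f/∂x(x + σ(x)W_Δ) − ∂f/∂x(x)) σ(x)W_Δ] recovers the second-derivative term ½ σ(x)² ∂²f/∂x². A short one-dimensional sanity check of this identity (the choices f(x) = sin x and σ = 0.5, and all numerical values, are illustrative and not from the paper):

```python
import numpy as np

# Illustrative 1-d check of the Monte Carlo identity behind Eq. (3.2):
#   (1/(2*Delta)) * E[(f'(x + sigma*W) - f'(x)) * sigma * W] ~ (1/2) * sigma^2 * f''(x),
# with W ~ N(0, Delta). Here f(x) = sin(x) and sigma = 0.5 are arbitrary choices.
rng = np.random.default_rng(0)

def fprime(x):
    # f'(x) for f(x) = sin(x)
    return np.cos(x)

x, sigma, delta = 0.7, 0.5, 1e-4
W = rng.normal(0.0, np.sqrt(delta), size=2_000_000)  # Brownian increments, variance Delta

estimate = np.mean((fprime(x + sigma * W) - fprime(x)) * sigma * W) / (2.0 * delta)
exact = 0.5 * sigma**2 * (-np.sin(x))  # (1/2) sigma^2 f''(x), with f''(x) = -sin(x)
print(abs(estimate - exact))           # small absolute error
```

A Taylor expansion of f' inside the expectation shows the bias of this estimator is O(Δ), consistent with the discussion below.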
The approximation (3.2) has O(Δ) bias as an approximation for ∇_θ G_1(θ_n, s_n). Eq. (3.2) uses antithetic variates in the sense that G̃_{1,a}(θ_n, s_n) uses the random variables (W_Δ, W̃_Δ) while G̃_{1,b}(θ_n, s_n) uses (−W_Δ, −W̃_Δ). See [1] for a background on antithetic variates in simulation algorithms. A Taylor expansion can be used to show the approximation error is O(Δ). It is important to highlight that there is no computational cost associated with the magnitude of Δ; an arbitrarily small Δ can be chosen with no additional computational cost (although there may be numerical underflow or overflow problems). The modified algorithm using the Monte Carlo approximation for the second derivatives is:
1. Generate random points (t_n, x_n) from [0, T] × Ω and (τ_n, z_n) from [0, T] × ∂Ω according to their respective densities ν_1 and ν_2. Also, draw the random point w_n from Ω with density ν_3.
2. Calculate the step G̃(θ_n, s_n) = G̃_1(θ_n, s_n) + ∇_θ G_2(θ_n, s_n) + ∇_θ G_3(θ_n, s_n) at the randomly sampled points s_n = {(t_n, x_n), (τ_n, z_n), w_n}. G̃(θ_n, s_n) is an approximation for ∇_θ G(θ_n, s_n).
3. Take a step at the random point s_n: θ_{n+1} = θ_n − α_n G̃(θ_n, s_n).
4. Repeat until convergence criterion is satisfied.
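The four steps above can be sketched in code. The sketch below substitutes a toy scalar objective G(θ) = E[(θ − s)²] for the PDE objective, so only the sampling-and-descent structure carries over; all names and numerical values are illustrative, not from the paper:

```python
import numpy as np

# Sketch of steps 1-4 with a toy objective G(theta) = E[(theta - s)^2], s ~ N(2, 1).
# The real algorithm would instead evaluate PDE residual gradients at the sampled points.
rng = np.random.default_rng(1)

theta = 5.0      # initial parameter
target = 2.0     # minimizer of the toy objective
for n in range(2000):
    s_n = rng.normal(target, 1.0)      # step 1: sample a random point
    G_tilde = 2.0 * (theta - s_n)      # step 2: stochastic gradient estimate at s_n
    alpha_n = 1.0 / (10.0 + n)         # decreasing learning-rate schedule
    theta = theta - alpha_n * G_tilde  # step 3: descent step at the random point
    # step 4: in practice, loop until a convergence criterion is satisfied
print(theta)
```

The decreasing learning-rate schedule is what lets the i.i.d. sampling noise average out over many iterations, as discussed next.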
In conclusion, the modified algorithm here is computationally less expensive than the original algorithm in Section 2, but it introduces some bias and variance. The variance essentially increases the i.i.d. noise in the stochastic gradient descent step; this noise, however, averages out over a large number of samples. The original algorithm in Section 2 is unbiased and has lower
variance, but is computationally more expensive. We numerically implement the algorithm for a class of free boundary PDEs
in Section 4. Future research may investigate other methods to further improve the computational evaluation of the second
derivative terms (for instance, multi-level Monte Carlo).
4. Numerical analysis for a high-dimensional free boundary PDE
We test our algorithm on a class of high-dimensional free boundary PDEs. These free boundary PDEs are used in finance
to price American options and are often referred to as “American option PDEs”. An American option is a financial derivative
on a portfolio of stocks. The option owner may at any time t ∈[0, T ] choose to exercise the American option and receive
a payoff which is determined by the underlying prices of the stocks in the portfolio. T is called the maturity date of the
option and the payoff function is g(x) : R^d → R. Let X_t ∈ R^d be the prices of d stocks. If at time t the stock prices are X_t = x, the price of the option is u(t, x). The price function u(t, x) satisfies a free boundary PDE on [0, T] × R^d. For American options, one is primarily interested in the solution u(0, X_0) since this is the fair price to buy or sell the option.
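As a concrete, one-dimensional illustration of this pricing problem, fixing the exercise rule to "hold until maturity" (τ = T) turns the pricing supremum into a plain expectation that Monte Carlo can estimate; the result is the European price, which lower-bounds the American price u(0, x). The put payoff, geometric Brownian motion dynamics, and all parameter values below are hypothetical choices for illustration:

```python
import numpy as np

# Illustrative 1-d sketch: with payoff g(x) = max(K - x, 0) and GBM dynamics
# dX_t = r X_t dt + sigma X_t dW_t, the fixed rule tau = T gives E[e^{-rT} g(X_T)],
# a lower bound on the American price u(0, x), which optimizes over exercise times.
rng = np.random.default_rng(2)
x0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # hypothetical parameters

Z = rng.standard_normal(1_000_000)
XT = x0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)  # exact GBM step
lower_bound = np.exp(-r * T) * np.mean(np.maximum(K - XT, 0.0))
print(lower_bound)  # European put value: a lower bound for the American put price
```

The American price exceeds this bound precisely because early exercise can be optimal for a put; capturing that optimality is what makes the free boundary PDE necessary.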
Besides
the high dimensions and the free boundary, the American option PDE is challenging to numerically solve since
the payoff function g(x) (which both appears in the initial condition and determines the free boundary) is not continuously
differentiable.
Section 4.1 states the free boundary PDE and the deep learning algorithm to solve it. To address the free boundary,
we supplement the algorithm presented in Section 2 with an iterative method; see Section 4.1. Section 4.2 describes the
architecture and implementation details for the neural network. Section 4.3 reports numerical accuracy for a case where a
semi-analytic solution exists. Section 4.4 reports numerical accuracy for a case where no semi-analytic solution exists.
4.1. The free boundary PDE
We now specify the free boundary PDE for u(t, x). The stock price dynamics and option price are:
\[
dX_t^i = \mu(X_t^i)\, dt + \sigma(X_t^i)\, dW_t^i,
\]
\[
u(t, x) = \sup_{\tau \geq t} \mathbb{E}\big[\, e^{-r(\tau \wedge T)}\, g(X_{\tau \wedge T}) \,\big|\, X_t = x \big],
\]