Neural machine-based forecasting 2905
dynamical system, Y is the future time series that needs
to be predicted based on the preceding time series his-
tory X.
Let

Z = Y|X

be the event that Y happens after X, and let P_m(Z; θ) be a family of probability distributions over the same parametric space, indexed by θ. In this paper, the authors use a deep recurrent neural network, parameterized by θ, as the surrogate model G(θ) to determine the conditional probability P_m(Z; θ) as an approximation to the true but unknown data-generating distribution P_d(Z). If the time series of event Z is drawn from a dynamical system with a certain initial condition, then the conditional probability

P_d(Z) ≡ 1

due to the determinism. However, in practice, P_m(Z; θ) can only be brought close to 1 by adjusting the value of θ, without necessarily achieving the above equality, especially for complex dynamical systems. To understand how one transforms a deterministic problem into a probabilistic one, there are two viewpoints to consider.
First, following the maximum likelihood principle
[15], the estimator for θ can be defined as
θ̃ = argmax_θ P_m(Z; θ),   (3a)
  = argmax_θ ∏_{k=1}^{r} P_m(Z_k; θ),   (3b)
where Z = {Z_k, k = 1, ..., r} are independent data sequences with batch size r, generated by the true but unknown P_d(Z). Eq. (3b) can be problematic in terms of numerical computation: the product runs over many probabilities that all lie between 0 and 1, so the computation is prone to numerical underflow. Hence, it is more convenient to take the logarithm of both sides of the equation. This results in the following equivalent optimization problem:
θ̃ = argmax_θ Σ_{k=1}^{r} log P_m(Z_k; θ).   (4)
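The underflow issue behind Eq. (3b), and why the logarithm in Eq. (4) resolves it, can be shown numerically. A minimal sketch (the batch size r = 1000 and the probability values are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# r model probabilities P_m(Z_k; theta), each strictly between 0 and 1
r = 1000
p = rng.uniform(0.1, 0.9, size=r)

# Direct product of Eq. (3b): the product of many values in (0, 1)
# shrinks below the smallest representable float64 and underflows to 0.0
likelihood = np.prod(p)
print(likelihood)

# Sum of logs of Eq. (4): stays a finite, usable objective
log_likelihood = np.sum(np.log(p))
print(log_likelihood)
```

The two objectives have the same maximizer in exact arithmetic, but only the log form survives finite-precision computation at realistic batch sizes.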
Typically, a large value of the batch size r gives a better estimate of θ, resulting in P_m(Z_k; θ̃) ≈ 1; the prediction of the future response based on this surrogate model is then more accurate. In reality, however, r is limited during the training stage, and the probability distribution represented by Z is an empirical data-generating distribution, labeled P̃_d(Z). As a result, Eq. (4) can be written as an expectation over the empirical distribution defined by the training dataset:
θ̃ = argmax_θ E_{Z∼P̃_d} log P_m(Z; θ).   (5)
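As a toy illustration of Eq. (5) (a hypothetical Gaussian model with unknown mean standing in for P_m, not the paper's recurrent network), maximizing the empirical expectation of the log-likelihood over a parameter grid recovers the familiar MLE, the sample mean:

```python
import numpy as np

rng = np.random.default_rng(1)

# Samples Z ~ P_d (here, a Gaussian with true mean 2.0 plays the data generator)
Z = rng.normal(loc=2.0, scale=1.0, size=500)

def mean_log_likelihood(theta, z, sigma=1.0):
    """Empirical expectation E_{Z ~ P_d~} log P_m(Z; theta) for a Gaussian model."""
    return np.mean(-0.5 * ((z - theta) / sigma) ** 2
                   - 0.5 * np.log(2 * np.pi * sigma ** 2))

# Maximize the empirical expectation over a grid of candidate theta (Eq. (5))
grid = np.linspace(0.0, 4.0, 4001)
theta_hat = grid[np.argmax([mean_log_likelihood(t, Z) for t in grid])]

# For a Gaussian, the MLE of the mean is the sample mean: the two agree
# up to the grid resolution
print(theta_hat, Z.mean())
```

A neural surrogate replaces the grid search with gradient ascent over θ, but the objective is the same empirical expectation.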
The second viewpoint is related to the Kullback–Leibler (KL) divergence [16], a measure of the distance between two probability distributions. The KL divergence between P̃_d, defined by the training dataset, and P_m, related to the surrogate model, is given by
D_KL = E_{Z∼P̃_d}[log P̃_d(Z) − log P_m(Z; θ)],   (6a)
     = E_{Z∼P̃_d} log P̃_d(Z) − E_{Z∼P̃_d} log P_m(Z; θ).   (6b)
The goal is to minimize D_KL by adjusting the model parameters in G(θ). The first term in Eq. (6b) is associated only with the probability of generating certain true time series and is not related to the surrogate model itself. Hence, the estimation of θ should come only from the second term, which is
θ̃ = argmin_θ (−E_{Z∼P̃_d} log P_m(Z; θ)).   (7)
Comparing with the maximum likelihood principle
from the first viewpoint, one can find that Eqs. (5)
and (7) are essentially the same.
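This equivalence can be checked numerically: since the first term of Eq. (6b) does not depend on θ, minimizing D_KL and minimizing the cross-entropy term of Eq. (7) alone select the same parameter. A small sketch with a hypothetical one-parameter softmax family (the distributions and the parameterization are illustrative assumptions):

```python
import numpy as np

# Empirical distribution P_d~ over three outcomes
p_d = np.array([0.5, 0.3, 0.2])

def p_m(theta):
    """A hypothetical one-parameter model family (softmax over fixed scores)."""
    logits = theta * np.array([1.0, 0.0, -1.0])
    e = np.exp(logits)
    return e / e.sum()

def kl(p, q):
    """D_KL(p || q) = E_p[log p - log q], as in Eq. (6a)."""
    return np.sum(p * (np.log(p) - np.log(q)))

def cross_entropy(p, q):
    """-E_p[log q]: the theta-dependent term of Eq. (6b), as in Eq. (7)."""
    return -np.sum(p * np.log(q))

thetas = np.linspace(-2.0, 2.0, 2001)
kl_min = thetas[np.argmin([kl(p_d, p_m(t)) for t in thetas])]
ce_min = thetas[np.argmin([cross_entropy(p_d, p_m(t)) for t in thetas])]

# Both criteria pick the same theta, since they differ only by the
# theta-independent entropy term E_{p_d} log p_d
print(kl_min, ce_min)
```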
2.2 Probability distributions and loss functions
Now, the authors are ready to discuss the relations between the surrogate model G(θ) and the conditional probability P_m. As mentioned earlier, G(θ) is a deep recurrent neural network, which in essence is the following mapping function:

G(X; θ) = Y.   (8)
Again, X and Y are the time series history and the future time series, sequentially generated from a certain dynamical system. In reality, the mapping output from the surrogate model is Ŷ = G(X; θ), which is an approximation to the true target value Y with certain types of associated errors. Here, three types of error distributions, corresponding to three different P_m(Y|X; θ) and loss functions, are considered, using a one-time-step univariate time series x and y without loss of generality.
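The interface of Eq. (8) can be sketched with a minimal, randomly initialized recurrent cell in NumPy (an illustrative assumption, not the paper's architecture; the hidden size, prediction horizon, and weight scales are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# theta: randomly initialized weights of a single-layer recurrent cell
hidden = 8
theta = {
    "W_xh": rng.normal(scale=0.3, size=(1, hidden)),       # input  -> hidden
    "W_hh": rng.normal(scale=0.3, size=(hidden, hidden)),  # hidden -> hidden
    "W_hy": rng.normal(scale=0.3, size=(hidden, 1)),       # hidden -> output
}

def G(X, theta, horizon=5):
    """Map a univariate history X to a predicted future Y_hat of given length."""
    h = np.zeros(hidden)
    for x in X:                              # encode the history into the state
        h = np.tanh(x * theta["W_xh"][0] + h @ theta["W_hh"])
    Y_hat, y = [], X[-1]
    for _ in range(horizon):                 # roll the cell forward in time
        h = np.tanh(y * theta["W_xh"][0] + h @ theta["W_hh"])
        y = float(h @ theta["W_hy"][:, 0])
        Y_hat.append(y)
    return np.array(Y_hat)

X = np.sin(np.linspace(0, 2 * np.pi, 50))    # a toy history
Y_hat = G(X, theta)                          # untrained approximation to Y
print(Y_hat.shape)
```

Training would adjust θ so that Ŷ = G(X; θ) maximizes the likelihood objective of the previous subsection; here the weights are left random purely to show the mapping.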