S = |Q_D - \mu_H| / \sigma_H    (2.1)
The significance is properly a dimensionless quantity, but it is natural to call the units of S "sigmas." Thus, one might
speak of a two sigma effect as not especially significant, but ten sigmas as extremely significant. If the distribution
of statistic values is gaussian (and numerical experiments indicate that this is often a reasonable approximation),
then the p-value associated with a significance S is given by p = erfc(S/\sqrt{2}); this is the probability of observing a
significance S or larger if the null hypothesis is true.
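As a sketch (not part of the original text), the significance and its gaussian-approximation p-value can be computed from a collection of surrogate statistics; the function name and the use of NumPy are illustrative assumptions:

```python
import math
import numpy as np

def significance(q_data, q_surrogates):
    # S = |Q_D - mu_H| / sigma_H, Eq. (2.1): mu_H and sigma_H are the
    # mean and standard deviation of the statistic over the surrogates.
    mu_h = np.mean(q_surrogates)
    sigma_h = np.std(q_surrogates, ddof=1)
    s = abs(q_data - mu_h) / sigma_h
    # Under the gaussian approximation, p = erfc(S / sqrt(2)).
    p = math.erfc(s / math.sqrt(2.0))
    return s, p

# A two sigma effect corresponds to p of roughly 0.05:
s, p = significance(4.0, [1.0, 2.0, 3.0])  # mu_H = 2, sigma_H = 1
```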
If computational effort really were not a consideration, then a more robust way to define significance would be
directly in terms of p-values with rank statistics. In particular, if the observed time series has a statistic which is in
the lower one percentile of all the surrogate statistics (and at least a hundred surrogates would be needed to make
this determination), then a (two-sided) p-value of p = 0.02 could be quoted.
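The rank-based alternative can be sketched as follows; treating the lower and upper tails symmetrically to form a two-sided p-value is an illustrative convention, not prescribed by the text:

```python
import numpy as np

def rank_p_value(q_data, q_surrogates):
    q = np.asarray(q_surrogates)
    # Fractions of surrogate statistics at least as extreme as the data,
    # in each direction.
    lo = np.mean(q <= q_data)
    hi = np.mean(q >= q_data)
    # Two-sided p-value from the more extreme tail.
    return 2.0 * min(lo, hi)

# With 100 surrogates and the data statistic in the lowest percentile
# (only one surrogate value at or below it), p = 0.02:
p = rank_p_value(0.0, np.arange(100.0))
```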
2.2 Hierarchy of null hypotheses
The null hypothesis defines the nature of the candidate process which may or may not adequately explain the data.
Our null hypotheses usually specify that certain properties of the original data are preserved -- such as mean and
variance -- but that there is no further structure in the time series. The surrogate data is then generated to mimic
these preserved features but to otherwise be random. There is some latitude in choosing which features ought to be
preserved: certainly mean and variance, and possibly also the Fourier power spectrum. If the raw data is discretized
to integer values, then the surrogate data should be similarly discretized.
Ultimately we envision a hierarchy (perhaps even a hierarchical tree) of null hypotheses against which time series
might be compared. Beginning with the simplest hypotheses, and increasing in generality, the following sections
outline some of the possibilities that we have considered.
2.2.1 Temporally uncorrelated noise
The null hypothesis of no temporal correlations is of particular interest in circumstances (e.g., stock market returns,
or outcomes on a roulette wheel) where any correlation at all can potentially be exploited for profit. The simplest
null hypothesis in this case is that the observed data is fully described by independent and identically distributed
(IID) gaussian random variables. Surrogate data in this case are readily generated from a standard pseudorandom
number generator, normalized to the mean and variance of the original data.
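A minimal sketch of such a generator (the function name and the NumPy random generator are assumptions for illustration):

```python
import numpy as np

def gaussian_surrogate(x, rng=None):
    # IID gaussian draws, rescaled to the mean and variance of the data.
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    return x.mean() + x.std() * rng.standard_normal(x.size)
```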
A clever extension of this approach was used by Scheinkman and LeBaron [15] in an analysis of stock market
returns. To test the hypothesis of IID noise with arbitrary amplitude distribution, they generated surrogate data by
shuffling the time-order of the original time series. This more closely mimics the original data, but it destroys any
temporal correlations that may have been in the data.
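The shuffled surrogate preserves the amplitude distribution exactly while destroying temporal correlations; a sketch, with the function name as an assumption:

```python
import numpy as np

def shuffle_surrogate(x, rng=None):
    # A random permutation of the original series: exactly the same
    # values, hence the same amplitude distribution, but randomized
    # time order.
    rng = np.random.default_rng() if rng is None else rng
    return rng.permutation(x)
```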
2.2.2 Ornstein-Uhlenbeck noise
For most physical systems, it is usually obvious that there are temporal correlations, but the nature of these correlations may not be so clear. The simplest case of non-IID noise is given by the Ornstein-Uhlenbeck process [16]. For
a discrete time series, this can be produced by
z_t = a_0 + a_1 z_{t-1} + \sigma e_t    (2.2)
where e_t is uncorrelated gaussian noise of unit variance. The coefficients a_0, a_1, and \sigma collectively determine the
mean, variance, and autocorrelation time of the time series. In fact, the autocorrelation function is exponential in
this case:
A(\tau) = \frac{\langle z_t z_{t-\tau} \rangle - \langle z_t \rangle^2}{\langle z_t^2 \rangle - \langle z_t \rangle^2} = e^{-\lambda |\tau|}    (2.3)

where \langle \cdot \rangle denotes an average over time t, and \lambda = -\log a_1.
To make surrogate data sets, the mean \mu, variance v, and first autocorrelation A(1) are estimated from the
original time series; from these the coefficients are fit: a_1 = A(1), a_0 = \mu(1 - a_1), and \sigma^2 = v(1 - a_1^2). Finally, one
generates the surrogate data by iterating Eq. (2.2), using a pseudorandom number generator for the unit variance
gaussian e_t.
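Putting the fit and the iteration together, an Ornstein-Uhlenbeck surrogate generator might look like this; the initial condition z_0 = \mu is an assumption (one could instead discard an initial transient):

```python
import numpy as np

def ou_surrogate(x, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Estimate mean, variance, and lag-one autocorrelation from the data.
    mu, v = x.mean(), x.var()
    a1 = np.mean((x[:-1] - mu) * (x[1:] - mu)) / v
    # Fit the coefficients: a_1 = A(1), a_0 = mu(1 - a_1),
    # sigma^2 = v(1 - a_1^2).
    a0 = mu * (1.0 - a1)
    sigma = np.sqrt(v * (1.0 - a1**2))
    # Iterate Eq. (2.2) with unit-variance gaussian noise e_t,
    # starting from the mean (an assumed initial condition).
    z = np.empty(x.size)
    z[0] = mu
    for t in range(1, x.size):
        z[t] = a0 + a1 * z[t - 1] + sigma * rng.standard_normal()
    return z
```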