Automatic Construction and Natural-Language Description of Nonparametric Regression Models


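This paper studies the automatic construction of nonparametric regression models and their description in natural language. The authors, from the University of Cambridge and MIT, present a system called the "Automatic Statistician" that focuses on regression problems. The system is built on Gaussian processes, a nonparametric method that can capture and express high-level properties of an unknown function such as smoothness, trends, periodicity, and changepoints. Because the models are additive, a complex function is decomposed into interpretable parts, so the system both fits the data well and produces a natural-language report that explains it in plain terms. Unlike a fixed parametric model, the approach searches an open-ended space of models while retaining state-of-the-art predictive performance. The method is evaluated on 13 real time series data sets. The generated descriptions go beyond numerical results by giving context and a readable interpretation of each component, which makes the analysis more efficient and more transparent.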

Next, we apply several simplifications to the kernel expression: The product of two SE kernels is another SE with different parameters. Multiplying WN by any stationary kernel (C, WN, SE, or PER) gives another WN kernel. Multiplying any kernel by C only changes the parameters of the original kernel.
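
The three rules above can be sketched as a small rewriting step on a product of base kernels. This is a minimal illustration, not the authors' implementation: the function name `simplify_product`, the representation of a product as a list of kernel names, and the name `sigma` for the sigmoid factor are all assumptions made for the example, and kernel parameters are ignored.

```python
from collections import Counter

# Stationary base kernels named in the text.
STATIONARY = {"C", "WN", "SE", "PER"}

def simplify_product(factors):
    """Simplify a product of base kernels given as a list of names.

    Rules from the text:
      1. SE x SE -> SE (with different parameters)
      2. WN x any stationary kernel -> WN
      3. C x k -> k (only the parameters of k change)
    """
    counts = Counter(factors)
    # Rule 2: WN absorbs every stationary factor, including extra WNs.
    if counts["WN"] > 0:
        for name in STATIONARY:
            counts.pop(name, None)
        counts["WN"] = 1
    # Rule 1: repeated SE factors collapse into a single SE.
    if counts["SE"] > 1:
        counts["SE"] = 1
    # Rule 3: drop C whenever any other factor remains.
    others = sum(n for name, n in counts.items() if name != "C")
    if counts["C"] > 0:
        counts["C"] = 0 if others > 0 else 1
    return sorted(counts.elements())

print(simplify_product(["SE", "WN", "LIN"]))   # ['LIN', 'WN']
print(simplify_product(["SE", "C", "sigma"]))  # ['SE', 'sigma']
print(simplify_product(["SE", "SE", "PER"]))   # ['PER', 'SE']
```

Non-stationary factors such as LIN and the sigmoid are left untouched, which is why they survive as separate factors in the simplified product.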

After applying these rules, the kernel can be written as a sum of terms of the form:

$$K \prod_m \mathrm{LIN}^{(m)} \prod_n \sigma^{(n)}, \tag{4.1}$$

where $K$ is one of WN, C, SE, $\prod_k \mathrm{PER}^{(k)}$ or $\mathrm{SE} \prod_k \mathrm{PER}^{(k)}$, and $\prod_i k^{(i)}$ denotes a product of kernels, each with different parameters.

Sums of kernels are sums of functions. Formally, if $f_1(x) \sim \mathrm{GP}(0, k_1)$ and independently $f_2(x) \sim \mathrm{GP}(0, k_2)$, then $f_1(x) + f_2(x) \sim \mathrm{GP}(0, k_1 + k_2)$. This lets us describe each product of kernels separately.
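
The additivity statement follows from a short covariance calculation. The derivation below is a sketch that uses only the assumptions already stated in the text, namely that $f_1$ and $f_2$ are zero-mean and independent:

```latex
\begin{align*}
\operatorname{Cov}\bigl[f_1(x) + f_2(x),\; f_1(x') + f_2(x')\bigr]
  &= \operatorname{Cov}[f_1(x), f_1(x')] + \operatorname{Cov}[f_1(x), f_2(x')] \\
  &\quad + \operatorname{Cov}[f_2(x), f_1(x')] + \operatorname{Cov}[f_2(x), f_2(x')] \\
  &= k_1(x, x') + 0 + 0 + k_2(x, x').
\end{align*}
```

The two cross terms vanish by independence. Since a sum of independent Gaussian vectors is again Gaussian, the sum is a GP with kernel $k_1 + k_2$.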

Each kernel in a product modifies a model in a consistent way. This allows us to describe the contribution of each kernel in a product as an adjective, or more generally as a modifier of a noun. We now describe how each kernel modifies a model and how this can be described in natural language:

• Multiplication by SE removes long range correlations from a model since $\mathrm{SE}(x, x')$ decreases monotonically to 0 as $|x - x'|$ increases. This can be described as making an existing model’s correlation structure ‘local’ or ‘approximate’.

• Multiplication by LIN is equivalent to multiplying the function being modeled by a linear function. If $f(x) \sim \mathrm{GP}(0, k)$, then $x f(x) \sim \mathrm{GP}(0, k \times \mathrm{LIN})$. This causes the standard deviation of the model to vary linearly without affecting the correlation and can be described as e.g. ‘with linearly increasing standard deviation’.

• Multiplication by σ is equivalent to multiplying the function being modeled by a sigmoid, which means that the function goes to zero before or after some point. This can be described as e.g. ‘from [time]’ or ‘until [time]’.

• Multiplication by PER modifies the correlation structure in the same way as multiplying the function by an independent periodic function. Formally, if $f_1(x) \sim \mathrm{GP}(0, k_1)$ and $f_2(x) \sim \mathrm{GP}(0, k_2)$ then

$$\operatorname{Cov}[f_1(x) f_2(x),\; f_1(x') f_2(x')] = k_1(x, x')\, k_2(x, x').$$

This can be loosely described as e.g. ‘modulated by a periodic function with a period of [period] [units]’.
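
The effects of the SE, PER and LIN modifiers listed above can be checked numerically. The sketch below uses the standard textbook forms of the squared-exponential and periodic kernels and a linear kernel with no offset. These forms are assumptions chosen for illustration and may differ from the parameterisation used in the paper; the helper names `times`, `std` and `corr` are also invented here.

```python
import math

def se(x, y, lengthscale=1.0):
    """Squared-exponential kernel in its standard unit-variance form."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))

def per(x, y, period=1.0, lengthscale=1.0):
    """Standard periodic kernel (an assumption; the paper's PER may differ)."""
    s = math.sin(math.pi * abs(x - y) / period)
    return math.exp(-2.0 * s ** 2 / lengthscale ** 2)

def lin(x, y):
    """Linear kernel with no offset or scale (a simplifying assumption)."""
    return x * y

def times(k1, k2):
    """Pointwise product of two kernels."""
    return lambda x, y: k1(x, y) * k2(x, y)

def std(k, x):
    """Prior standard deviation of the function value at x."""
    return math.sqrt(k(x, x))

def corr(k, x, y):
    """Prior correlation between the function values at x and y."""
    return k(x, y) / (std(k, x) * std(k, y))

# SE: correlation decays monotonically with distance.
se_decay = [se(0.0, d) for d in (0.0, 1.0, 2.0, 5.0)]
print(se_decay)

# SE x PER: points one full period apart are no longer perfectly correlated,
# which is what the text calls 'approximately' periodic.
local_per = times(lambda x, y: se(x, y, lengthscale=3.0), per)
print(per(0.0, 1.0), local_per(0.0, 1.0))

# SE x LIN: the standard deviation grows linearly, the correlation is unchanged.
se_lin = times(se, lin)
print([std(se_lin, x) for x in (1.0, 2.0, 4.0)])
print(corr(se_lin, 2.0, 3.0), se(2.0, 3.0))
```

The sigmoid modifier is left out of this sketch. It would scale the kernel towards zero on one side of a changepoint in the same pointwise fashion.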

Constructing a complete description of a product of kernels. We choose one kernel to act as a noun which is then described by the functions it encodes for when unmodified e.g. ‘smooth function’ for SE. Modifiers corresponding to the other kernels in the product are then appended to this description, forming a noun phrase of the form:

Determiner + Premodifiers + Noun + Postmodifiers

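As an example, a kernel of the form SE × PER × LIN × σ could be described as an

$$\underbrace{\mathrm{SE}}_{\text{approximately}} \times \underbrace{\mathrm{PER}}_{\text{periodic function}} \times \underbrace{\mathrm{LIN}}_{\text{with linearly growing amplitude}} \times \underbrace{\sigma}_{\text{until 1700}}$$

where PER has been selected as the head noun.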

In principle, any assignment of kernels in a product to these different phrasal roles is possible, but in practice we found certain assignments to produce more interpretable phrases than others. The head noun is chosen according to the following ordering:

$$\mathrm{PER} > \mathrm{WN}, \mathrm{SE}, \mathrm{C} > \prod_m \mathrm{LIN}^{(m)} > \prod_n \sigma^{(n)}$$

i.e. PER is always chosen as the head noun when present.
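
The head-noun ordering and the noun-phrase template can be combined into a small phrase builder. The following is a hedged sketch rather than the system's actual generator: the function `describe` and the three lookup tables are invented for illustration, the wording for C and LIN as nouns is a guess, and the tie between WN, SE and C is broken arbitrarily (these three do not co-occur once a product has been simplified). It assumes the product contains at least one kernel other than the sigmoid.

```python
# Head-noun priority from the text: PER > WN, SE, C > LIN > sigma.
# WN, SE and C are tied in the source; their relative order here is arbitrary.
HEAD_PRIORITY = ["PER", "WN", "SE", "C", "LIN", "sigma"]

# (noun, countable) pairs. The PER, WN and SE wordings follow the text;
# the C and LIN wordings are illustrative guesses.
NOUNS = {
    "PER": ("periodic function", True),
    "WN": ("uncorrelated noise", False),
    "SE": ("smooth function", True),
    "C": ("constant", True),
    "LIN": ("linear function", True),
}
PREMODIFIERS = {"SE": "approximately"}
POSTMODIFIERS = {
    "LIN": "with linearly increasing standard deviation",
    "sigma": "until [time]",
    "PER": "modulated by a periodic function with a period of [period] [units]",
}

def describe(factors):
    """Build 'Determiner + Premodifiers + Noun + Postmodifiers' for a product."""
    head = min(factors, key=HEAD_PRIORITY.index)
    rest = list(factors)
    rest.remove(head)
    rest.sort(key=HEAD_PRIORITY.index)
    noun, countable = NOUNS[head]
    pre = [PREMODIFIERS[k] for k in rest if k in PREMODIFIERS]
    post = [POSTMODIFIERS[k] for k in rest if k in POSTMODIFIERS]
    words = pre + [noun] + post
    if countable:
        # Pick the determiner from the first letter of the phrase.
        words.insert(0, "an" if words[0][0] in "aeiou" else "a")
    return " ".join(words)

print(describe(["SE", "PER", "LIN", "sigma"]))
print(describe(["WN", "LIN"]))
```

For SE × PER × LIN × σ this prints "an approximately periodic function with linearly increasing standard deviation until [time]", which mirrors the structure of the example in the text with placeholder wording.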

Ordering additive components. The reports generated by ABCD attempt to present the most interesting or important features of a data set first. As a heuristic, we order components by always adding next the component which most reduces the 10-fold cross-validated mean absolute error.
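
The ordering heuristic is a greedy forward selection, which can be sketched as follows. The function `order_components` and its `error_of` argument are hypothetical names. In the toy usage the error is a plain in-sample mean absolute error on made-up numbers, standing in for the 10-fold cross-validated error that the text describes.

```python
def order_components(components, error_of):
    """Greedy ordering: always add next the component giving the lowest error.

    `error_of` takes a list of component names and returns the error of the
    model built from just those components (lower is better).
    """
    ordered = []
    remaining = list(components)
    while remaining:
        best = min(remaining, key=lambda c: error_of(ordered + [c]))
        ordered.append(best)
        remaining.remove(best)
    return ordered

# Toy data: each component's contribution at six inputs (made-up values).
fits = {
    "noise": [0.1, 0.0, -0.1, 0.0, 0.1, 0.0],
    "periodic": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "trend": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
}
y = [sum(values) for values in zip(*fits.values())]

def in_sample_mae(names):
    """Stand-in for cross-validated MAE: in-sample error of a partial sum."""
    pred = [sum(fits[n][i] for n in names) for i in range(len(y))]
    return sum(abs(a - b) for a, b in zip(y, pred)) / len(y)

order = order_components(fits, in_sample_mae)
print(order)  # ['trend', 'periodic', 'noise']
```

Because every candidate is compared against the same current model, choosing the lowest resulting error is the same as choosing the largest reduction in error.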

4.1 Worked example

Suppose we start with a kernel of the form

SE × (WN × LIN + CP(C, PER)).

This is converted to a sum of products:

SE × WN × LIN + SE × C × σ + SE × PER × σ̄,

which is simplified to

WN × LIN + SE × σ + SE × PER × σ̄.

To describe the first component, the head noun description for WN, ‘uncorrelated noise’, is concatenated with a modifier for LIN, ‘with linearly increasing standard deviation’. The second component is described as ‘A smooth function with a lengthscale of [lengthscale] [units]’, corresponding to the SE, ‘which applies until [changepoint]’, which corresponds to the σ. Finally, the third component is described as ‘An approximately periodic function with a period of [period] [units] which applies from [changepoint]’.
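
The first step of the worked example, turning the nested expression into a sum of products, can be sketched as a recursive expansion. The tuple encoding of expressions, the function name `expand`, and the names `sigma` and `sigma_bar` are assumptions made for this illustration. The changepoint operator is expanded as CP(a, b) = a × σ + b × σ̄, matching the conversion shown in the example.

```python
def expand(expr):
    """Expand a kernel expression into a list of products.

    An expression is either a base-kernel name (str) or a tuple
    (op, left, right) with op in {"+", "*", "CP"}. Each product in the
    result is a list of base-kernel names.
    """
    if isinstance(expr, str):
        return [[expr]]
    op, left, right = expr
    if op == "+":
        return expand(left) + expand(right)
    if op == "*":
        # Distribute multiplication over the sums on both sides.
        return [p + q for p in expand(left) for q in expand(right)]
    if op == "CP":
        # CP(a, b) = a x sigma + b x sigma_bar
        return ([p + ["sigma"] for p in expand(left)]
                + [q + ["sigma_bar"] for q in expand(right)])
    raise ValueError(f"unknown operator: {op}")

# SE x (WN x LIN + CP(C, PER))
expr = ("*", "SE", ("+", ("*", "WN", "LIN"), ("CP", "C", "PER")))
terms = expand(expr)
print(" + ".join(" x ".join(term) for term in terms))
# SE x WN x LIN + SE x C x sigma + SE x PER x sigma_bar
```

The simplification to WN × LIN + SE × σ + SE × PER × σ̄ is a separate pass over each product and is not performed by this function.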

5 Example descriptions of time series

We demonstrate the ability of our procedure to discover and describe a variety of patterns on two time series. Full automatically-generated reports for 13 data sets are provided as supplementary material.

5.1 Summarizing 400 Years of Solar Activity

We show excerpts from the report automatically generated on annual solar irradiation data from 1610 to 2011 (figure 2). This time series has two pertinent features: a roughly 11-year cycle of solar activity, and a period lasting from 1645 to 1715 with much smaller variance than the rest of the dataset. This flat region corresponds to the Maunder minimum, a period in which sunspots were extremely rare (Lean, Beer, and
