Oracle inequality for regularized risk minimizers with strongly mixing observations

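This research paper studies oracle inequalities for regularized risk minimizers under strongly mixing observations. It was written by Feilong CAO and Xing XING and published in Frontiers of Mathematics in China, 2013, 8(2): 301-315. The paper extends earlier results for independent and identically distributed (i.i.d.) samples to exponentially strongly mixing observations, with a focus on the generalization performance of learning algorithms.

In machine learning and statistical prediction, an oracle inequality is an important theoretical tool. It bounds the performance of a learning algorithm relative to the best model available, the "oracle", which is unknown in practice. The core contribution of the paper is a general oracle inequality for strongly mixing sequences. Such data are common in practice, for example in time series analysis and in other problems where the samples are not i.i.d.

Strong mixing is a notion of statistical dependence that describes how quickly the dependence between distant elements of a random sequence weakens. Accounting for this dependence is essential for understanding how a learning algorithm behaves on non-independent data. Exponentially strong mixing is a stricter assumption than the weaker mixing conditions in common use, since it requires the mixing coefficients to decay at an exponential rate, but it makes a more precise analysis possible.

The authors apply the oracle inequality to algorithms of support vector machine (SVM) type. The SVM is a widely used supervised learning method for classification and regression that works by constructing a maximum-margin hyperplane. On non-independent data the performance and generalization ability of an SVM may be affected, and an oracle inequality helps to quantify this effect and to guide algorithm design.

The keywords are oracle inequality, exponentially strongly mixing, and regularized risk minimizer, the three concepts whose relationship the paper studies. The classification numbers 68T05 and 62J02 correspond to machine learning within computer science and regression analysis within statistics, respectively.

The introduction notes the growing interest in learning theory, in particular in the generalization performance of learning algorithms, and credits the work of Vapnik and Chervonenkis as pioneering in this field. By extending these earlier results, the paper provides a theoretical framework for data with complex dependence structure, which helps in understanding and improving machine learning algorithms on practical problems.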

Oracle inequality for regularized risk minimizers (excerpt, p. 303)

for some α > 0, β > 0, and c > 0, where the constants β and c are assumed to be known.

Denote by z the sample set of n observations

z = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}

drawn from the exponentially strongly mixing sequence Z. Set

n^(α) = ⌊ n ⌈ {…}^{1/(β+1)} ⌉^{−1} ⌋,

where n denotes the number of observations drawn from Z and ⌊u⌋ (resp. ⌈u⌉) denotes the greatest (resp. least) integer less (resp. greater) than or equal to u.
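The quantity n^(α) can be read as an effective number of observations: n divided by a block length that grows with n, then rounded down. The sketch below is a minimal illustration, not the authors' code. It assumes the base term inside the braces is 8n/c, the form this quantity commonly takes in the literature on exponentially strongly mixing sequences; that term is not recoverable from the text above, and the function name `effective_sample_size` is hypothetical.

```python
import math


def effective_sample_size(n: int, c: float, beta: float) -> int:
    """Effective number of observations n^(alpha).

    ASSUMPTION: the base term is 8n/c, so that
        n^(alpha) = floor(n / ceil((8n/c) ** (1/(beta+1)))).
    Only the exponent 1/(beta+1), the inverse, and the floor/ceiling
    structure are taken from the text.
    """
    if n <= 0 or c <= 0 or beta <= 0:
        raise ValueError("n, c and beta must all be positive")
    # Block length: least integer >= (8n/c)^(1/(beta+1)).
    block_length = math.ceil((8.0 * n / c) ** (1.0 / (beta + 1.0)))
    # floor(n * block_length^(-1)) equals integer division for positive ints.
    return n // block_length


# Faster decay of dependence (larger beta) leaves more effective observations.
n_slow = effective_sample_size(1000, c=1.0, beta=1.0)
n_fast = effective_sample_size(1000, c=1.0, beta=3.0)
```

Under this assumed form, the effective sample size is always at most n, which reflects the information lost to dependence between observations.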

The goal of machine learning is to learn a function f from the sample set such that f predicts accurately for future x ∈ X. Let

E(f) = E[l(f, z)] = ∫ l(f, z) dP

be the generalization error of the function f, where l(f, z) is usually called the loss function. In this paper, we would like to establish a general oracle inequality that covers both classification (learning of binary-valued functions) and regression (learning of real-valued functions).

A learning task is to find the minimizer of the expected risk E(f) over a given hypothesis space H. Since P is unknown, the minimizer of the expected risk cannot be computed directly. According to the empirical risk minimization (ERM) principle (see [17]), we minimize the empirical risk

E_n(f) = (1/n) Σ_{i=1}^{n} l(f, z_i).
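The empirical risk is the sample average of the loss over the n observations. A minimal sketch follows, assuming each observation is a pair z_i = (x_i, y_i) and using the squared loss as one possible choice of l; the names `squared_loss` and `empirical_risk` are illustrative and do not come from the paper.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, float]  # one observation z_i = (x_i, y_i) (assumed form)
Hypothesis = Callable[[float], float]


def squared_loss(f: Hypothesis, z: Sample) -> float:
    """Regression loss l(f, z) = (f(x) - y)^2, one possible choice of l."""
    x, y = z
    return (f(x) - y) ** 2


def empirical_risk(
    f: Hypothesis,
    sample: Sequence[Sample],
    loss: Callable[[Hypothesis, Sample], float] = squared_loss,
) -> float:
    """Average loss of f over the sample: (1/n) * sum_i l(f, z_i)."""
    if not sample:
        raise ValueError("sample must contain at least one observation")
    return sum(loss(f, z) for z in sample) / len(sample)


sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
risk_exact = empirical_risk(lambda x: 2.0 * x, sample)  # fits every point
risk_poor = empirical_risk(lambda x: x, sample)         # residuals 0, 1, 2
```

The average is computed the same way whether or not the observations are independent; dependence affects how well this average approximates the expected risk, not how it is evaluated.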

Define f_H to be a function minimizing the expected risk E(f) over f ∈ H, i.e.,

f_H = arg min_{f ∈ H} E(f).

The ERM principle is a classical approach in learning theory. However, when the complexity of the given function set H is high, the ERM algorithm is usually very time-consuming and overfitting may happen (see [21]). Thus, regularization techniques are frequently adopted. In this paper, we use

Ω: P × H → [0, ∞)

as a penalty function, where P ≠ ∅ is a set of parameters.

Now, we define the target function. Let (p_z, f_z) be a minimizer of (p, f) → E_n(f) + Ω(p, f), that is,

(p_z, f_z) = arg min_{p ∈ P, f ∈ H} {E_n(f) + Ω(p, f)}.
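The joint minimization over a parameter p and a hypothesis f can be illustrated by brute force on finite sets. This is a toy sketch under stated assumptions, not the paper's algorithm: the hypothesis space is a grid of linear functions f_w(x) = w·x, the loss is the squared loss, and the penalty Ω(p, f_w) = λ|w|^p with p ∈ {1, 2} is a hypothetical choice made only so that the parameter set has more than one element.

```python
from itertools import product


def empirical_risk(w, sample):
    """Mean squared loss of the linear hypothesis f_w(x) = w * x."""
    return sum((w * x - y) ** 2 for x, y in sample) / len(sample)


def penalty(p, w, lam=0.1):
    """Hypothetical penalty Omega(p, f_w) = lam * |w|**p, always >= 0."""
    return lam * abs(w) ** p


def regularized_minimizer(sample, params, slopes):
    """Brute-force arg min over params x slopes of empirical risk + penalty."""
    return min(
        product(params, slopes),
        key=lambda pw: empirical_risk(pw[1], sample) + penalty(pw[0], pw[1]),
    )


sample = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]   # roughly y = 2x
params = (1, 2)                                 # finite parameter set P
slopes = [0.5 * k for k in range(7)]            # finite hypothesis grid H
p_best, w_best = regularized_minimizer(sample, params, slopes)
objective = empirical_risk(w_best, sample) + penalty(p_best, w_best)
```

On this toy sample the search selects p = 1 and w = 2.0: the slope 2.0 fits the data almost exactly, and for |w| > 1 the penalty λ|w| is smaller than λ|w|². In realistic settings H is infinite and the minimization is carried out by an optimization routine rather than by enumeration.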
