k*-Nearest Neighbors: From Global to Local
Oren Anava
The Voleon Group
oren@voleon.com
Kfir Y. Levy
ETH Zurich
yehuda.levy@inf.ethz.ch
Abstract
The weighted k-nearest neighbors algorithm is one of the most fundamental non-parametric methods in pattern recognition and machine learning. The question of setting the optimal number of neighbors as well as the optimal weights has received much attention throughout the years; nevertheless, this problem seems to have remained unsettled. In this paper we offer a simple approach to locally weighted regression/classification, where we make the bias-variance tradeoff explicit. Our formulation enables us to phrase a notion of optimal weights, and to efficiently and adaptively find these weights, as well as the optimal number of neighbors, for each data point whose value we wish to estimate. The applicability of our approach is demonstrated on several datasets, showing superior performance over standard locally weighted methods.
1 Introduction
The k-nearest neighbors (k-NN) algorithm [1, 2] and Nadaraya-Watson estimation [3, 4] are the cornerstones of non-parametric learning. Owing to their simplicity and flexibility, these procedures have become the methods of choice in many scenarios [5], especially in settings where the underlying model is complex. Modern applications of the k-NN algorithm include recommendation systems [6], text categorization [7], heart disease classification [8], and financial market prediction [9], amongst others.
A successful application of the weighted k-NN algorithm requires a careful choice of three ingredients: the number of nearest neighbors k, the weight vector α, and the distance metric. The latter requires domain knowledge and is thus henceforth assumed to be set and known in advance to the learner. Surprisingly, even under this assumption, the problem of choosing the optimal k and α is not fully understood and has been studied extensively since the 1950s under many different regimes. Most of the theoretical work focuses on the asymptotic regime in which the number of samples n goes to infinity [10, 11, 12], and ignores the practical regime in which n is finite. More importantly, the vast majority of k-NN studies aim at finding an optimal value of k per dataset, which seems to overlook the specific structure of the dataset and the properties of the data points whose labels we wish to estimate. While kernel-based methods such as Nadaraya-Watson enable an adaptive choice of the weight vector α, there still remains the question of how to choose the kernel's bandwidth, which could be thought of as the parallel of the number of neighbors k in k-NN. Moreover, there is no principled approach towards choosing the kernel function in practice.
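To make the ingredients above concrete, the following is a minimal sketch of the two standard locally weighted estimators discussed here, not the adaptive method proposed in this paper: a distance-weighted k-NN regressor with inverse-distance weights and a Nadaraya-Watson estimator with a Gaussian kernel. The function names, the inverse-distance weighting, and the Gaussian kernel are illustrative assumptions; note how k and the bandwidth play parallel roles as locality parameters, each fixed globally rather than per decision point.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k, eps=1e-12):
    """Distance-weighted k-NN regression at a single query point x.

    Inverse-distance weights are one common textbook choice; the number of
    neighbors k is a single global parameter, not chosen per point.
    """
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]              # indices of the k nearest neighbors
    w = 1.0 / (dists[idx] + eps)             # inverse-distance weights
    w /= w.sum()                             # normalize to sum to one
    return float(w @ y_train[idx])

def nadaraya_watson_predict(X_train, y_train, x, bandwidth):
    """Nadaraya-Watson estimate at x with a Gaussian kernel.

    The bandwidth is the analogue of k above: it controls how local the
    weighted average is, and is likewise fixed globally in the basic scheme.
    """
    dists = np.linalg.norm(X_train - x, axis=1)
    w = np.exp(-0.5 * (dists / bandwidth) ** 2)   # Gaussian kernel weights
    w /= w.sum()
    return float(w @ y_train)

# Toy usage on synthetic 2-D data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(200)
x0 = np.array([0.2, -0.4])
print(weighted_knn_predict(X, y, x0, k=10))
print(nadaraya_watson_predict(X, y, x0, bandwidth=0.3))
```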
In this paper we offer a coherent and principled approach to adaptively choosing the number of neighbors k and the corresponding weight vector $\alpha \in \mathbb{R}^k$ per decision point. Given a new decision point, we aim to find the best locally weighted predictor, in the sense of minimizing the distance between our prediction and the ground truth. In addition to yielding predictions, our approach enables us to provide a per-decision-point guarantee for the confidence of our predictions. Fig. 1 illustrates the importance of choosing k adaptively. In contrast to previous works on non-parametric regression/classification, we do not assume that the data $\{(x_i, y_i)\}_{i=1}^n$ arrives from some (unknown) underlying distribution, but rather make a weaker assumption that the labels $\{y_i\}_{i=1}^n$ are independent