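Implement the K-nearest-neighbor algorithm in Python: train the model on part of the data, test its performance on the remaining data, and report the predictions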
Time: 2023-12-23 17:01:54 Views: 69
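Sure, here is a simple Python implementation of the K-nearest-neighbor algorithm for your reference.
Suppose we have a dataset in which every sample has two features and one label: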
```
X = [[1, 2], [2, 1], [3, 4], [4, 3]]
y = [0, 0, 1, 1]
```
We use Euclidean distance as the distance metric and set k to 2. The code is as follows:
```python
import math


# Compute the Euclidean distance between two samples
def euclidean_distance(x1, x2):
    distance = 0.0
    for i in range(len(x1)):
        distance += (x1[i] - x2[i]) ** 2
    return math.sqrt(distance)


# K-nearest-neighbor classifier
def knn(X_train, y_train, X_test, k):
    predictions = []
    for test_sample in X_test:
        # Compute the distance from the test sample to every training sample,
        # keeping each distance paired with that sample's label
        distances = []
        for train_sample, label in zip(X_train, y_train):
            distance = euclidean_distance(test_sample, train_sample)
            distances.append((distance, label))
        # Sort by distance and keep the k nearest samples
        distances.sort(key=lambda pair: pair[0])
        neighbors = distances[:k]
        # Count how often each label occurs among the k neighbors
        counts = {}
        for _, label in neighbors:
            counts[label] = counts.get(label, 0) + 1
        # The most frequent label is the prediction
        prediction = max(counts, key=counts.get)
        predictions.append(prediction)
    return predictions


# Try the algorithm on a held-out sample
X_train = [[1, 2], [2, 1], [3, 4]]
y_train = [0, 0, 1]
X_test = [[4, 3]]
y_test = [1]
k = 2
predictions = knn(X_train, y_train, X_test, k)
print(predictions)  # print the predicted labels
```
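To make the neighbor selection for the test sample `[4, 3]` easier to follow, the distances can be listed explicitly. The snippet below is only an illustrative sketch: it uses the standard-library `math.dist` (Python 3.8+) rather than the hand-written distance function, and the names `test_sample`, `ranked` and `nearest_labels` are introduced here for illustration.

```python
import math

# Training samples and labels copied from the example above.
X_train = [[1, 2], [2, 1], [3, 4]]
y_train = [0, 0, 1]
test_sample = [4, 3]

# Pair each distance with its label and sort from nearest to farthest.
ranked = sorted(
    (math.dist(test_sample, x), label) for x, label in zip(X_train, y_train)
)
for distance, label in ranked:
    print(f"label={label}  distance={distance:.3f}")

# Labels of the k = 2 nearest neighbors.
nearest_labels = [label for _, label in ranked[:2]]
print(nearest_labels)
```

The ranking puts `[3, 4]` first, then `[2, 1]`, then `[1, 2]`. With k = 3 all three training samples would vote and label 0 would win 2 to 1, so on a dataset this small the choice of k changes the answer.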
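The code above implements a simple K-nearest-neighbor classifier and tries it on the given dataset. The 4th sample is held out as the test sample and the other three are used for training. The prediction is 1, which matches the true label. With k = 2, however, the two nearest neighbors carry different labels, so this result comes from a tie that is broken in favor of the closest neighbor's label, and a single test sample is too little to draw conclusions about accuracy in general.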
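For a somewhat more meaningful test, the data can be split at random into a training part and a test part, and the share of correct predictions reported as accuracy. The following is a minimal sketch under stated assumptions, not a definitive implementation: the two Gaussian clusters are hypothetical toy data generated only for this illustration, and the helpers `knn_predict` and `train_test_split`, the 25% test ratio, and the seeds are likewise assumptions that do not come from the original answer.

```python
import math
import random
from collections import Counter


def knn_predict(X_train, y_train, x, k):
    """Return the majority label among the k training samples nearest to x."""
    ranked = sorted(zip(X_train, y_train), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def train_test_split(X, y, test_ratio, seed):
    """Shuffle sample indices and hold out a fraction of them for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(X) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


# Hypothetical toy data: two Gaussian clusters with 20 points each.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(20)]
y = [0] * 20 + [1] * 20

X_train, y_train, X_test, y_test = train_test_split(X, y, test_ratio=0.25, seed=42)

k = 3  # an odd k avoids tied votes when there are two classes
predictions = [knn_predict(X_train, y_train, x, k) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)

print("predictions:", predictions)
print("true labels:", y_test)
print(f"accuracy: {accuracy:.2f}")
```

Storing the labels alongside the samples during sorting avoids any lookup by sample value, so duplicate feature vectors with different labels are handled correctly.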