Implementing the KNN Algorithm in Python on the Iris Dataset
Date: 2023-08-30 17:09:14
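KNN (K-Nearest Neighbors) is a widely used classification algorithm. Its basic idea is to find the K labeled samples closest to the sample being classified, then assign that sample to the class that appears most often among those K neighbors.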
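To make the voting rule concrete, here is a minimal sketch on a made-up two-dimensional dataset. The points, the labels `"A"` and `"B"`, and the helper name `knn_vote` are illustrative assumptions and are not part of the iris example. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

# Hypothetical toy data: (feature vector, class label) pairs.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((3.0, 4.0), "B"),
    ((5.0, 7.0), "B"),
    ((3.5, 5.0), "B"),
]


def knn_vote(query, samples, k):
    """Return the majority label among the k samples closest to query."""
    # Sort the labeled samples by Euclidean distance to the query point.
    by_distance = sorted(samples, key=lambda s: math.dist(query, s[0]))
    labels = [label for _, label in by_distance[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(labels).most_common(1)[0][0]


print(knn_vote((1.2, 1.5), train, k=3))  # prints A (neighbors: A, A, B)
print(knn_vote((4.0, 5.5), train, k=3))  # prints B (neighbors: B, B, B)
```

The first query sits next to the two "A" points, so "A" wins the vote two to one even though a "B" point is among its three nearest neighbors.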
Below is Python code that implements KNN and uses it to classify the iris dataset:
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


class KNN:
    def __init__(self, k):
        # Number of neighbors that vote on each prediction
        self.k = k

    def fit(self, X_train, y_train):
        # KNN is a lazy learner: "training" only stores the data
        self.X_train = X_train
        self.y_train = y_train

    def predict(self, X_test):
        y_pred = []
        for x in X_test:
            distances = []
            for i in range(len(self.X_train)):
                # Euclidean distance between the test sample and training sample i
                distance = np.sqrt(np.sum((x - self.X_train[i]) ** 2))
                distances.append((distance, self.y_train[i]))
            # Sort by distance only and keep the k closest (distance, label) pairs
            distances = sorted(distances, key=lambda pair: pair[0])
            k_nearest_neighbors = distances[:self.k]
            k_nearest_neighbors_labels = [label for _, label in k_nearest_neighbors]
            # Majority vote: the label that occurs most often among the k neighbors
            most_common_label = max(set(k_nearest_neighbors_labels), key=k_nearest_neighbors_labels.count)
            y_pred.append(most_common_label)
        return y_pred


# Load the iris data and hold out 20% of it as a test set
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Fit the model, predict the test set, and report the accuracy
model = KNN(k=5)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```
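The code above first imports the required libraries and then defines a `KNN` class. The `__init__` method stores the value of K. The `fit` method "trains" the model, which for KNN means nothing more than storing the training data. The `predict` method classifies the test samples: for each one it computes the distance to every training sample, collects the labels of the K closest training samples in a list, and returns the label that occurs most often in that list as the prediction.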
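The same three steps in `predict` (compute distances, take the K closest, vote) can also be written without explicit Python loops over the training set. The following is a sketch of one possible vectorized variant, not the method used in the code above. The function name `knn_predict_vectorized` is hypothetical, and the vote relies on `np.bincount`, so it assumes the labels are non-negative integers, as the iris targets 0, 1 and 2 are.

```python
import numpy as np


def knn_predict_vectorized(X_train, y_train, X_test, k):
    """Predict integer class labels for X_test using NumPy broadcasting."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)

    # diffs has shape (n_test, n_train, n_features).
    diffs = X_test[:, None, :] - X_train[None, :, :]
    # Euclidean distance from every test sample to every training sample.
    distances = np.sqrt((diffs ** 2).sum(axis=2))

    # Column indices of the k smallest distances in each row.
    nearest = np.argsort(distances, axis=1)[:, :k]
    nearest_labels = y_train[nearest]

    # bincount tallies the votes per class; argmax returns the smallest
    # label when several classes tie for the highest count.
    return np.array([np.bincount(row).argmax() for row in nearest_labels])


# Tiny made-up example: two clusters with labels 0 and 1.
X_demo = [[0, 0], [0, 1], [5, 5], [6, 5]]
y_demo = [0, 0, 1, 1]
print(knn_predict_vectorized(X_demo, y_demo, [[0.2, 0.1], [5.5, 5.0]], k=3))
```

The broadcasted array holds one entry per test sample, training sample and feature, so this trades memory for speed. That is fine for a dataset the size of iris but may not suit large training sets.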
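Next, we load the iris dataset and split it into a training set and a test set. We then fit the KNN model on the training set and use it to predict the labels of the test set. Finally, we compute the model's accuracy and print it.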
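One way to sanity-check a hand-written classifier like this is to run scikit-learn's own `KNeighborsClassifier` on the same kind of split and compare the accuracies. The sketch below is an optional cross-check, not part of the original code. The `random_state=42` and `stratify=y` arguments are assumptions added here so that the split is reproducible and keeps the class proportions. The original split is random, so its accuracy varies from run to run.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Fixed seed and stratification are assumptions made for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Reference model with the same k as the hand-written KNN class.
reference = KNeighborsClassifier(n_neighbors=5)
reference.fit(X_train, y_train)

reference_accuracy = accuracy_score(y_test, reference.predict(X_test))
print("Reference accuracy:", reference_accuracy)
```

If the hand-written class is fitted on the same `X_train` and `y_train`, its accuracy should be close to the reference value. Small differences can come from how ties between equally distant neighbors or equally frequent labels are broken.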