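Implement the following in Python: 1. Classify the wine dataset with the KNN method and visualize model performance for different values of K. 2. Split the wine dataset into training and test sets at a 3:1 ratio, choose a suitable number of K values from a reasonable range, and check their accuracy on the test set.
Posted: 2023-05-20 17:03:21 Views: 93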
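Sure, here is my answer:
1. The following Python code classifies the wine dataset with KNN and plots training and test accuracy for different values of K: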
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt

# Load the wine dataset
# (assumes wine.csv has a header row and the class label in the first column)
wine = pd.read_csv('wine.csv')

# Separate features and labels
X = wine.iloc[:, 1:]
y = wine.iloc[:, 0]

# Split into training and test sets (test_size=0.25 gives a 3:1 ratio)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Candidate K values (odd numbers reduce the chance of tied votes)
k_values = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

# Record training and test accuracy for each K
train_scores = []
test_scores = []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    train_scores.append(knn.score(X_train, y_train))
    test_scores.append(knn.score(X_test, y_test))

# Plot accuracy against K
plt.plot(k_values, train_scores, marker='o', label='Train')
plt.plot(k_values, test_scores, marker='o', label='Test')
plt.xlabel('K')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```
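The script above depends on a local `wine.csv` that is not included with the answer. As a minimal sketch that runs without that file, the version below assumes scikit-learn's bundled `load_wine` dataset is an acceptable stand-in. It also adds two things the original does not have: a stratified split and an optional `StandardScaler` step, on the reasoning that KNN is distance-based and features with large numeric ranges can otherwise dominate. The function name `knn_accuracy_by_k` and its parameters are hypothetical, chosen only for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def knn_accuracy_by_k(k_values, scale=True, random_state=42):
    """Return {k: (train_accuracy, test_accuracy)} on a 3:1 split of the wine data."""
    # Assumption: the bundled wine dataset stands in for wine.csv.
    X, y = load_wine(return_X_y=True)
    # Assumption: stratify=y keeps class proportions equal in both splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    results = {}
    for k in k_values:
        knn = KNeighborsClassifier(n_neighbors=k)
        # Optionally standardize features before computing distances.
        model = make_pipeline(StandardScaler(), knn) if scale else knn
        model.fit(X_train, y_train)
        results[k] = (model.score(X_train, y_train), model.score(X_test, y_test))
    return results


if __name__ == "__main__":
    for k, (train_acc, test_acc) in knn_accuracy_by_k(range(1, 20, 2)).items():
        print(f"K={k:2d}  train={train_acc:.3f}  test={test_acc:.3f}")
```

Calling `knn_accuracy_by_k(k_values, scale=False)` gives the unscaled behaviour, so the two settings can be compared on the same split.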
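2. The following code splits the wine dataset into training and test sets at a 3:1 ratio and prints the training and test accuracy for several K values: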
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the wine dataset
# (assumes wine.csv has a header row and the class label in the first column)
wine = pd.read_csv('wine.csv')

# Separate features and labels
X = wine.iloc[:, 1:]
y = wine.iloc[:, 0]

# Split into training and test sets (test_size=0.25 gives a 3:1 ratio)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Candidate K values
k_values = [3, 5, 7, 9, 11]

# Fit one model per K and report its accuracy
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    train_score = knn.score(X_train, y_train)
    test_score = knn.score(X_test, y_test)
    print(f'K={k}, Train Accuracy={train_score:.3f}, Test Accuracy={test_score:.3f}')
```
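Choosing K by comparing test accuracies, as the loop above does, means the test set also influences model selection. One common alternative, sketched here as an assumption rather than as part of the original answer, is to pick K by cross-validation on the training split and then score the chosen model once on the test split. The sketch uses scikit-learn's bundled `load_wine` data in place of `wine.csv`. The function name `select_k_by_cv` and the 5-fold setting are hypothetical choices for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def select_k_by_cv(k_values, cv=5, random_state=42):
    """Pick K by mean cross-validated accuracy on the training split only."""
    # Assumption: the bundled wine dataset stands in for wine.csv.
    X, y = load_wine(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    cv_means = {}
    for k in k_values:
        scores = cross_val_score(
            KNeighborsClassifier(n_neighbors=k), X_train, y_train, cv=cv
        )
        cv_means[k] = scores.mean()
    # The K with the highest mean validation accuracy wins.
    best_k = max(cv_means, key=cv_means.get)
    # Refit on the full training split and score once on the held-out test split.
    final_model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
    return best_k, cv_means, final_model.score(X_test, y_test)


if __name__ == "__main__":
    best_k, cv_means, test_acc = select_k_by_cv([3, 5, 7, 9, 11])
    for k, mean_acc in cv_means.items():
        print(f"K={k:2d}  mean CV accuracy={mean_acc:.3f}")
    print(f"Selected K={best_k}, test accuracy={test_acc:.3f}")
```

With this approach the test accuracy is measured only once, for the selected K, so it remains an estimate on data that played no part in the selection.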
I hope this answer helps!