Use grid search to tune KNN regression hyperparameters and plot the results (Python code)
Time: 2024-05-01 18:20:45
Below is Python code that uses grid search to tune the hyperparameters of a KNN regressor:
```python
from sklearn.datasets import load_boston
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
import matplotlib.pyplot as plt
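# Load the dataset
# Note: load_boston was removed in scikit-learn 1.2, so this script needs an older release
boston = load_boston()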
X = boston.data
y = boston.target
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
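# Define the hyperparameter search space
param_grid = {'n_neighbors': range(1, 21),
              'weights': ['uniform', 'distance'],
              'p': [1, 2, 3]}
# Define the KNN regression model
knn = KNeighborsRegressor()
# Define the grid search object (5-fold CV; with no scoring argument, a regressor is scored by R^2)
grid_search = GridSearchCV(knn, param_grid, cv=5, n_jobs=-1)
# Run the grid search on the training set
grid_search.fit(X_train, y_train)
# Print the best hyperparameter combination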
print("Best parameters: {}".format(grid_search.best_params_))
print("Best cross-validation score: {:.2f}".format(grid_search.best_score_))
# Plot model performance for each hyperparameter combination
results = grid_search.cv_results_
params = results['params']
mean_test_scores = results['mean_test_score']
plt.figure(figsize=(12, 6))
plt.title("KNN regression: mean cross-validation score vs. n_neighbors", fontsize=16)
plt.xlabel("n_neighbors")
plt.ylabel("Mean CV score (R^2)")
plt.grid()
# Collect one score curve per (p, weights) pair
curves = {}
for param, score in zip(params, mean_test_scores):
    key = (param['p'], param['weights'])
    curves.setdefault(key, []).append((param['n_neighbors'], score))
# Draw each curve against n_neighbors
for (p, weights), points in sorted(curves.items()):
    points.sort()  # order the points by n_neighbors
    ks, scores = zip(*points)
    plt.plot(ks, scores, label="p={}, weights='{}'".format(p, weights))
plt.legend()
plt.show()
```
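The code first loads the Boston housing dataset and splits it 80/20 into training and test sets. It then defines the hyperparameter search space (`n_neighbors` from 1 to 20, `weights` of `'uniform'` or `'distance'`, and Minkowski power `p` of 1, 2, or 3), the KNN regression model, and the grid search object. After running a 5-fold grid search on the training set, it prints the best hyperparameter combination and the best cross-validation score. Finally, it uses Matplotlib to plot the mean cross-validation score of every hyperparameter combination, with one curve per (`p`, `weights`) pair.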
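The script above splits off a test set but never scores the tuned model on it, and `load_boston` is not available in scikit-learn 1.2 or later. The sketch below is one possible way to run the same search on a current release and check the result on the held-out data. It assumes `load_diabetes` as a stand-in dataset (it is not the dataset used in the answer, so the scores are not comparable), and the variable name `test_r2` is illustrative only.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Assumption: load_diabetes replaces load_boston, which newer scikit-learn no longer ships
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Same search space as in the answer above
param_grid = {'n_neighbors': range(1, 21),
              'weights': ['uniform', 'distance'],
              'p': [1, 2, 3]}

# 5-fold grid search; refit=True (the default) retrains the best model on all training data
grid_search = GridSearchCV(KNeighborsRegressor(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Score the refitted best model on the held-out test set (R^2 for a regressor)
test_r2 = grid_search.score(X_test, y_test)

print("Best parameters: {}".format(grid_search.best_params_))
print("Best cross-validation score: {:.2f}".format(grid_search.best_score_))
print("Test set R^2: {:.2f}".format(test_r2))
```

Comparing the test score with `best_score_` gives a rough check on whether the cross-validation estimate was optimistic. KNN is distance-based, so on data with features on very different scales it may also be worth wrapping the regressor in a pipeline with a scaler before searching.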