When tuning a clustering model with GridSearchCV, both scoring=silhouette and scoring = make_scorer(calinski_harabasz_score) fail with: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details: Traceback (most recent call last): File "D:\python\lib\site-packages\sklearn\model_selection\_validation.py", line 759, in _score scores = scorer(estimator, X_test) TypeError: `__call__()` missing 1 required positional argument: 'y_true'
Posted: 2024-03-20 15:45:19
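This error occurs because a scorer built with make_scorer wraps a metric of the form metric(y_true, y_pred), so it must be called with ground-truth labels. When grid.fit(X) is called without y, GridSearchCV invokes scorer(estimator, X_test) and y_true is missing. Neither silhouette_score nor calinski_harabasz_score takes y_true; both take the data X and the predicted cluster labels. The fix is to skip make_scorer and pass GridSearchCV a callable with the signature scorer(estimator, X, y=None) that computes the labels itself. One way to implement this:
```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, calinski_harabasz_score
from sklearn.model_selection import GridSearchCV

# Example data for illustration; replace with your own feature matrix X
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Scorers with the signature (estimator, X, y=None); no y_true is needed
def silhouette_scorer(estimator, X, y=None):
    return silhouette_score(X, estimator.predict(X))

def calinski_harabasz_scorer(estimator, X, y=None):
    return calinski_harabasz_score(X, estimator.predict(X))

# Parameters to tune and their candidate values
param_grid = {'n_clusters': [2, 3, 4, 5, 6]}
# Scorers to evaluate
scoring = {'Silhouette': silhouette_scorer,
           'Calinski_Harabasz': calinski_harabasz_scorer}
# Clustering model
model = KMeans(n_init=10, random_state=0)
# With several scorers, refit must name one of them for best_params_ to exist
grid = GridSearchCV(model, param_grid=param_grid, scoring=scoring, refit='Silhouette')
# Fit without labels
grid.fit(X)
# Print the best parameters according to the silhouette score
print(grid.best_params_)
```
In this example, the two custom scorers call silhouette_score and calinski_harabasz_score on each held-out fold, using the cluster labels predicted by the model fitted on the training folds. GridSearchCV evaluates both metrics for every value of n_clusters. Because more than one scorer is in use, refit has to name the metric that selects the best parameters; with refit=False, grid.best_params_ is not available. The last line prints the best parameters.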
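Silhouette and Calinski-Harabasz are internal metrics that need no held-out labels, so cross-validation is optional here. As a minimal sketch of an alternative, the loop below fits KMeans on the full data for each candidate and scores the resulting labels directly. The synthetic data from make_blobs, the `results` list, and the choice of silhouette as the selection metric are assumptions made for illustration, not part of the original question.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, silhouette_score
from sklearn.model_selection import ParameterGrid

# Synthetic data for illustration only; substitute your own feature matrix.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

param_grid = {"n_clusters": [2, 3, 4, 5, 6]}

results = []
for params in ParameterGrid(param_grid):
    # Fit on all of X and score the labels assigned to the same points.
    model = KMeans(n_init=10, random_state=0, **params)
    labels = model.fit_predict(X)
    results.append({
        "params": params,
        "silhouette": silhouette_score(X, labels),
        "calinski_harabasz": calinski_harabasz_score(X, labels),
    })

# Pick the candidate with the highest silhouette score (higher is better).
best = max(results, key=lambda r: r["silhouette"])
print(best["params"])
```

Both metrics are higher-is-better, so swapping the `key` to `r["calinski_harabasz"]` selects by the other criterion; the two may not agree on the same number of clusters.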