How to perform cross-validation when using GradientBoostingClassifier

Time: 2024-05-05 07:20:46 Views: 156

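You can use scikit-learn's GridSearchCV or RandomizedSearchCV, both of which tune hyperparameters by cross-validating each candidate setting. GridSearchCV searches the given parameter grid exhaustively, trying every combination once. RandomizedSearchCV instead draws a fixed number of candidates (set by `n_iter`) at random from the given parameter distributions. This lets it cover a wider range of values for the same budget, but it is not guaranteed to find the best combination in the search space. Here is an example that uses GridSearchCV:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

# Generate some classification data as an example
X, y = make_classification(n_samples=1000, random_state=42)

# Define the parameter grid to search (3 x 3 x 3 = 27 combinations)
param_grid = {
    'learning_rate': [0.1, 0.05, 0.01],
    'max_depth': [3, 5, 7],
    'n_estimators': [50, 100, 200]
}

# Create a GradientBoostingClassifier model
clf = GradientBoostingClassifier(random_state=42)

# Run the grid search with 5-fold cross-validation
grid_search = GridSearchCV(clf, param_grid=param_grid, cv=5)
grid_search.fit(X, y)

# Print the best parameter combination and its mean cross-validated score
print("Best parameter combination: ", grid_search.best_params_)
print("Best score: ", grid_search.best_score_)
```

For each of the 27 parameter combinations, this code splits the data into 5 folds, trains on 4 of them, and scores on the remaining one, rotating through all 5 folds. With an integer `cv` and a classifier, the folds are stratified by class. `best_params_` is the combination with the highest mean score across folds, and `best_score_` is that mean score (accuracy by default for a classifier). Because `refit=True` by default, the best combination is then retrained on the full dataset and is available as `grid_search.best_estimator_`.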
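The answer mentions RandomizedSearchCV but only shows GridSearchCV, so here is a minimal sketch of the randomized variant for comparison. The distribution ranges, `n_iter=5`, `cv=3`, and the smaller dataset are illustrative assumptions chosen to keep the run short. They are not recommended settings.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset (assumed size, chosen so the sketch runs quickly)
X, y = make_classification(n_samples=300, random_state=42)

# Distributions to sample from; the ranges are illustrative assumptions
param_distributions = {
    "learning_rate": uniform(0.01, 0.19),  # continuous on [0.01, 0.20]
    "max_depth": randint(2, 6),            # integers 2..5
    "n_estimators": randint(20, 81),       # integers 20..80
}

# Sample 5 candidates and score each one with 3-fold cross-validation
random_search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    random_state=42,
)
random_search.fit(X, y)

print("Best parameter combination:", random_search.best_params_)
print("Best score:", random_search.best_score_)
```

Unlike the grid search, the cost here is fixed by `n_iter` rather than by the size of the search space, so continuous ranges such as `learning_rate` can be explored without enumerating every value.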
