
Using grid search to find the best hyperparameters for a random forest regression model, with code

Date: 2024-11-03 10:20:45

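Grid search is a common way to tune a machine learning model: it exhaustively evaluates every combination in a predefined set of hyperparameter values and keeps the one that scores best. For a random forest regression model (`RandomForestRegressor`), this can be done with the `GridSearchCV` class from Python's sklearn library. The simple example below searches over two common hyperparameters of `RandomForestRegressor`: `n_estimators` (the number of trees) and `max_depth` (the maximum depth of each tree). Note that `load_boston` was removed in scikit-learn 1.2, so the bundled diabetes dataset is used here in place of the Boston housing data.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Load a regression dataset (load_boston was removed in scikit-learn 1.2,
# so the bundled diabetes dataset stands in for the Boston housing data)
X, y = load_diabetes(return_X_y=True)

# Define the parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30]
}

# Create the random forest regression model (fixed seed for reproducibility)
rf_reg = RandomForestRegressor(random_state=42)

# Set up GridSearchCV: cv=5 means 5-fold cross-validation,
# scoring='r2' selects R^2 as the evaluation metric
grid_search = GridSearchCV(rf_reg, param_grid, cv=5, scoring='r2')

# Evaluate every parameter combination and find the best one
grid_search.fit(X, y)

# Print the best parameters and their mean cross-validated score
best_params = grid_search.best_params_
best_score = grid_search.best_score_
print(f"Best parameters: {best_params}")
print(f"Best R^2 score: {best_score}")

# Retrieve the best model, refit on the full dataset
best_rf_reg = grid_search.best_estimator_
```

In this example, `GridSearchCV` evaluates all 12 combinations in `param_grid`. For each combination it trains one random forest per cross-validation fold (60 fits in total with `cv=5`) and averages the R^2 score across the folds. `best_params_` and `best_score_` hold the combination with the highest mean cross-validated score. Because `refit=True` by default, `best_estimator_` is a model with those parameters retrained on all of `X` and `y`.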
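The score reported by `best_score_` is a cross-validated estimate computed on the same data that drove the search, so it can be somewhat optimistic. The sketch below shows one way to check it: hold out a test set before searching, then list every grid point from `cv_results_` in rank order. It is a minimal illustration under stated assumptions, not part of the original answer: the diabetes dataset, the reduced grid, the 80/20 split, and the fixed `random_state` values are all choices made here to keep the example quick and reproducible.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: the bundled diabetes dataset is used as stand-in regression data
X, y = load_diabetes(return_X_y=True)

# Hold out 20% of the data; the grid search never sees it
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumption: a deliberately small grid so the example runs quickly
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="r2",
)
search.fit(X_train, y_train)

# cv_results_ holds one entry per parameter combination
results = search.cv_results_
for i in results["rank_test_score"].argsort():
    print(
        f"rank {results['rank_test_score'][i]}: {results['params'][i]} "
        f"mean R^2 = {results['mean_test_score'][i]:.3f} "
        f"(std {results['std_test_score'][i]:.3f})"
    )

# score() applies the chosen scorer (R^2) to the refit best estimator
test_r2 = search.score(X_test, y_test)
print(f"Cross-validated R^2 of best combination: {search.best_score_:.3f}")
print(f"R^2 on the held-out test set: {test_r2:.3f}")
```

Comparing the mean scores against their standard deviations helps show whether the top-ranked combination is clearly better or within noise of the others. If the differences are small, a cheaper setting (fewer trees, shallower depth) may be a reasonable choice.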
