Finding the best random forest regression parameters in Python, using minimum RMSE as the criterion, and visualizing the results


Date: 2024-05-14 22:15:15

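The following example code uses Python to find the best parameter values for random forest regression:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Load the data
data = pd.read_csv('data.csv')
X = data.drop('target', axis=1)
y = data['target']

# Hold out a test set so the reported RMSE is not measured on training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the parameter grid
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [3, 5, 7],
    'min_samples_split': [2, 4, 6],
    'min_samples_leaf': [1, 2, 4],
}

# Define the model
rf = RandomForestRegressor(random_state=42)

# Find the best parameters with GridSearchCV. scikit-learn maximizes the score,
# so minimizing RMSE means maximizing the negated RMSE.
grid_search = GridSearchCV(rf, param_grid, cv=5, scoring='neg_root_mean_squared_error')
grid_search.fit(X_train, y_train)
best_params = grid_search.best_params_
print('Best parameters:', best_params)
print('Cross-validated RMSE:', -grid_search.best_score_)

# GridSearchCV refits the best model on the full training set by default (refit=True)
best_rf = grid_search.best_estimator_

# Predict on the held-out test set and compute RMSE
y_pred = best_rf.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print('Test RMSE:', rmse)

# Visualize predicted values against true values
plt.scatter(y_test, y_pred)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
plt.plot(lims, lims, '--k')  # the line y = x, where predictions are perfect
plt.xlabel('True value')
plt.ylabel('Predicted value')
plt.title('Random forest regression: predicted vs. true')
plt.show()
```

The example loads the data, holds out 20% of it as a test set, and defines the parameter grid for the random forest regressor. GridSearchCV then evaluates every parameter combination with 5-fold cross-validation and selects the one with the lowest RMSE. The best model, refitted on the training set, predicts the test set, and the RMSE is reported there rather than on the training data, where it would be overly optimistic. Finally, matplotlib plots the predicted values against the true values, with the dashed line marking perfect predictions. Note that this grid has 3 × 3 × 3 × 3 = 81 combinations, so with 5 folds the search fits 405 models before the final refit. In practice a more sophisticated tuning strategy may be needed, such as random search or Bayesian optimization.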
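The question also asks for the search results themselves to be visualized, not only the predictions. One possible way to do that is to read `cv_results_` from the fitted search and plot the cross-validated RMSE per parameter combination. The sketch below is only an illustration: the helper names `search_rmse_table` and `plot_rmse_heatmap` are hypothetical, the data comes from `make_regression` as a stand-in for `data.csv`, and the grid is reduced to two parameters to keep the run short.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV


def search_rmse_table(X, y, param_grid, cv=5):
    """Run a grid search; return the fitted search and a table of CV RMSE per combination."""
    search = GridSearchCV(
        RandomForestRegressor(random_state=42),
        param_grid,
        cv=cv,
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X, y)
    table = pd.DataFrame(search.cv_results_["params"])
    # Scores are stored as negated RMSE; flip the sign to get RMSE back.
    table["rmse"] = -search.cv_results_["mean_test_score"]
    return search, table


def plot_rmse_heatmap(table, row="max_depth", col="n_estimators"):
    """Heatmap of the lowest CV RMSE for each (row, col) pair, minimized over other parameters."""
    pivot = table.pivot_table(index=row, columns=col, values="rmse", aggfunc="min")
    fig, ax = plt.subplots()
    image = ax.imshow(pivot.values, cmap="viridis_r", aspect="auto")
    ax.set_xticks(range(len(pivot.columns)))
    ax.set_xticklabels(pivot.columns)
    ax.set_yticks(range(len(pivot.index)))
    ax.set_yticklabels(pivot.index)
    ax.set_xlabel(col)
    ax.set_ylabel(row)
    ax.set_title("Cross-validated RMSE (lower is better)")
    fig.colorbar(image, ax=ax, label="RMSE")
    return fig, pivot


if __name__ == "__main__":
    # Synthetic stand-in data (an assumption); replace with your own X and y.
    X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
    grid = {"n_estimators": [50, 100], "max_depth": [3, 5, 7]}
    search, table = search_rmse_table(X, y, grid, cv=3)
    print(table.sort_values("rmse").head())
    plot_rmse_heatmap(table)
    plt.show()
```

With the full four-parameter grid, the heatmap shows for each (`max_depth`, `n_estimators`) cell the best RMSE reached over the remaining parameters. Passing other column names as `row` and `col` gives a view of a different pair.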

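The answer mentions random search as an alternative to an exhaustive grid. A minimal sketch with scikit-learn's `RandomizedSearchCV` might look like the following. The function name `random_search_rf` is hypothetical, the integer ranges are assumptions chosen to span the same values as the grid in the answer, and the demo data is synthetic.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV


def random_search_rf(X, y, n_iter=20, cv=5, seed=42):
    """Sample n_iter parameter combinations and keep the one with the lowest CV RMSE."""
    # randint(low, high) draws integers from low up to high - 1.
    param_distributions = {
        "n_estimators": randint(100, 301),    # 100..300
        "max_depth": randint(3, 8),           # 3..7
        "min_samples_split": randint(2, 7),   # 2..6
        "min_samples_leaf": randint(1, 5),    # 1..4
    }
    search = RandomizedSearchCV(
        RandomForestRegressor(random_state=seed),
        param_distributions,
        n_iter=n_iter,
        cv=cv,
        scoring="neg_root_mean_squared_error",
        random_state=seed,
    )
    search.fit(X, y)
    return search


if __name__ == "__main__":
    # Synthetic stand-in data (an assumption); replace with your own X and y.
    X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
    search = random_search_rf(X, y, n_iter=5, cv=3)
    print("Best parameters:", search.best_params_)
    print("Cross-validated RMSE:", -search.best_score_)
```

The cost here is set by `n_iter` rather than by the size of the grid, so the search budget stays fixed even when more parameters or wider ranges are added. The trade-off is that the best combination is not guaranteed to be sampled.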