Hyperparameter tuning for a random forest regression task in Python
Posted: 2024-09-28 22:09:33
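In Python, the `GridSearchCV` tool from scikit-learn can tune the hyperparameters of a random forest regressor. First install the required libraries: `scikit-learn` (imported as `sklearn`), `numpy`, and `pandas`. The steps are as follows: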
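1. Import the libraries and the model:
```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
# load_boston was removed in scikit-learn 1.2, so the bundled
# diabetes regression dataset is used here instead.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
```
2. Load the dataset:
```python
# X: feature matrix, y: continuous regression target
X, y = load_diabetes(return_X_y=True)
```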
3. Split the data into training and test sets:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
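4. Define the parameter grid (the hyperparameter search space):
```python
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}
```
The grid covers the number of trees (`n_estimators`), the maximum tree depth (`max_depth`, where `None` means unlimited), the minimum number of samples required to split a node (`min_samples_split`), and the minimum number of samples required at a leaf (`min_samples_leaf`).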
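Because a grid search tries every combination, it helps to count the candidates before fitting. The sketch below uses `ParameterGrid` from scikit-learn to enumerate the grid above; the variable names `n_candidates` and `n_fits` are illustrative, and the fold count of 5 is an assumption matching the `cv=5` setting used in the search step.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}

# Every combination of the listed values is one candidate: 3 * 4 * 3 * 3.
n_candidates = len(ParameterGrid(param_grid))

# Each candidate is fitted once per fold (assuming 5-fold cross-validation).
n_fits = n_candidates * 5

print(n_candidates, n_fits)  # 108 540
```

If that many fits is too slow, shrinking one of the value lists reduces the cost multiplicatively.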
5. Create the random forest regressor and run the grid search:
```python
rf_reg = RandomForestRegressor(random_state=42)
grid_search = GridSearchCV(rf_reg, param_grid, cv=5, scoring='r2', return_train_score=True)  # 5-fold cross-validation
grid_search.fit(X_train, y_train)
```
6. Inspect the results:
```python
best_params = grid_search.best_params_
best_score = grid_search.best_score_
print(f"Best parameters: {best_params}")
print(f"Best score (R^2): {best_score}")
```
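Beyond the single best combination, the fitted search object keeps per-candidate scores in `cv_results_`. The following is a minimal, self-contained sketch of ranking all candidates with pandas. To keep it quick it uses a deliberately small grid (`small_grid`, an illustrative name) and 3 folds on the bundled diabetes dataset, which are assumptions rather than the settings from the steps above.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

# Small illustrative grid so the example runs in a few seconds.
small_grid = {'n_estimators': [10, 20], 'max_depth': [3, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    small_grid,
    cv=3,
    scoring='r2',
    return_train_score=True,  # needed for the mean_train_score column
)
search.fit(X, y)

# cv_results_ is a dict of arrays with one entry per candidate.
results = pd.DataFrame(search.cv_results_)
columns = ['params', 'mean_train_score', 'mean_test_score',
           'std_test_score', 'rank_test_score']
ranked = results[columns].sort_values('rank_test_score')
print(ranked.to_string(index=False))
```

A large gap between `mean_train_score` and `mean_test_score` for a candidate suggests overfitting, which is the main reason to set `return_train_score=True`.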
7. Predict on the test set:
```python
grid_search_best_model = grid_search.best_estimator_
predictions = grid_search_best_model.predict(X_test)
```
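The cross-validated score is computed on the training data only, so it is worth checking the tuned model on the held-out test set as well. The sketch below is one way to do that, assuming the bundled diabetes dataset and a small illustrative grid (`small_grid`) so that it runs quickly; the metrics chosen here (R^2 and RMSE) are a suggestion, not part of the original steps.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid; a real search would use a wider one.
small_grid = {'n_estimators': [20, 50], 'min_samples_leaf': [1, 4]}
grid_search = GridSearchCV(
    RandomForestRegressor(random_state=42), small_grid, cv=3, scoring='r2'
)
grid_search.fit(X_train, y_train)

# best_estimator_ has been refit on the full training set (refit=True by default).
predictions = grid_search.best_estimator_.predict(X_test)

test_r2 = r2_score(y_test, predictions)
test_rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"Test R^2: {test_r2:.3f}")
print(f"Test RMSE: {test_rmse:.3f}")
```

With the default `refit=True`, calling `grid_search.predict(X_test)` directly gives the same predictions as going through `best_estimator_`.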