Example program for multi-input multi-output regression modeling with CatBoost and Optuna
Date: 2024-10-01 10:08:09
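The general steps for multi-input multi-output regression (multi-output regression) with CatBoost and Optuna are as follows:
First, install the required libraries, including `catboost`, `optuna`, and `scikit-learn`. Below is a simple Python example: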
```python
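import catboost as cb
import numpy as np
import optuna
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Create a multi-output dataset. scikit-learn has no make_multivariate_regression;
# make_regression with n_targets > 1 returns y with shape (n_samples, n_targets).
# n_targets=2 is an assumed value: the original does not state the number of outputs.
X, y = make_regression(
    n_samples=1000, n_features=5, n_informative=3, n_targets=2, random_state=42
)

# Split into training and test sets, then hold out part of the training data
# for validation so that tuning and early stopping never see the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)


def objective(trial: optuna.trial.Trial) -> float:
    # Define the hyperparameter search space
    learning_rate = trial.suggest_float("learning_rate", 0.01, 0.1, log=True)
    depth = trial.suggest_int("depth", 3, 10)
    model = cb.CatBoostRegressor(
        iterations=100,
        learning_rate=learning_rate,
        depth=depth,
        loss_function="MultiRMSE",  # plain RMSE only supports a single target
        eval_metric="MultiRMSE",
        random_seed=trial.number,  # a different random seed for each trial
        verbose=False,
    )
    # Train the model; use_best_model and early stopping require an eval_set
    model.fit(
        X_tr,
        y_tr,
        eval_set=(X_val, y_val),
        use_best_model=True,
        early_stopping_rounds=20,
    )
    # Evaluate on the validation set and report the result
    predictions = model.predict(X_val)
    val_rmse = np.sqrt(np.mean((predictions - y_val) ** 2))
    return float(val_rmse)  # the goal is to minimize RMSE


# Create the optimization study
study = optuna.create_study(direction="minimize")
# Run the optimization
study.optimize(objective, n_trials=50)
# Best parameters
best_params = study.best_params
# Train the final model with the best parameters on the full training set,
# keeping the same iteration count and loss as during tuning
final_model = cb.CatBoostRegressor(
    **best_params, iterations=100, loss_function="MultiRMSE", verbose=False
)
final_model.fit(X_train, y_train)
# Evaluate model performance on the test set
test_predictions = final_model.predict(X_test)
print("Best parameters:", best_params)
print("Final RMSE on test set:", np.sqrt(np.mean((test_predictions - y_test) ** 2)))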
```
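The RMSE printed by the script is pooled over every output column, so one poorly predicted target can hide behind the others. A per-output breakdown can be sketched as follows. The helper name `per_output_rmse` and the small hand-made arrays are assumptions made for illustration; they are not part of CatBoost, Optuna, or the original example.

```python
import numpy as np


def per_output_rmse(y_true, y_pred):
    """Return one RMSE value per output column (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Average squared errors over samples (axis 0) only, keeping outputs separate
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Hand-made data: the first output is predicted exactly, the second is always off by 2
y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y_pred = np.array([[1.0, 12.0], [2.0, 22.0], [3.0, 32.0]])

rmse_each = per_output_rmse(y_true, y_pred)
rmse_pooled = np.sqrt(np.mean((y_true - y_pred) ** 2))
print("RMSE per output:", rmse_each)
print("Pooled RMSE:", rmse_pooled)
```

Assuming the predictions are a 2D array with one column per target, calling `per_output_rmse(y_test, test_predictions)` would show which targets contribute most to the pooled error.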