Calibration curve code for XGBoost in Python
Date: 2024-08-14 20:05:27
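Plotting a calibration curve for an XGBoost model in Python helps you assess how closely the model's predicted probabilities match the observed frequency of the positive class. You first need to install the xgboost library along with a few supporting tools, such as matplotlib and scikit-learn (sklearn). Here is a simple example: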
```python
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt
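# Assume you have a binary-classification DataFrame df (not defined here):
# every column except 'target' is a feature, and 'target' holds the labels
X = df.drop(columns=['target'])
y = df['target']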
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the XGBoost model
model = xgb.XGBClassifier()
model.fit(X_train, y_train)
# Predict and keep the probability of the positive class (column 1)
probs = model.predict_proba(X_test)[:, 1]  # predicted probabilities on the test set
# Convert the class labels to 0/1; ravel() flattens the (n, 1) result to 1-D
lb = LabelBinarizer()
y_test_bin = lb.fit_transform(y_test).ravel()
# Compute the calibration curve: observed positive rate and mean predicted probability per bin
fraction_of_positives, mean_predicted_value = calibration_curve(y_test_bin, probs, n_bins=10)
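# Plot the calibration curve against the diagonal of perfect calibration
plt.plot(mean_predicted_value, fraction_of_positives, marker='o', label='XGBoost')
plt.plot([0, 1], [0, 1], linestyle='--', label='Perfectly calibrated')
plt.xlabel('Mean Predicted Probability')
plt.ylabel('Fraction of Positives')
plt.title('Calibration Curve (XGBoost)')
plt.legend()
plt.show()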
```
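To make it clearer what `calibration_curve` returns, here is a minimal pure-Python sketch of the equal-width binning behind the curve. The helper name `reliability_bins` and the toy data are hypothetical, and its treatment of values that fall exactly on a bin edge may differ from scikit-learn's implementation, so treat it as an illustration and not as a replacement for the library call.

```python
def reliability_bins(y_true, y_prob, n_bins=10):
    """Return (fraction_of_positives, mean_predicted_value) per non-empty bin.

    Hypothetical helper for illustration; bins are equal-width over [0, 1].
    """
    label_sums = [0.0] * n_bins  # number of positives in each bin
    prob_sums = [0.0] * n_bins   # sum of predicted probabilities in each bin
    counts = [0] * n_bins        # number of samples in each bin

    for label, prob in zip(y_true, y_prob):
        # Map the probability to a bin index; prob == 1.0 goes into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        label_sums[idx] += label
        prob_sums[idx] += prob
        counts[idx] += 1

    fraction_of_positives = []
    mean_predicted_value = []
    for idx in range(n_bins):
        if counts[idx] == 0:
            continue  # empty bins are skipped, so fewer than n_bins points may come back
        fraction_of_positives.append(label_sums[idx] / counts[idx])
        mean_predicted_value.append(prob_sums[idx] / counts[idx])
    return fraction_of_positives, mean_predicted_value


# Toy labels and probabilities, made up for this example.
toy_labels = [0, 0, 1, 1, 1]
toy_probs = [0.05, 0.15, 0.12, 0.85, 0.95]
frac_pos, mean_pred = reliability_bins(toy_labels, toy_probs, n_bins=2)
print(frac_pos, mean_pred)
```

Each returned pair is one point on the plot. For a well-calibrated model the points lie close to the diagonal, where the observed positive rate in a bin equals the mean predicted probability in that bin.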