Using the sklearn iris dataset, write code to train an AdaBoost model and report classification metrics such as accuracy, recall, precision, F1, and AUC
Time: 2024-11-25 11:32:09
The following code trains an AdaBoost model on the iris dataset from `sklearn`:
```python
# Import the required libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score, roc_auc_score, classification_report, confusion_matrix
# Load the data
iris = load_iris()
X = iris.data
y = iris.target
# Split into training and test sets (assuming no standardization is needed here)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create the AdaBoost model
ada_clf = AdaBoostClassifier(n_estimators=100, learning_rate=1, random_state=42)
# Train the model
ada_clf.fit(X_train, y_train)
# Predict class labels for the test set
y_pred = ada_clf.predict(X_test)
# Compute the evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred, average='macro')
precision = precision_score(y_test, y_pred, average='macro')
# Use the same macro averaging as recall and precision so the metrics are comparable
f1 = f1_score(y_test, y_pred, average='macro')
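# AUC-ROC: iris has three classes, so pass the full probability matrix
# and pick a multiclass strategy (one-vs-rest here, macro-averaged)
roc_auc = roc_auc_score(y_test, ada_clf.predict_proba(X_test), multi_class='ovr', average='macro')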
# Print the classification report
print("Classification Report:")
print(classification_report(y_test, y_pred))
# Print the confusion matrix
conf_mat = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:\n", conf_mat)
print(f"Accuracy: {accuracy*100:.2f}%")
print(f"Recall: {recall*100:.2f}%")
print(f"Precision: {precision*100:.2f}%")
print(f"F1 Score: {f1*100:.2f}%")
print(f"AUC-ROC: {roc_auc:.4f}")
```
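This example sets `n_estimators=100` and `learning_rate=1` on `AdaBoostClassifier` (the library defaults are 50 and 1.0); both can be adjusted as needed. Note that `roc_auc_score` needs probability scores, so `predict_proba` is used rather than `predict`. Because iris has three classes, the full probability matrix must be passed together with `multi_class='ovr'` or `multi_class='ovo'`; passing a single probability column only works for binary problems and raises an error here.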
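One possible way to adjust `n_estimators` and `learning_rate`, as mentioned above, is a small cross-validated grid search. The sketch below is only an illustration: the grid values, the 5-fold setting, the stratified split, and the choice of `roc_auc_ovr` as the selection metric are all assumptions, not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Load iris and hold out 30% for testing (stratified split is an assumption)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Hypothetical search grid; the values are illustrative, not recommendations
param_grid = {
    "n_estimators": [25, 50, 100],
    "learning_rate": [0.1, 0.5, 1.0],
}

# Select the combination with the best one-vs-rest AUC under 5-fold CV
search = GridSearchCV(
    AdaBoostClassifier(random_state=42),
    param_grid,
    scoring="roc_auc_ovr",
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test set
best_params = search.best_params_
proba = search.predict_proba(X_test)
test_auc = roc_auc_score(y_test, proba, multi_class="ovr")

print("Best parameters:", best_params)
print(f"Cross-validated AUC (OvR): {search.best_score_:.4f}")
print(f"Test AUC (OvR): {test_auc:.4f}")
```

Iris is small and easy to separate, so several parameter combinations will likely score almost the same; the search matters more on harder datasets.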