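Use the sklearn library to classify the iris dataset with logistic regression, and compute the following multiclass evaluation metrics (in two ways): confusion matrix, accuracy, precision, recall, F1 score, and classification report. Also plot the PR curve and the ROC curve.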
Posted: 2023-08-18 10:06:15
Sure, I can help with this.
First, we import the required libraries and load the dataset:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# plot_precision_recall_curve and plot_roc_curve were removed in scikit-learn 1.2,
# so only the metric functions are imported here
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, classification_report
# Load the dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
X = iris.data
y = iris.target
# Split into training (70%) and test (30%) sets; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```
Next, we can train a logistic regression model and use it to classify the test set:
```python
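# Define the logistic regression model
# (max_iter is raised because the default of 100 can trigger a ConvergenceWarning on unscaled features)
logreg = LogisticRegression(max_iter=1000)
# Train the model
logreg.fit(X_train, y_train)
# Predict labels for the test set
y_pred = logreg.predict(X_test)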
```
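To make the prediction step more concrete, here is a minimal sketch of how the predicted labels relate to the class probabilities. It refits the same model so that it runs on its own; `max_iter=1000` and the variable names `proba`, `pred_from_proba`, and `labels_match` are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# max_iter=1000 is an assumption, chosen only to give the solver room to converge
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One probability per class for every test sample; each row sums to 1
proba = logreg.predict_proba(X_test)

# predict() returns the class with the highest probability
pred_from_proba = logreg.classes_[np.argmax(proba, axis=1)]
labels_match = np.array_equal(pred_from_proba, logreg.predict(X_test))

print(proba[:3].round(3))
print("argmax of probabilities matches predict():", labels_match)
```

These probabilities are also what the PR and ROC curves are built from, since both curves need a score per class rather than a hard label.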
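Then, we can compute the required multiclass evaluation metrics. Precision, recall, and F1 use `average='macro'`, the unweighted mean of the per-class scores: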
```python
# Confusion matrix (rows: true class, columns: predicted class)
cm = confusion_matrix(y_test, y_pred)
print("Confusion matrix:")
print(cm)
# Accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy: {:.2f}%".format(acc*100))
# Macro-averaged precision
prec = precision_score(y_test, y_pred, average='macro')
print("Precision: {:.2f}%".format(prec*100))
# Macro-averaged recall
rec = recall_score(y_test, y_pred, average='macro')
print("Recall: {:.2f}%".format(rec*100))
# Macro-averaged F1 score
f1 = f1_score(y_test, y_pred, average='macro')
print("F1 score: {:.2f}%".format(f1*100))
# Classification report (per-class precision, recall, F1, and support)
report = classification_report(y_test, y_pred)
print("Classification report:")
print(report)
```
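The question asks for the metrics "in two ways", while the block above only uses sklearn's metric functions. One possible reading, and it is only an assumption about what the second way means, is to derive the same numbers by hand from the confusion matrix. The sketch below does that with NumPy; the helper `safe_div` is a hypothetical name introduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# max_iter=1000 is an assumption, chosen only to give the solver room to converge
y_pred = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
tp = np.diag(cm).astype(float)         # correctly predicted samples per class
fp = cm.sum(axis=0) - tp               # predicted as the class but actually another
fn = cm.sum(axis=1) - tp               # belong to the class but predicted as another


def safe_div(num, den):
    """Element-wise num / den, returning 0 where den is 0 (hypothetical helper)."""
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den != 0)


precision = safe_div(tp, tp + fp)      # per-class precision
recall = safe_div(tp, tp + fn)         # per-class recall
f1 = safe_div(2 * precision * recall, precision + recall)
accuracy = tp.sum() / cm.sum()

# Macro averages are the unweighted means of the per-class values
print("Accuracy  manual vs sklearn:", accuracy, accuracy_score(y_test, y_pred))
print("Precision manual vs sklearn:", precision.mean(), precision_score(y_test, y_pred, average="macro"))
print("Recall    manual vs sklearn:", recall.mean(), recall_score(y_test, y_pred, average="macro"))
print("F1        manual vs sklearn:", f1.mean(), f1_score(y_test, y_pred, average="macro"))
```

Each pair of printed values should agree, which is a quick check that the confusion matrix is being read in the right orientation.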
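Finally, we can plot the PR and ROC curves. `plot_precision_recall_curve` and `plot_roc_curve` only supported binary classifiers and were removed in scikit-learn 1.2, so for this three-class problem we draw one one-vs-rest curve per class from the predicted probabilities, using `PrecisionRecallDisplay` and `RocCurveDisplay`:
```python
import matplotlib.pyplot as plt
from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay
y_score = logreg.predict_proba(X_test)  # class probabilities, one column per class
fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(12, 5))
for i, name in enumerate(iris.target_names):  # treat class i as positive, the rest as negative
    PrecisionRecallDisplay.from_predictions((y_test == i).astype(int), y_score[:, i], name=name, ax=ax_pr)
    RocCurveDisplay.from_predictions((y_test == i).astype(int), y_score[:, i], name=name, ax=ax_roc)
plt.show()
```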
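The curves can also be summarized as numbers. The following sketch, offered as an optional addition rather than part of the original answer, computes the one-vs-rest ROC AUC and the average precision (area under the PR curve) per class. It refits the model so that it runs on its own; `max_iter=1000` and the variable names are assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# max_iter=1000 is an assumption, chosen only to give the solver room to converge
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_score = logreg.predict_proba(X_test)                   # shape (n_test, 3)
y_bin = label_binarize(y_test, classes=logreg.classes_)  # one-hot true labels, shape (n_test, 3)

# Macro-averaged one-vs-rest ROC AUC computed directly from the multiclass labels
macro_auc = roc_auc_score(y_test, y_score, multi_class="ovr", average="macro")

# The same quantity per class: class i against all other classes
per_class_auc = np.array([roc_auc_score(y_bin[:, i], y_score[:, i])
                          for i in range(y_bin.shape[1])])

# Average precision per class summarizes each one-vs-rest PR curve
per_class_ap = average_precision_score(y_bin, y_score, average=None)

print("Macro ROC AUC (ovr):", macro_auc)
print("Per-class ROC AUC:", per_class_auc)
print("Per-class average precision:", per_class_ap)
```

The macro AUC is the plain mean of the per-class values, so it can hide a single weak class; printing both makes that visible.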
The complete code is as follows:
```python
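import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# plot_precision_recall_curve and plot_roc_curve were removed in scikit-learn 1.2;
# the Display classes replace them
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, classification_report, PrecisionRecallDisplay, RocCurveDisplay
# Load the dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
X = iris.data
y = iris.target
# Split into training (70%) and test (30%) sets; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Define the logistic regression model
# (max_iter is raised because the default of 100 can trigger a ConvergenceWarning on unscaled features)
logreg = LogisticRegression(max_iter=1000)
# Train the model
logreg.fit(X_train, y_train)
# Predict labels for the test set
y_pred = logreg.predict(X_test)
# Confusion matrix (rows: true class, columns: predicted class)
cm = confusion_matrix(y_test, y_pred)
print("Confusion matrix:")
print(cm)
# Accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy: {:.2f}%".format(acc*100))
# Macro-averaged precision
prec = precision_score(y_test, y_pred, average='macro')
print("Precision: {:.2f}%".format(prec*100))
# Macro-averaged recall
rec = recall_score(y_test, y_pred, average='macro')
print("Recall: {:.2f}%".format(rec*100))
# Macro-averaged F1 score
f1 = f1_score(y_test, y_pred, average='macro')
print("F1 score: {:.2f}%".format(f1*100))
# Classification report (per-class precision, recall, F1, and support)
report = classification_report(y_test, y_pred)
print("Classification report:")
print(report)
# Plot one-vs-rest PR and ROC curves, one per class, from the predicted probabilities
y_score = logreg.predict_proba(X_test)
fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(12, 5))
for i, name in enumerate(iris.target_names):
    PrecisionRecallDisplay.from_predictions((y_test == i).astype(int), y_score[:, i], name=name, ax=ax_pr)
    RocCurveDisplay.from_predictions((y_test == i).astype(int), y_score[:, i], name=name, ax=ax_roc)
plt.show()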
```
Hope this helps!