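Use scikit-learn to classify the iris dataset with logistic regression, and compute the following multiclass evaluation metrics (in two ways): confusion matrix, accuracy, precision, recall, F1 score, and classification report. Also plot the PR and ROC curves.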
Posted: 2023-08-18 19:06:15
Sure, here is one way to approach it.
First, load the iris dataset:
```python
from sklearn.datasets import load_iris
data = load_iris()
X = data.data
y = data.target
```
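Before modelling, it can help to confirm what was loaded. The following is a minimal sketch that inspects the shape, the feature and class names, and the class balance; the variable name `class_counts` is introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

# 150 samples with 4 numeric features each
print(X.shape)
# Names of the four measurements and of the three species
print(data.feature_names)
print(data.target_names)
# Number of samples per class (labels are 0, 1, 2)
class_counts = np.bincount(y)
print(class_counts)
```

The classes are balanced, which is why accuracy is a reasonable headline metric for this dataset.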
Next, split the data into training and test sets, then fit a logistic regression classifier and predict on the test set:
```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
```
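As an optional variation (not part of the original answer), the split can be stratified so that every class is equally represented in the small test set, and `predict_proba` can be used to obtain the class probabilities that PR and ROC curves are built from. This is a sketch under those assumptions; `stratify=y` and the name `proba` are choices made here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions identical in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Hard labels for the confusion matrix and related metrics
y_pred = clf.predict(X_test)
# Per-class probabilities, one column per class, used for threshold-based curves
proba = clf.predict_proba(X_test)

print(np.bincount(y_test))   # test samples per class
print(proba[:3].round(3))    # each row sums to 1
```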
The evaluation metrics can now be computed in two ways:
### Method 1: using sklearn.metrics
```python
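import matplotlib.pyplot as plt
from sklearn.metrics import (
    confusion_matrix, accuracy_score, precision_score, recall_score, f1_score,
    classification_report, PrecisionRecallDisplay, RocCurveDisplay,
)
from sklearn.preprocessing import label_binarize
# Confusion matrix (rows: true class, columns: predicted class)
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(cm)
# Accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)
# Precision, averaged over classes weighted by support
prec = precision_score(y_test, y_pred, average='weighted')
print("Precision:", prec)
# Recall, averaged over classes weighted by support
rec = recall_score(y_test, y_pred, average='weighted')
print("Recall:", rec)
# F1 score, averaged over classes weighted by support
f1 = f1_score(y_test, y_pred, average='weighted')
print("F1 Score:", f1)
# Classification report
report = classification_report(y_test, y_pred)
print("Classification Report:")
print(report)
# PR and ROC curves. scikit-learn's curve displays are binary, so each class is
# plotted one-vs-rest from the predicted probabilities (requires scikit-learn >= 1.0).
y_score = clf.predict_proba(X_test)
y_test_bin = label_binarize(y_test, classes=clf.classes_)
fig, (ax_pr, ax_roc) = plt.subplots(1, 2, figsize=(12, 5))
for i, name in enumerate(data.target_names):
    PrecisionRecallDisplay.from_predictions(y_test_bin[:, i], y_score[:, i], name=name, ax=ax_pr)
    RocCurveDisplay.from_predictions(y_test_bin[:, i], y_score[:, i], name=name, ax=ax_roc)
ax_pr.set_title("Precision-Recall curves (one-vs-rest)")
ax_roc.set_title("ROC curves (one-vs-rest)")
plt.show()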
```
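The `average` argument decides how per-class scores are combined, which matters when comparing against a manual computation. The sketch below, which retrains the same model so that it runs on its own, shows the per-class values alongside the macro and weighted averages, plus a one-vs-rest AUC as a numeric summary of the ROC curves. Using `precision_recall_fscore_support` and `roc_auc_score` here is a choice made for this example, not something the original answer does.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# average=None returns one value per class instead of a single number
p, r, f, support = precision_recall_fscore_support(
    y_test, y_pred, average=None, zero_division=0
)
print("per-class precision:", p)
print("per-class recall:   ", r)
print("per-class F1:       ", f)
print("support:            ", support)

# Macro: plain mean over classes. Weighted: mean weighted by class support.
macro = precision_recall_fscore_support(y_test, y_pred, average="macro", zero_division=0)
weighted = precision_recall_fscore_support(y_test, y_pred, average="weighted", zero_division=0)
print("macro    P/R/F1:", macro[:3])
print("weighted P/R/F1:", weighted[:3])

# One-vs-rest ROC AUC from class probabilities, macro-averaged over classes
auc_ovr = roc_auc_score(y_test, clf.predict_proba(X_test), multi_class="ovr", average="macro")
print("macro OvR ROC AUC:", auc_ovr)
```

Macro and weighted averages coincide only when every class has the same support, so it is worth stating which one a reported number uses.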
### Method 2: manual computation
```python
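import numpy as np
# Confusion matrix (rows: true class, columns: predicted class)
n_classes = len(np.unique(y))
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(y_test, y_pred):
    cm[t, p] += 1
print("Confusion Matrix:")
print(cm)
# Accuracy: correct predictions lie on the diagonal
n_total = cm.sum()
acc = np.trace(cm) / n_total
print("Accuracy:", acc)
# Per-class precision, recall and F1 score
precisions = []
recalls = []
f1s = []
for i in range(n_classes):
    tp = cm[i, i]
    fp = cm[:, i].sum() - tp
    fn = cm[i, :].sum() - tp
    # Report 0 when a denominator is empty instead of dividing by zero
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1_i = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0
    precisions.append(precision)
    recalls.append(recall)
    f1s.append(f1_i)
# Support: number of true samples per class
support = cm.sum(axis=1)
# Macro average: plain mean over classes
macro_p, macro_r, macro_f = np.mean(precisions), np.mean(recalls), np.mean(f1s)
# Weighted average: matches average='weighted' as used with sklearn.metrics
prec = np.average(precisions, weights=support)
rec = np.average(recalls, weights=support)
f1 = np.average(f1s, weights=support)
print("Precision:", prec)
print("Recall:", rec)
print("F1 Score:", f1)
# Classification report
report = f"{'':>12} {'precision':>9} {'recall':>9} {'f1-score':>9} {'support':>9}\n\n"
for i in range(n_classes):
    report += f"{i:>12} {precisions[i]:>9.4f} {recalls[i]:>9.4f} {f1s[i]:>9.4f} {support[i]:>9}\n"
report += f"\n{'accuracy':>12} {'':>9} {'':>9} {acc:>9.4f} {n_total:>9}\n"
report += f"{'macro avg':>12} {macro_p:>9.4f} {macro_r:>9.4f} {macro_f:>9.4f} {n_total:>9}\n"
report += f"{'weighted avg':>12} {prec:>9.4f} {rec:>9.4f} {f1:>9.4f} {n_total:>9}\n"
print("Classification Report:")
print(report)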
```
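The manual approach can be extended to the ROC curve as well. The following is a rough sketch, assuming a one-vs-rest treatment of each class: it sweeps the decision threshold over the predicted probabilities, collects the (FPR, TPR) points, and integrates them with the trapezoidal rule. The helpers `manual_roc` and `trapezoid_area` are hypothetical names written for this example; `roc_auc_score` is printed next to the manual value only as a cross-check.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def manual_roc(y_true_bin, scores):
    """Return (fpr, tpr) for a binary problem, sweeping thresholds from high to low."""
    thresholds = np.unique(scores)[::-1]
    n_pos = np.sum(y_true_bin == 1)
    n_neg = np.sum(y_true_bin == 0)
    fpr, tpr = [0.0], [0.0]  # start at the origin: nothing predicted positive
    for t in thresholds:
        predicted_pos = scores >= t
        tpr.append(np.sum(predicted_pos & (y_true_bin == 1)) / n_pos)
        fpr.append(np.sum(predicted_pos & (y_true_bin == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x, y)."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_score = clf.predict_proba(X_test)

aucs = {}
for i, cls in enumerate(clf.classes_):
    # One-vs-rest: the current class is positive, all others negative
    y_bin = (y_test == cls).astype(int)
    fpr, tpr = manual_roc(y_bin, y_score[:, i])
    aucs[int(cls)] = trapezoid_area(fpr, tpr)
    reference = roc_auc_score(y_bin, y_score[:, i])
    print(f"class {cls}: manual AUC = {aucs[int(cls)]:.4f}, sklearn AUC = {reference:.4f}")
```

A PR curve can be built the same way by recording precision and recall at each threshold instead of TPR and FPR.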
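That covers classifying the iris data with scikit-learn's logistic regression and computing the multiclass evaluation metrics, both with sklearn.metrics and by hand.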