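Evaluating logistic regression predictions on data read with pandas: code for plotting a binary confusion matrix and computing precision, recall, F1-score, the ROC curve, and AUC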
Posted: 2023-12-24 14:06:07
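Here is an example, based on pandas and scikit-learn, that evaluates a logistic regression model. It plots the binary confusion matrix, computes precision, recall, and F1-score, and draws the ROC curve with its AUC: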
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, roc_curve, roc_auc_score
# 1. Load the data and split it into features, labels, and train/test sets
data = pd.read_csv('data.csv')
X = data.drop('target', axis=1)
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2. Train the model
model = LogisticRegression(max_iter=1000)  # above the default of 100, so the solver is less likely to stop before converging
model.fit(X_train, y_train)
# 3. Predict class labels and the probability of the positive class (label 1)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]
# 4. Plot the binary confusion matrix (rows: true labels, columns: predicted labels)
confusion_mat = confusion_matrix(y_test, y_pred)
plt.imshow(confusion_mat, cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.colorbar()
tick_marks = np.arange(2)
plt.xticks(tick_marks, ['0', '1'])
plt.yticks(tick_marks, ['0', '1'])
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
# Write each count into its cell; use white text on dark cells for readability
for i in range(2):
    for j in range(2):
        plt.text(j, i, confusion_mat[i, j], ha='center', va='center',
                 color='white' if confusion_mat[i, j] > confusion_mat.max() / 2 else 'black')
plt.show()
# 5. Compute precision, recall, and F1-score (positive class is label 1 by default)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
# 6. Compute the AUC and plot the ROC curve from the predicted probabilities
auc = roc_auc_score(y_test, y_prob)
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc="lower right")
plt.show()
# 7. Print the results
print('Precision:', precision)
print('Recall:', recall)
print('F1-score:', f1)
print('AUC:', auc)
```
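This code reads a dataset named "data.csv" with pandas, treats the `target` column as the label, and holds out 20% of the rows as a test set. It trains and predicts with scikit-learn's `LogisticRegression`, then uses scikit-learn's metric functions to compute precision, recall, F1-score, and AUC on the test set, and plots the binary confusion matrix and the ROC curve. Precision, recall, and F1-score are computed from the hard predictions in `y_pred`, while the ROC curve and AUC use the predicted probabilities in `y_prob`. The code is only an example: the file name, the label column, and the assumption that the labels are 0 and 1 need to be adapted to your own data.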
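To make the link between the confusion matrix and the scores concrete, the following minimal sketch derives precision, recall, and F1-score by hand from the four cells of the matrix and compares them with scikit-learn's functions. The label arrays `y_true` and `y_hat` are made-up illustration data, not taken from "data.csv", and the `*_manual` names are hypothetical.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Hypothetical labels for illustration only (not from data.csv)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_hat = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0])

# For binary labels 0/1, ravel() returns the cells in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

# Precision: of the samples predicted positive, the fraction that are truly positive
precision_manual = tp / (tp + fp)
# Recall: of the truly positive samples, the fraction that were predicted positive
recall_manual = tp / (tp + fn)
# F1-score: harmonic mean of precision and recall
f1_manual = 2 * precision_manual * recall_manual / (precision_manual + recall_manual)

print('Manual :', precision_manual, recall_manual, f1_manual)
print('sklearn:', precision_score(y_true, y_hat),
      recall_score(y_true, y_hat), f1_score(y_true, y_hat))
```

The two printed lines should agree, which is a quick sanity check that the plotted matrix and the reported scores describe the same predictions.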