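Use the sklearn library to classify the breast cancer dataset with logistic regression, and compute the following evaluation metrics: confusion matrix, accuracy, precision, recall, F1 score, and classification report.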
Posted: 2023-08-13 12:05:27 Views: 106
Sure, here is the code:
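```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    classification_report,
)

# Load the dataset (in this dataset, label 0 = malignant and 1 = benign)
data = load_breast_cancer()
X = data.data
y = data.target

# Split into training and test sets (70% / 30%, fixed seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Build and fit the logistic regression model
# (on these unscaled features the default solver may emit a ConvergenceWarning)
lr = LogisticRegression()
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Confusion matrix: rows are true labels, columns are predicted labels
cm = confusion_matrix(y_test, y_pred)
print("Confusion matrix:")
print(cm)

# Accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

# Precision (label 1 is treated as the positive class by default)
precision = precision_score(y_test, y_pred)
print("Precision:", precision)

# Recall
recall = recall_score(y_test, y_pred)
print("Recall:", recall)

# F1 score
f1 = f1_score(y_test, y_pred)
print("F1 score:", f1)

# Classification report (per-class precision, recall, F1, and support)
report = classification_report(y_test, y_pred)
print("Classification report:")
print(report)
```
Sample output is shown below. Exact values depend on the scikit-learn version, and the scalar metrics as posted do not exactly match the confusion matrix (its diagonal gives 159/171 ≈ 0.930 accuracy), so treat the figures as illustrative:
```
Confusion matrix:
[[ 56   7]
 [  5 103]]
Accuracy: 0.935672514619883
Precision: 0.9365079365079365
Recall: 0.9537037037037037
F1 score: 0.9450549450549452
Classification report:
              precision    recall  f1-score   support

           0       0.92      0.89      0.91        63
           1       0.94      0.95      0.95       108

    accuracy                           0.94       171
   macro avg       0.93      0.92      0.93       171
weighted avg       0.94      0.94      0.94       171
```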
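To make the relationship between the confusion matrix and the scalar metrics explicit, here is a minimal sketch that derives accuracy, precision, recall, and F1 from the four cells of a binary confusion matrix. It assumes sklearn's layout for labels `[0, 1]` (rows are true labels, columns are predicted labels, so the cells are `[[TN, FP], [FN, TP]]`). The helper name `metrics_from_confusion` and the counts in `example_cm` are made up for illustration and are not taken from the run above.

```python
def metrics_from_confusion(cm):
    """Derive binary metrics from a 2x2 confusion matrix laid out as [[TN, FP], [FN, TP]].

    Hypothetical helper for illustration; it does not guard against zero denominators.
    """
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total          # share of all samples predicted correctly
    precision = tp / (tp + fp)            # of predicted positives, how many are truly positive
    recall = tp / (tp + fn)               # of true positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (assumed, not from the breast cancer run)
example_cm = [[50, 10], [5, 35]]
metrics = metrics_from_confusion(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Passing the `cm` array returned by `confusion_matrix(y_test, y_pred)` to the same helper should reproduce the values from `accuracy_score`, `precision_score`, `recall_score`, and `f1_score`, which is a quick way to sanity-check a posted result.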
That is the complete code for classifying the breast cancer data with logistic regression and computing the common evaluation metrics.
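Two optional refinements may be worth considering; this is a sketch under stated assumptions rather than part of the original answer. First, standardizing the features with `StandardScaler` inside a pipeline usually helps the default solver converge. Second, because `load_breast_cancer` encodes malignant as 0 and benign as 1, the default `recall_score` reports recall for the benign class; passing `pos_label=0` reports it for malignant cases instead. The variable names `model` and `malignant_recall` are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

# Assumption: scale features before fitting. The pipeline fits the scaler on the
# training split only, so no test-set statistics leak into training.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print("Confusion matrix:")
print(cm)

# data.target_names is ['malignant', 'benign'], i.e. 0 = malignant, 1 = benign
print(classification_report(y_test, y_pred, target_names=data.target_names))

# Recall with malignant (label 0) treated as the positive class
malignant_recall = recall_score(y_test, y_pred, pos_label=0)
print("Recall (malignant as positive):", malignant_recall)
```

The numbers from this variant will differ from the unscaled model above, so the two runs should not be compared line by line.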