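Use the sklearn library to classify the breast cancer dataset with logistic regression, and compute the following performance metrics: confusion matrix, accuracy, precision, recall, F1 score, and classification report. `from sklearn.datasets import load_breast_cancer` `data = load_breast_cancer()`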
Time: 2024-02-24 07:00:27 Views: 92
Sure, I can help with this. Below is the code, including the computation of each performance metric:
```python
# Import libraries
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, classification_report

# Load the data
data = load_breast_cancer()

# Split into training and test sets (30% held out for testing)
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3, random_state=42)

# Build and fit the logistic regression model.
# With the default max_iter=100 on these unscaled features, the solver may emit
# a ConvergenceWarning; raising max_iter or standardizing the features is a common remedy.
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on the test set
y_pred = model.predict(X_test)

# Confusion matrix (rows: true labels, columns: predicted labels)
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)

# Accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

# Precision (positive class is label 1 by default)
pre = precision_score(y_test, y_pred)
print("Precision:", pre)

# Recall
rec = recall_score(y_test, y_pred)
print("Recall:", rec)

# F1 score
f1 = f1_score(y_test, y_pred)
print("F1 score:", f1)

# Classification report
report = classification_report(y_test, y_pred)
print("Classification report:\n", report)
```
The output is as follows (exact values may differ slightly across scikit-learn versions):
```
Confusion Matrix:
 [[ 58   5]
 [  1 107]]
Accuracy: 0.9649122807017544
Precision: 0.9553571428571429
Recall: 0.9907407407407407
F1 score: 0.9727272727272728
Classification report:
               precision    recall  f1-score   support

           0       0.98      0.92      0.95        63
           1       0.96      0.99      0.97       108

    accuracy                           0.96       171
   macro avg       0.97      0.96      0.96       171
weighted avg       0.97      0.96      0.96       171
```
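The code and results above use a logistic regression model to classify the breast cancer data and compute the evaluation metrics. The confusion matrix shows how many predictions were right and wrong for each class; accuracy is the fraction of all samples predicted correctly; precision is the fraction of samples predicted positive that are truly positive; recall is the fraction of truly positive samples that were predicted positive; the F1 score is the harmonic mean of precision and recall; the classification report lists precision, recall, F1 score, and support for each class. By default, `precision_score`, `recall_score`, and `f1_score` treat label 1 as the positive class, which in this dataset is the benign class.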
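To make the definitions above concrete, here is a minimal sketch that recomputes the scalar metrics by hand from the four confusion-matrix counts. It assumes the counts reported in the output (58, 5, 1, 107) and sklearn's layout, where rows are true labels and columns are predicted labels; the variable names are only illustrative.

```python
import numpy as np

# Confusion matrix counts taken from the reported output (an assumption:
# a different sklearn version or split would give different counts).
# Rows are true labels, columns are predicted labels, in label order [0, 1].
cm = np.array([[58, 5],
               [1, 107]])

# For binary labels, ravel() flattens row by row: TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()        # correct predictions / all samples
precision = tp / (tp + fp)             # true positives / predicted positives
recall = tp / (tp + fn)                # true positives / actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1 score:  {f1:.4f}")
```

For the same `y_test` and `y_pred`, these values should agree with `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` up to floating-point rounding, which makes this a handy sanity check on a printed report.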