Plotting ROC curves for the handwritten digits dataset with sklearn
Date: 2023-09-08 22:12:16
First, load the handwritten digits dataset and preprocess it:
```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_curve, auc
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
# Load the digits dataset
digits = load_digits()
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.5, random_state=0)
# Binarize the labels
y_train = label_binarize(y_train, classes=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y_test = label_binarize(y_test, classes=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```
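To make the binarization step concrete, here is a small sketch of what `label_binarize` returns. The four toy labels are made up for illustration; each row of the result is a one-hot indicator with one column per class, which is the per-class layout that a one-vs-rest ROC analysis works on.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Toy labels (illustrative only, not taken from the digits dataset)
toy_labels = np.array([0, 2, 1, 2])

# One column per entry in `classes`; a 1 marks the sample's class
toy_onehot = label_binarize(toy_labels, classes=[0, 1, 2])
print(toy_onehot)
# [[1 0 0]
#  [0 0 1]
#  [0 1 0]
#  [0 0 1]]
```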
Next, train a logistic regression model and use it to predict class probabilities on the test set:
```python
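from sklearn.multiclass import OneVsRestClassifier

# LogisticRegression.fit rejects a 2-D indicator matrix, so wrap it in
# OneVsRestClassifier, which fits one binary logistic regression per digit.
# max_iter is raised above the default of 100 to give the solver room to converge.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
# Predict per-class probabilities for the test set, shape (n_samples, 10)
y_score = clf.predict_proba(X_test)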
```
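For intuition about what a single ROC curve is made of, the sketch below runs `roc_curve` on four made-up labels and scores (illustrative values, not from the digits data). Each distinct score acts as a threshold and contributes one (FPR, TPR) point; `auc` then integrates the resulting curve with the trapezoidal rule.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up binary labels and scores, for illustration only
toy_true = np.array([0, 0, 1, 1])
toy_score = np.array([0.1, 0.4, 0.35, 0.8])

# Sweeping the decision threshold over the scores yields the curve points
toy_fpr, toy_tpr, toy_thresholds = roc_curve(toy_true, toy_score)
print(toy_fpr)                # [0.  0.  0.5 0.5 1. ]
print(toy_tpr)                # [0.  0.5 0.5 1.  1. ]
print(auc(toy_fpr, toy_tpr))  # 0.75
```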
Finally, use `sklearn.metrics.roc_curve` to compute the false positive rate (FPR) and true positive rate (TPR) for each class, and plot the ROC curves with `matplotlib.pyplot`:
```python
# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(10):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
# Plot ROC curve for each class
plt.figure()
colors = ['red', 'green', 'blue', 'orange', 'purple', 'brown', 'pink', 'gray', 'olive', 'cyan']
for i, color in zip(range(10), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=2,
             label='ROC curve of class %d (area = %0.2f)' % (i, roc_auc[i]))
plt.plot([0, 1], [0, 1], 'k--', lw=2)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic for multi-class')
plt.legend(loc="lower right")
plt.show()
```
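This plots one ROC curve per digit class, with each class's AUC shown in the legend, plus a dashed diagonal marking chance-level performance.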
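An averaged curve across classes is a separate computation. The following is a minimal sketch of two common choices, micro- and macro-averaging, under the assumption that the true labels are a 0/1 indicator matrix and the scores have one column per class. The helper name `averaged_roc` and its `n_grid` parameter are hypothetical, introduced here only for illustration.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve


def averaged_roc(y_true_bin, y_score, n_grid=101):
    """Return micro- and macro-averaged ROC curves (hypothetical helper).

    y_true_bin: (n_samples, n_classes) 0/1 indicator matrix
    y_score:    (n_samples, n_classes) per-class scores
    """
    n_classes = y_true_bin.shape[1]

    # Micro-average: pool every (sample, class) pair into one binary problem
    fpr_micro, tpr_micro, _ = roc_curve(y_true_bin.ravel(), y_score.ravel())

    # Macro-average: interpolate each class curve onto a shared FPR grid,
    # then take the unweighted mean of the TPR values
    fpr_grid = np.linspace(0.0, 1.0, n_grid)
    mean_tpr = np.zeros_like(fpr_grid)
    for i in range(n_classes):
        fpr_i, tpr_i, _ = roc_curve(y_true_bin[:, i], y_score[:, i])
        mean_tpr += np.interp(fpr_grid, fpr_i, tpr_i)
    mean_tpr /= n_classes

    return {
        "micro": (fpr_micro, tpr_micro, auc(fpr_micro, tpr_micro)),
        "macro": (fpr_grid, mean_tpr, auc(fpr_grid, mean_tpr)),
    }
```

With the arrays from the example above, `averaged_roc(y_test, y_score)` would return curves that can be passed to `plt.plot` before `plt.legend` is called. Micro-averaging weights every sample equally, while macro-averaging weights every class equally, so the two can differ when classes are imbalanced.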