Code explanation: training and evaluating RandomForestClassifier, LogisticRegression, and SVC with scikit-learn (accuracy, ROC AUC, and classification report)
Date: 2024-01-09 16:05:45
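This code trains three classifiers (RandomForestClassifier, LogisticRegression, SVC) on the training data, predicts on the test data, and prints for each model its accuracy, the area under the ROC curve (stored in rf_roc_auc), and a classification report (classification_report).
The details are as follows:
1. Import the required libraries and modules: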
```python
from time import time  # wall-clock timestamps, used to time model fitting

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score
from sklearn.svm import SVC
```
2. Define the three classifier models:
```python
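models = [
    # Random forest; a float min_samples_leaf is a fraction of the samples
    RandomForestClassifier(random_state=123,
                           min_samples_split=3,
                           min_samples_leaf=0.01,
                           max_depth=5),
    # Logistic regression with default hyperparameters
    LogisticRegression(random_state=123),
    # RBF-kernel SVM; probability=True enables predict_proba
    SVC(kernel='rbf', gamma='auto', random_state=123, probability=True),
]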
```
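Here, RandomForestClassifier is a random forest classifier, LogisticRegression is a logistic regression classifier, and SVC is a support vector machine classifier. All three share random_state=123 so that results are reproducible across runs.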
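One detail in the model definitions above is easy to miss: min_samples_leaf=0.01 is a float, and scikit-learn documents a float value as a fraction of the sample count, with ceil(min_samples_leaf * n_samples) used as the minimum number of samples per leaf. The sketch below only illustrates that arithmetic; effective_min_samples_leaf is a hypothetical helper written for this explanation, not a scikit-learn function, and the sample count of 398 is an assumed example value.

```python
import math


def effective_min_samples_leaf(min_samples_leaf, n_samples):
    """Hypothetical helper: minimum samples per leaf implied by the setting.

    An int is taken as an absolute count; a float is taken as a fraction
    of n_samples and rounded up, following the documented rule.
    """
    if isinstance(min_samples_leaf, int):
        return min_samples_leaf
    return math.ceil(min_samples_leaf * n_samples)


# Assumed example: 398 training samples with the 0.01 fraction from the models
print(effective_min_samples_leaf(0.01, 398))  # 0.01 * 398 = 3.98, rounded up to 4
print(effective_min_samples_leaf(5, 398))     # an int is used as-is: 5
```

With a few hundred training samples, the 0.01 fraction therefore amounts to only a handful of samples per leaf, so it is a fairly mild regularizer next to max_depth=5.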
3. Train and predict:
```python
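# Assumes X_train, X_test, y_train, y_test are already defined
for model in models:
    name = type(model).__name__
    time0 = time()                    # start timestamp
    model.fit(X_train, y_train)
    print(name, 'fit time (s):', time() - time0)
    y_pred = model.predict(X_test)    # hard class labels
    accuracy = accuracy_score(y_test, y_pred)
    # AUC is computed from hard labels here; predict_proba scores would
    # give the usual threshold-independent AUC
    rf_roc_auc = roc_auc_score(y_test, y_pred)
    print(name, 'accuracy:', accuracy)
    print('=======' * 10)
    print(name, 'roc:', rf_roc_auc)
    print('=======' * 10)
    # target_names must follow the sorted label order (0, then 1)
    print(classification_report(y_test, y_pred, target_names=['benign', 'malignant']))
    print('=======' * 10)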
```
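The for loop iterates over the three models. Each one is fitted on the training set, used to predict the test set, and then scored with accuracy, ROC AUC, and a classification report. time() is called just before fit to record a start timestamp (time0), which is what makes it possible to measure training time. Two details are worth noting. First, roc_auc_score is given the hard class labels from predict rather than probability scores, so rf_roc_auc (a name borrowed from the random forest but reused for every model) reflects a single operating point rather than the full ROC curve. Second, classification_report matches target_names to the class labels in sorted order, so the names should be checked against how the dataset encodes its classes.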
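The snippet in step 3 relies on X_train, X_test, y_train, and y_test, which are never defined in the code shown. The following is a minimal sketch of one way to supply them and to compare the two ways of computing ROC AUC. Everything in it beyond the original calls is an assumption made for illustration: the breast cancer dataset, the 70/30 stratified split, the StandardScaler pipeline, and the use of perf_counter for timing.

```python
from time import perf_counter

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the breast cancer dataset stands in for the undefined data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y
)

# Assumption: features are standardized so logistic regression converges cleanly.
model = make_pipeline(StandardScaler(), LogisticRegression(random_state=123))

# Time only the fit call.
start = perf_counter()
model.fit(X_train, y_train)
fit_seconds = perf_counter() - start

y_pred = model.predict(X_test)               # hard class labels (0 or 1)
y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1

# Same metric, two different inputs.
auc_from_labels = roc_auc_score(y_test, y_pred)
auc_from_scores = roc_auc_score(y_test, y_score)

print(f"fit time: {fit_seconds:.4f}s")
print(f"ROC AUC from hard labels:   {auc_from_labels:.4f}")
print(f"ROC AUC from probabilities: {auc_from_scores:.4f}")
```

Passing probabilities lets roc_auc_score evaluate the ranking across all thresholds, which is the usual meaning of ROC AUC. Note also that load_breast_cancer encodes malignant as 0 and benign as 1, so if this dataset is used, target_names in classification_report should be ordered accordingly.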