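Python: classify the sklearn wine dataset with an SVM model, choose the best parameters by grid search with 5-fold cross-validation, then compute accuracy, precision, F1 score, and recall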
Date: 2024-11-12 13:33:50
In Python, the scikit-learn library can be used to train a support vector machine (SVM) classifier on the wine dataset. The basic steps are:
1. **Import the required libraries**:
```python
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, f1_score, recall_score
```
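2. **Load the wine dataset**:
```python
# Bundled dataset: 178 samples, 13 numeric features, 3 classes
wine = datasets.load_wine()
X = wine.data    # feature matrix, shape (178, 13)
y = wine.target  # class labels 0, 1, 2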
```
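Before modeling, it can help to confirm what was loaded. The following is a minimal sketch that prints the dataset's shape and class balance; the variable names `n_samples`, `n_features`, and `class_counts` are illustrative and not part of the steps above.

```python
import numpy as np
from sklearn.datasets import load_wine

wine = load_wine()
X, y = wine.data, wine.target

# Basic shape and class balance of the dataset
n_samples, n_features = X.shape
class_counts = np.bincount(y)

print(f"Samples: {n_samples}, features: {n_features}")
print(f"Class names: {list(wine.target_names)}")
print(f"Samples per class: {class_counts.tolist()}")
print(f"First three features: {wine.feature_names[:3]}")
```

The classes are moderately imbalanced, which is one reason the metrics later in this guide use an averaging mode rather than a single binary score.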
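3. **Preprocess the data** (SVMs are sensitive to feature scale, so standardization is recommended):
```python
# Standardize each feature to zero mean and unit variance.
# Caveat: fitting on all of X (before the split) leaks test-set statistics.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)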
```
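To see why scaling matters here, the sketch below compares the raw feature ranges with the standardized columns. It is only a check under the same `StandardScaler` settings as above; `raw_ranges`, `col_means`, and `col_stds` are names introduced for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

X = load_wine().data
X_scaled = StandardScaler().fit_transform(X)

# Raw features live on very different scales
raw_ranges = X.max(axis=0) - X.min(axis=0)
print(f"Largest raw feature range:  {raw_ranges.max():.1f}")
print(f"Smallest raw feature range: {raw_ranges.min():.2f}")

# After standardization every column has mean ~0 and standard deviation ~1
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)
print(f"All means ~0: {np.allclose(col_means, 0.0)}")
print(f"All stds ~1:  {np.allclose(col_stds, 1.0)}")
```

Without this step, distance-based kernels such as RBF tend to be dominated by whichever feature has the largest numeric range.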
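4. **Split into training and test sets**:
```python
# Hold out 20% of the samples for testing; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)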
```
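One possible variation, not used in the steps above, is a stratified split: passing `stratify=y` asks `train_test_split` to keep the class proportions roughly equal in both parts. A minimal sketch on the raw features:

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# stratify=y keeps the class proportions similar in the train and test parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

full_share = np.bincount(y) / len(y)
test_share = np.bincount(y_test) / len(y_test)

print(f"Train size: {len(y_train)}, test size: {len(y_test)}")
print(f"Class share (full): {np.round(full_share, 3).tolist()}")
print(f"Class share (test): {np.round(test_share, 3).tolist()}")
```

With only 178 samples, an unstratified split can leave one class under-represented in the test set, which makes per-class metrics noisier.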
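5. **Create the SVM classifier and run the grid search**:
```python
# 4 x 3 x 2 = 24 candidate combinations (gamma has no effect on the linear kernel)
param_grid = {'C': [0.1, 1, 10, 100], 'kernel': ['linear', 'poly', 'rbf'], 'gamma': ['scale', 'auto']}
# cv=5: each combination is scored by 5-fold cross-validation on the training set
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)
best_params = grid_search.best_params_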
```
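An alternative worth considering is to wrap the scaler and the classifier in a `Pipeline` and search over that, so the scaler is refit on the training folds only in every cross-validation round. The sketch below assumes the same grid as above; the step names `"scaler"` and `"svc"` are arbitrary labels chosen here, and they determine the `svc__` prefix on the parameter names.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
# Split the raw features; scaling happens inside the pipeline
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as <step name>__<parameter>
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__kernel": ["linear", "poly", "rbf"],
    "svc__gamma": ["scale", "auto"],
}

grid_search = GridSearchCV(pipe, param_grid, cv=5)
grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best mean CV accuracy: {grid_search.best_score_:.3f}")
print(f"Held-out test accuracy: {grid_search.score(X_test, y_test):.3f}")
```

The trade-off is slightly more verbose parameter names in exchange for a cross-validation score that never sees statistics from the validation fold or the test set.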
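6. **Train the model with the best parameters**:
```python
# GridSearchCV (refit=True by default) has already fit an equivalent model,
# available as grid_search.best_estimator_; refitting by hand is shown for clarity
svm_model = SVC(**best_params)
svm_model.fit(X_train, y_train)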
```
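As a quick check of the refit behavior, the sketch below compares a manually refit model with `grid_search.best_estimator_`. It uses a deliberately small grid (`small_grid`, an assumption made for brevity rather than the grid from step 5).

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Small illustrative grid (assumption), searched with 5-fold CV
small_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
grid_search = GridSearchCV(SVC(), small_grid, cv=5)  # refit=True by default
grid_search.fit(X_train, y_train)

# Model refit automatically by GridSearchCV vs. one refit by hand
refit_model = grid_search.best_estimator_
manual_model = SVC(**grid_search.best_params_).fit(X_train, y_train)

same_predictions = np.array_equal(
    refit_model.predict(X_test), manual_model.predict(X_test)
)
print(f"Identical test predictions: {same_predictions}")
```

In practice this means `grid_search.predict(X_test)` can be called directly after the search, and the manual refit is optional.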
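7. **Evaluate with cross-validation**:
```python
# cross_val_score clones svm_model and refits it per fold; the fitted model is untouched.
# The folds span the full dataset (test samples included), so read this as a sanity check.
cv_scores = cross_val_score(svm_model, X_scaled, y, cv=5)
mean_cv_score = cv_scores.mean()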
```
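If all four metrics are wanted from cross-validation rather than from a single test split, `cross_validate` accepts a list of scorer names. The sketch below is one way to do that; the hyperparameters passed to `SVC` are placeholders (assumptions), not the result of the grid search, and the shuffled `StratifiedKFold` is a choice made here.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Placeholder hyperparameters; substitute the values found by the grid search
model = make_pipeline(StandardScaler(), SVC(C=1, kernel="rbf", gamma="scale"))

# Shuffled, stratified 5-fold splitter with a fixed seed for reproducibility
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Built-in scorer names; the *_weighted variants handle the three classes
scoring = ["accuracy", "precision_weighted", "recall_weighted", "f1_weighted"]
results = cross_validate(model, X, y, cv=cv, scoring=scoring)

for name in scoring:
    scores = results[f"test_{name}"]
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the spread across folds alongside the mean gives a rough sense of how much the scores depend on the particular split.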
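8. **Predict and compute the metrics**:
```python
y_pred = svm_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
# The target has three classes, so precision, recall and F1 need an averaging mode;
# 'weighted' averages the per-class scores weighted by each class's support.
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')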
print(f"Best parameters: {best_params}")
print(f"Cross-validation mean score: {mean_cv_score:.3f}")
print(f"Accuracy: {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"F1 Score: {f1:.3f}")
print(f"Recall: {recall:.3f}")
```
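The weighted averages above hide per-class behavior. As a complementary view, the sketch below prints a confusion matrix and a per-class report. The `SVC` hyperparameters are placeholders (assumptions), and the scaler is fit on the training split only, which differs slightly from step 3.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
model = SVC(C=1, kernel="rbf", gamma="scale")  # placeholder hyperparameters
model.fit(scaler.transform(X_train), y_train)
y_pred = model.predict(scaler.transform(X_test))

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Per-class precision, recall, F1 and support
print(classification_report(y_test, y_pred, digits=3))

# Support-weighted recall always equals overall accuracy
weighted_recall = recall_score(y_test, y_pred, average="weighted")
accuracy = accuracy_score(y_test, y_pred)
print(f"Weighted recall == accuracy: {np.isclose(weighted_recall, accuracy)}")
```

Because support-weighted recall reduces to the fraction of correctly classified samples, the "Recall" and "Accuracy" lines printed in step 8 will show the same value; `average='macro'` is the option to use if an unweighted per-class mean is wanted instead.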