Code to train SVMs with a linear kernel and a Gaussian kernel on the Wine and Boston Housing datasets using LIBSVM, compare the differences in their support vectors, and experimentally compare them against a BP neural network and a C4.5 decision tree
Below is an example that implements the task using Python's scikit-learn library together with the LIBSVM Python interface. SVM models are trained on the Wine dataset (a classification task) and the Boston Housing dataset (a regression task), and BP neural network and decision tree models are trained for comparison; note that scikit-learn's decision tree is an optimized CART rather than a true C4.5. Cross-validation is used to estimate model performance, and matplotlib is used to plot the results.
```python
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
import matplotlib.pyplot as plt
from svmutil import *  # LIBSVM's Python interface; some pip distributions expose it as libsvm.svmutil instead
# Load the datasets
wine = datasets.load_wine()
X_wine = wine.data
y_wine = wine.target
# Note: load_boston was removed in scikit-learn 1.2, so this call requires an older version;
# Boston Housing has a continuous target, i.e. it is a regression task
housing = datasets.load_boston()
X_housing = housing.data
y_housing = housing.target
# Split into training and test sets (fixed random_state for reproducibility)
X_wine_train, X_wine_test, y_wine_train, y_wine_test = train_test_split(X_wine, y_wine, test_size=0.3, random_state=42)
X_housing_train, X_housing_test, y_housing_train, y_housing_test = train_test_split(X_housing, y_housing, test_size=0.3, random_state=42)
# SVM models
# Linear kernel
svm_linear_wine = SVC(kernel='linear')
svm_linear_wine_scores = cross_val_score(svm_linear_wine, X_wine, y_wine, cv=10)
print("Wine dataset, SVM with linear kernel, accuracy:", np.mean(svm_linear_wine_scores))
# Boston Housing is a regression task, so train an epsilon-SVR (-s 3) with a linear kernel (-t 0) in LIBSVM
svm_linear_housing_model = svm_train(y_housing_train.tolist(), X_housing_train.tolist(), '-s 3 -t 0 -q')
# svm_predict returns (predicted values, (accuracy, MSE, squared correlation), decision values)
svm_linear_housing_scores = svm_predict(y_housing_test.tolist(), X_housing_test.tolist(), svm_linear_housing_model)
print("Housing dataset, SVM with linear kernel, MSE:", svm_linear_housing_scores[1][1])
# Gaussian (RBF) kernel
svm_rbf_wine = SVC(kernel='rbf')
svm_rbf_wine_scores = cross_val_score(svm_rbf_wine, X_wine, y_wine, cv=10)
print("Wine dataset, SVM with RBF kernel, accuracy:", np.mean(svm_rbf_wine_scores))
# epsilon-SVR with an RBF kernel (-t 2) and gamma = 0.1
svm_rbf_housing_model = svm_train(y_housing_train.tolist(), X_housing_train.tolist(), '-s 3 -t 2 -g 0.1 -q')
svm_rbf_housing_scores = svm_predict(y_housing_test.tolist(), X_housing_test.tolist(), svm_rbf_housing_model)
print("Housing dataset, SVM with RBF kernel, MSE:", svm_rbf_housing_scores[1][1])
# BP neural network (multi-layer perceptron)
nn_wine = MLPClassifier(hidden_layer_sizes=(100,), max_iter=2000)
nn_wine_scores = cross_val_score(nn_wine, X_wine, y_wine, cv=10)
print("Wine dataset, BP neural network, accuracy:", np.mean(nn_wine_scores))
# Housing targets are continuous, so use the regressor variant (cross_val_score reports R^2)
nn_housing = MLPRegressor(hidden_layer_sizes=(100,), max_iter=2000)
nn_housing_scores = cross_val_score(nn_housing, X_housing, y_housing, cv=10)
print("Housing dataset, BP neural network, R^2:", np.mean(nn_housing_scores))
# Decision tree (scikit-learn implements an optimized CART rather than C4.5)
dt_wine = DecisionTreeClassifier()
dt_wine_scores = cross_val_score(dt_wine, X_wine, y_wine, cv=10)
print("Wine dataset, decision tree, accuracy:", np.mean(dt_wine_scores))
dt_housing = DecisionTreeRegressor()
dt_housing_scores = cross_val_score(dt_housing, X_housing, y_housing, cv=10)
print("Housing dataset, decision tree, R^2:", np.mean(dt_housing_scores))
# Plot the results (Wine bars: accuracy; Housing bars: squared correlation / R^2)
models = ['SVM-Linear', 'SVM-RBF', 'BPNN', 'Decision Tree']
wine_scores = [np.mean(svm_linear_wine_scores), np.mean(svm_rbf_wine_scores), np.mean(nn_wine_scores), np.mean(dt_wine_scores)]
housing_scores = [svm_linear_housing_scores[1][2], svm_rbf_housing_scores[1][2], np.mean(nn_housing_scores), np.mean(dt_housing_scores)]
x = np.arange(len(models))
width = 0.35
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, wine_scores, width, label='Wine')
rects2 = ax.bar(x + width/2, housing_scores, width, label='Housing')
ax.set_ylabel('Score (Wine: accuracy, Housing: R^2)')
ax.set_title('Models Comparison')
ax.set_xticks(x)
ax.set_xticklabels(models)
ax.legend()
fig.tight_layout()
plt.show()
```
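The question also asks to compare the support vectors obtained with the linear and Gaussian kernels, which the code above does not show. The sketch below is an addition that reuses the estimators and LIBSVM models defined above: `n_support_` and `support_vectors_` are standard attributes of scikit-learn's `SVC`, and `get_nr_sv()` is part of LIBSVM's Python model interface.

```python
# Compare the support vectors found by the two kernels (sketch; reuses objects defined above)
svm_linear_wine.fit(X_wine_train, y_wine_train)
svm_rbf_wine.fit(X_wine_train, y_wine_train)
print("Wine, linear kernel: support vectors per class =", svm_linear_wine.n_support_,
      ", total =", svm_linear_wine.support_vectors_.shape[0])
print("Wine, RBF kernel: support vectors per class =", svm_rbf_wine.n_support_,
      ", total =", svm_rbf_wine.support_vectors_.shape[0])
# The trained LIBSVM models on the Housing data report their support vector counts directly
print("Housing, linear-kernel SVR: #SV =", svm_linear_housing_model.get_nr_sv())
print("Housing, RBF-kernel SVR: #SV =", svm_rbf_housing_model.get_nr_sv())
```

The counts (and, for scikit-learn, the selected points themselves) can then be compared directly between the two kernels.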
Note that LIBSVM must be installed separately, and its Python interface file svmutil.py must be importable from the code. In addition, the parameters and hyperparameters of each model may need to be tuned for different datasets and tasks to achieve the best performance; a minimal grid-search sketch is shown below.
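For example, a minimal hyperparameter search for the RBF-kernel SVC on the Wine data could look like the following; the grid values here are illustrative placeholders, not tuned settings.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative grid over the penalty C and kernel width gamma (placeholder values)
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': ['scale', 0.01, 0.1, 1]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=10)
grid.fit(X_wine, y_wine)
print("Best parameters:", grid.best_params_)
print("Best cross-validation accuracy:", grid.best_score_)
```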