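Help me implement a linearly separable SVM, a linear SVM, and a non-linear SVM on the Iris dataset in Python, compute the classification accuracy, and visualize the data points and decision boundaries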
Posted: 2024-10-06 17:05:38
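First, install the required libraries: `scikit-learn` for the machine learning parts and `matplotlib` for visualization. The steps below implement a linearly separable SVM, a linear SVM, and a non-linear SVM on the Iris dataset in Python:
1. Import the libraries and load the dataset: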
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC
from sklearn.preprocessing import StandardScaler
iris = datasets.load_iris()
X = iris.data
y = iris.target
```
2. Split the data into training and test sets, then standardize the features:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```
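3. Linearly separable SVM, approximated here by `LinearSVC` (a soft-margin linear classifier; a strict hard margin would need a very large `C`):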
```python
linear_svm = LinearSVC().fit(X_train_scaled, y_train)
linear_pred = linear_svm.predict(X_test_scaled)
# Compute accuracy on the test set
linear_accuracy = linear_svm.score(X_test_scaled, y_test)
print(f"LinearSVC Accuracy: {linear_accuracy * 100:.2f}%")
```
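Strictly speaking, the three-class Iris problem is not linearly separable as a whole (versicolor and virginica overlap), and `LinearSVC()` with its default `C=1.0` is a soft-margin model. The following is a minimal sketch of the linearly separable case, under two assumptions that are not part of the original answer: the task is reduced to setosa versus the other two species, and the hard margin is approximated with a very large `C`. The value `1e6` and the names `y_binary`, `bin_scaler`, and `hard_margin_svm` are illustrative choices.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X, y = iris.data, iris.target
# Binary labels: 1 for setosa (class 0), 0 for versicolor and virginica
y_binary = (y == 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y_binary, test_size=0.3, random_state=42, stratify=y_binary
)
bin_scaler = StandardScaler()
X_tr_s = bin_scaler.fit_transform(X_tr)
X_te_s = bin_scaler.transform(X_te)

# A very large C penalizes margin violations so heavily that, on separable
# data, the soft-margin solution approaches the hard-margin one (assumption:
# 1e6 is "large enough" for this dataset)
hard_margin_svm = SVC(kernel="linear", C=1e6).fit(X_tr_s, y_tr)

train_acc = hard_margin_svm.score(X_tr_s, y_tr)
test_acc = hard_margin_svm.score(X_te_s, y_te)
print(f"Train accuracy: {train_acc * 100:.2f}%")
print(f"Test accuracy: {test_acc * 100:.2f}%")
print(f"Support vectors per class: {hard_margin_svm.n_support_}")
```

If the training accuracy were below 100%, that would indicate the chosen classes are not separable in this feature space, and a hard-margin formulation would not apply.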
4. Linear SVM (soft margin, `SVC` with a linear kernel):
```python
linear_svc = SVC(kernel='linear').fit(X_train_scaled, y_train)
linear_svc_pred = linear_svc.predict(X_test_scaled)
# Compute accuracy on the test set
linear_svc_accuracy = linear_svc.score(X_test_scaled, y_test)
print(f"Linear SVM with kernel 'linear' Accuracy: {linear_svc_accuracy * 100:.2f}%")
```
5. Non-linear SVM (for example, with an RBF kernel):
```python
rbf_svm = SVC(kernel='rbf').fit(X_train_scaled, y_train)
rbf_pred = rbf_svm.predict(X_test_scaled)
# Accuracy of the non-linear SVM on the test set
rbf_accuracy = rbf_svm.score(X_test_scaled, y_test)
print(f"RBF SVM Accuracy: {rbf_accuracy * 100:.2f}%")
```
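With only 45 test samples, the accuracy from a single 70/30 split can shift by a few points depending on `random_state`. As a hedged alternative, the three models can be compared with stratified 5-fold cross-validation. This is a sketch rather than part of the original answer: the names `candidates`, `cv_results`, `X_all`, and `y_all`, the fold count, and `max_iter=10000` are assumptions chosen for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

X_all, y_all = datasets.load_iris(return_X_y=True)

candidates = {
    "LinearSVC": LinearSVC(max_iter=10000),
    "SVC (linear kernel)": SVC(kernel="linear"),
    "SVC (RBF kernel)": SVC(kernel="rbf"),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_results = {}
for name, model in candidates.items():
    # Scaling lives inside the pipeline so that each fold is standardized
    # with statistics from its own training part only
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X_all, y_all, cv=cv, scoring="accuracy")
    cv_results[name] = scores
    print(f"{name}: {scores.mean() * 100:.2f}% (+/- {scores.std() * 100:.2f}%)")
```

The mean gives a steadier estimate than one split, and the standard deviation shows how much the score moves between folds.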
6. Visualize the data points and decision boundaries:
```python
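from sklearn.base import clone

colors = ["navy", "turquoise", "darkorange"]

def plot_decision_boundaries(clf, X, y, ax=None, title=""):
    if ax is None:
        fig, ax = plt.subplots()
    # The models above were trained on four features, so they cannot score a 2-D grid.
    # Refit an unfitted copy on the first two (standardized) features for plotting only;
    # the boundaries shown are those of the 2-feature copy, not of the 4-feature model.
    X2 = X[:, :2]
    clf2 = clone(clf).fit(X2, y)
    x_min, x_max = X2[:, 0].min() - 1, X2[:, 0].max() + 1
    y_min, y_max = X2[:, 1].min() - 1, X2[:, 1].max() + 1
    XX, YY = np.meshgrid(np.arange(x_min, x_max, .02), np.arange(y_min, y_max, .02))
    # With three classes, decision_function returns one column per class,
    # so the regions are drawn from predict() instead
    Z = clf2.predict(np.c_[XX.ravel(), YY.ravel()]).reshape(XX.shape)
    ax.contourf(XX, YY, Z, levels=[-0.5, 0.5, 1.5, 2.5], colors=colors, alpha=.2)
    ax.contour(XX, YY, Z, levels=[0.5, 1.5], colors="k", linewidths=1, alpha=.7)
    # Color-code the iris species
    for color, i, target_name in zip(colors, [0, 1, 2], iris.target_names):
        indices = y == i
        ax.scatter(X2[indices, 0], X2[indices, 1], color=color, alpha=.8, label=target_name)
    ax.set_xlabel(iris.feature_names[0] + " (standardized)")
    ax.set_ylabel(iris.feature_names[1] + " (standardized)")
    ax.set_title(title)
    ax.legend(loc="upper right")

# Draw the decision boundaries of the three models side by side
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
models = [(linear_svm, "LinearSVC"), (linear_svc, "SVC (linear kernel)"), (rbf_svm, "SVC (RBF kernel)")]
for ax, (model, name) in zip(axes, models):
    plot_decision_boundaries(model, X_train_scaled, y_train, ax=ax, title=name)
plt.tight_layout()
plt.show()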
```
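After running the code above, you get the classification accuracy of the linearly separable SVM, the linear SVM, and the non-linear SVM, along with a plot of each model's decision boundaries. Note that the parameters may need tuning in practice; this is only a basic example.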
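One common way to do that tuning is a cross-validated grid search over `C` and `gamma` for the RBF model. The sketch below is an assumption-laden illustration, not the original author's method: the grid values, the step names `"scaler"` and `"svc"`, and the variable names `Xg_train`, `Xg_test`, `search`, and `tuned_accuracy` are all chosen for this example.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = datasets.load_iris(return_X_y=True)
Xg_train, Xg_test, yg_train, yg_test = train_test_split(
    X_all, y_all, test_size=0.3, random_state=42
)

# Keeping the scaler in the pipeline means it is refit inside every CV fold
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC(kernel="rbf"))])
# Illustrative grid; the ranges are assumptions, not recommended values
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": ["scale", 0.01, 0.1, 1],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(Xg_train, yg_train)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_ * 100:.2f}%")

# The held-out test set is used once, after the parameters are chosen
tuned_accuracy = search.score(Xg_test, yg_test)
print(f"Tuned RBF SVM test accuracy: {tuned_accuracy * 100:.2f}%")
```

On a dataset as small as Iris, the tuned model may not beat the defaults; the point of the search is to choose parameters without looking at the test set.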