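Classifying the Iris dataset with KNN, SVM, decision tree, and ensemble learning methods, and picking the model with the highest accuracy (code)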
Posted: 2024-02-23 13:00:03 · Views: 60
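Sure. Below is Python code that classifies the Iris dataset using KNN, SVM, a decision tree, and an ensemble learning method. Each model is evaluated with 10-fold cross-validation, and the model with the highest accuracy is then selected for testing.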
```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the data as the final test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the KNN, SVM, and decision tree classifiers
knn = KNeighborsClassifier()
svm = SVC()
dt = DecisionTreeClassifier(random_state=42)

# Define the ensemble classifier (hard voting over the three base models)
ensemble = VotingClassifier(estimators=[('knn', knn), ('svm', svm), ('dt', dt)], voting='hard')

models = {'KNN': knn, 'SVM': svm, 'Decision tree': dt, 'Ensemble': ensemble}

# Define 10-fold cross-validation
kf = KFold(n_splits=10, shuffle=True, random_state=42)

# Mean cross-validation accuracy of each model, keyed by model name
accuracies = {}

# Train and evaluate every classifier on the training set only,
# so the held-out test set is never seen during model selection
for name, model in models.items():
    fold_scores = []
    for train_index, val_index in kf.split(X_train):
        X_tr, X_val = X_train[train_index], X_train[val_index]
        y_tr, y_val = y_train[train_index], y_train[val_index]
        fold_model = clone(model)  # fresh, unfitted copy for each fold
        fold_model.fit(X_tr, y_tr)
        fold_scores.append(fold_model.score(X_val, y_val))
    accuracies[name] = np.mean(fold_scores)

# Print the mean cross-validation accuracy of each model
print(accuracies)

# Select the model with the highest mean cross-validation accuracy
best_name = max(accuracies, key=accuracies.get)
best_model = models[best_name]

# Refit the best model on the full training set and test it on the held-out data
best_model.fit(X_train, y_train)
print(best_name, best_model.score(X_test, y_test))
```
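Note that this code is for reference only; the exact implementation may vary with your environment and requirements. It is also just a basic example, so you may need to adjust and tune it further for your specific situation.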
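As one possible form of the further tuning mentioned above, here is a minimal sketch that searches a few KNN hyperparameters with `GridSearchCV`. The helper name `tune_knn` and the candidate values in `param_grid` are illustrative assumptions, not part of the original answer, and the same pattern could be applied to the SVM or decision tree.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def tune_knn(X_train, y_train):
    """Search a small, illustrative grid of KNN settings with 10-fold CV."""
    param_grid = {
        "n_neighbors": [1, 3, 5, 7, 9],       # assumed candidate values
        "weights": ["uniform", "distance"],   # assumed candidate values
    }
    search = GridSearchCV(
        KNeighborsClassifier(), param_grid, cv=10, scoring="accuracy")
    search.fit(X_train, y_train)
    return search


# Same 80/20 split as in the example above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

search = tune_knn(X_train, y_train)
print("Best parameters:", search.best_params_)
print("Mean CV accuracy:", search.best_score_)
print("Held-out test accuracy:", search.score(X_test, y_test))
```

Because the grid search only sees the training split, the held-out test accuracy remains an independent estimate of how the tuned model performs.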