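Classifying the Iris dataset with KNN, SVM, decision tree, and ensemble learning methods, and selecting the model with the highest accuracy (code)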
Time: 2024-02-23 20:00:35
Sure. Below is Python code that classifies the Iris dataset with scikit-learn.
First, import the required libraries and load the dataset:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
data = load_iris()
X = data.data
y = data.target
```
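Before training anything, it can help to check what was loaded. The snippet below is a minimal sketch that only inspects attributes of the object returned by `load_iris()`; it assumes nothing beyond the dataset bundled with scikit-learn.

```python
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

# 150 samples, each described by 4 numeric features
print(X.shape)
# Names of the 4 features (sepal and petal length/width, in cm)
print(data.feature_names)
# Names of the 3 classes encoded as 0, 1, 2 in y
print(data.target_names)
```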
Next, split the dataset so that the training set takes 70% of the samples and the test set the remaining 30%:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```
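The split above is purely random, so the three classes may not be equally represented in the test set. As an optional variation (an assumption on my part, not part of the original answer), passing `stratify=y` keeps the class proportions the same in both subsets:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the 1:1:1 class ratio in both the training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Per-class sample counts in each subset
print(np.bincount(y_train))
print(np.bincount(y_test))
```

With only 45 test samples, a balanced test set makes the accuracy figures slightly less dependent on the luck of the draw.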
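Then train a KNN, an SVM, a decision tree, and a random forest (the ensemble method here) on the training set, and record each model's accuracy on the test set: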
```python
# KNN
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
knn_score = knn.score(X_test, y_test)
# SVM
svm = SVC()
svm.fit(X_train, y_train)
svm_score = svm.score(X_test, y_test)
# Decision tree
dtc = DecisionTreeClassifier()
dtc.fit(X_train, y_train)
dtc_score = dtc.score(X_test, y_test)
# Random forest
rfc = RandomForestClassifier(n_estimators=10)
rfc.fit(X_train, y_train)
rfc_score = rfc.score(X_test, y_test)
```
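KNN and SVM both depend on distances between samples, so they are usually sensitive to feature scale. The following is a hedged sketch of one common variation, not something the code above does: wrapping each of the two models in a `Pipeline` with `StandardScaler`. The names `scaled_knn` and `scaled_svm` are made up for this illustration. On Iris the features are already on similar scales, so the scores may not change.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# The scaler is fitted on the training data only, then applied to the test data
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier())
scaled_svm = make_pipeline(StandardScaler(), SVC())

scaled_knn.fit(X_train, y_train)
scaled_svm.fit(X_train, y_train)

scaled_knn_score = scaled_knn.score(X_test, y_test)
scaled_svm_score = scaled_svm.score(X_test, y_test)
print(f"KNN (scaled): {scaled_knn_score:.2f}")
print(f"SVM (scaled): {scaled_svm_score:.2f}")
```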
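Finally, compare the test accuracies and pick the method with the highest one as the final classifier. If several methods tie, `scores.index` returns the first of them in the list: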
```python
scores = [knn_score, svm_score, dtc_score, rfc_score]
methods = ['KNN', 'SVM', 'Decision Tree', 'Random Forest']
best_score = max(scores)
best_method = methods[scores.index(best_score)]
print(f"The best method is {best_method} with a score of {best_score:.2f}")
```
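A single 70/30 split on 150 samples gives a fairly noisy estimate, and several models can easily end up with the same test accuracy. As a possible refinement (my assumption, not the method used in this answer), the sketch below compares the same four models by mean 5-fold cross-validation accuracy. The `models` dictionary and the `random_state` values are choices made for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# random_state is fixed for the tree-based models so repeated runs agree
models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=10, random_state=42),
}

# Mean accuracy over 5 stratified folds for each model
cv_means = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, mean_acc in cv_means.items():
    print(f"{name}: {mean_acc:.3f}")

best_method = max(cv_means, key=cv_means.get)
print(f"Best by cross-validation: {best_method}")
```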
The complete code is as follows:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# KNN
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
knn_score = knn.score(X_test, y_test)
# SVM
svm = SVC()
svm.fit(X_train, y_train)
svm_score = svm.score(X_test, y_test)
# Decision tree
dtc = DecisionTreeClassifier()
dtc.fit(X_train, y_train)
dtc_score = dtc.score(X_test, y_test)
# Random forest
rfc = RandomForestClassifier(n_estimators=10)
rfc.fit(X_train, y_train)
rfc_score = rfc.score(X_test, y_test)
scores = [knn_score, svm_score, dtc_score, rfc_score]
methods = ['KNN', 'SVM', 'Decision Tree', 'Random Forest']
best_score = max(scores)
best_method = methods[scores.index(best_score)]
print(f"The best method is {best_method} with a score of {best_score:.2f}")
```
Hope this helps!