Classify the Iris dataset with KNN, SVM, a decision tree, and ensemble methods, and pick the model with the highest accuracy. Provide the code.
Posted: 2024-02-23 08:00:50 · Views: 20
Sure, here is a code example. Before running it, make sure the necessary Python libraries are installed; only `scikit-learn` is actually used below (`numpy` and `pandas` are common companions but not required here).
First, we load the Iris dataset and split it into a training set and a test set:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
```
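With only 150 samples, a plain random split can leave the three classes unevenly represented in the 30-sample test set. A small variation (a sketch using the standard `stratify` parameter of `train_test_split`) keeps the class proportions equal in both splits:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
# stratify=iris.target preserves the 50/50/50 class balance in both splits
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42,
    stratify=iris.target)

# each class now contributes exactly 10 samples to the 30-sample test set
print(np.bincount(y_test))  # → [10 10 10]
```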
Next, we train a KNN classifier:
```python
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
knn_score = knn.score(X_test, y_test)
print("KNN accuracy:", knn_score)
```
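The default `n_neighbors=5` is not necessarily the best choice of k. One way to tune it (a sketch using `GridSearchCV` with an illustrative grid of odd k values) is:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# 5-fold cross-validation over candidate k values, on the training set only
grid = GridSearchCV(KNeighborsClassifier(),
                    {"n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5)
grid.fit(X_train, y_train)
print("best k:", grid.best_params_["n_neighbors"])
print("test accuracy:", grid.score(X_test, y_test))
```

`grid.score` evaluates the refit best estimator, so the test set is touched only once, after the search.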
Next, an SVM classifier:
```python
from sklearn.svm import SVC
svm = SVC()
svm.fit(X_train, y_train)
svm_score = svm.score(X_test, y_test)
print("SVM accuracy:", svm_score)
```
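SVMs are sensitive to feature scale. The Iris features are all in centimetres, so the defaults usually work, but standardizing first is the safer habit; a sketch using a pipeline so the test data is transformed with training-set statistics only:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# StandardScaler is fit on the training folds only, avoiding test-set leakage
svm = make_pipeline(StandardScaler(), SVC())
svm.fit(X_train, y_train)
print("scaled-SVM accuracy:", svm.score(X_test, y_test))
```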
Then, a decision tree classifier:
```python
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier(random_state=42)  # fixed seed for reproducible results
dt.fit(X_train, y_train)
dt_score = dt.score(X_test, y_test)
print("Decision tree accuracy:", dt_score)
```
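A decision tree can also be inspected directly, which is useful for a small dataset like Iris. A sketch using `export_text` to print the learned rules (depth is capped only to keep the output readable):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# max_depth=3 keeps the printed rule set short
dt = DecisionTreeClassifier(max_depth=3, random_state=42)
dt.fit(X_train, y_train)
print(export_text(dt, feature_names=list(iris.feature_names)))
```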
Finally, we can use ensemble methods, such as a random forest or gradient-boosted trees:
```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
rf = RandomForestClassifier(random_state=42)  # fixed seed for reproducible results
rf.fit(X_train, y_train)
rf_score = rf.score(X_test, y_test)
print("Random forest accuracy:", rf_score)
gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)
gb_score = gb.score(X_test, y_test)
print("Gradient boosting accuracy:", gb_score)
```
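Beyond raw accuracy, tree ensembles report which features drive the predictions. A sketch reading `feature_importances_` from the fitted random forest:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)
# importances are normalized to sum to 1 across the four features
for name, imp in zip(iris.feature_names, rf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```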
Finally, we compare the models' accuracies and keep the one with the highest score:
```python
models = [('KNN', knn_score), ('SVM', svm_score), ('Decision tree', dt_score), ('Random forest', rf_score), ('Gradient boosting', gb_score)]
best_model = max(models, key=lambda x: x[1])
print("Best model:", best_model[0], "accuracy:", best_model[1])
```
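A caveat: a single 30-sample test set is noisy, and on Iris several models often tie at the same score. Comparing with cross-validation over the whole dataset (a sketch using `cross_val_score`) gives a steadier ranking:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

iris = load_iris()
models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=42),
    "Random forest": RandomForestClassifier(random_state=42),
    "Gradient boosting": GradientBoostingClassifier(random_state=42),
}
# mean accuracy over 5 cross-validation folds, per model
scores = {name: cross_val_score(m, iris.data, iris.target, cv=5).mean()
          for name, m in models.items()}
best = max(scores, key=scores.get)
print("best model:", best, "mean CV accuracy:", round(scores[best], 3))
```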
The complete script:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
# load the data and hold out 20% as a test set
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# train each model and record its accuracy on the test set
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
knn_score = knn.score(X_test, y_test)
print("KNN accuracy:", knn_score)

svm = SVC()
svm.fit(X_train, y_train)
svm_score = svm.score(X_test, y_test)
print("SVM accuracy:", svm_score)

dt = DecisionTreeClassifier(random_state=42)  # fixed seeds for reproducibility
dt.fit(X_train, y_train)
dt_score = dt.score(X_test, y_test)
print("Decision tree accuracy:", dt_score)

rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)
rf_score = rf.score(X_test, y_test)
print("Random forest accuracy:", rf_score)

gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)
gb_score = gb.score(X_test, y_test)
print("Gradient boosting accuracy:", gb_score)

# pick the model with the highest test accuracy
models = [('KNN', knn_score), ('SVM', svm_score), ('Decision tree', dt_score),
          ('Random forest', rf_score), ('Gradient boosting', gb_score)]
best_model = max(models, key=lambda x: x[1])
print("Best model:", best_model[0], "accuracy:", best_model[1])
```
I hope this example helps you find the best model!