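Using the Iris dataset, classify iris flowers with several methods such as KNN and SVM, compare metrics such as accuracy and precision, and select the best algorithm.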
Date: 2024-03-11 11:50:30
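First, load the Iris dataset and preprocess it: split the data into a training set and a test set, then standardize the features. The scaler is fitted on the training set only and then applied to the test set, so no test-set statistics leak into training.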
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target
# Preprocessing: 70/30 train/test split, then standardize the features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
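Next, classify the flowers with several algorithms (here KNN and SVM) and compute accuracy, precision, recall, and F1 score on the test set for each one. Because Iris has three classes, precision, recall, and F1 are macro-averaged.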
```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
# KNN with k=3
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred_knn = knn.predict(X_test)
print("KNN accuracy:", accuracy_score(y_test, y_pred_knn))
print("KNN precision:", precision_score(y_test, y_pred_knn, average='macro'))
print("KNN recall:", recall_score(y_test, y_pred_knn, average='macro'))
print("KNN F1 score:", f1_score(y_test, y_pred_knn, average='macro'))
# SVM with a linear kernel
svm = SVC(kernel='linear', C=1.0, random_state=42)
svm.fit(X_train, y_train)
y_pred_svm = svm.predict(X_test)
print("SVM accuracy:", accuracy_score(y_test, y_pred_svm))
print("SVM precision:", precision_score(y_test, y_pred_svm, average='macro'))
print("SVM recall:", recall_score(y_test, y_pred_svm, average='macro'))
print("SVM F1 score:", f1_score(y_test, y_pred_svm, average='macro'))
```
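The per-model metric code above can also be written as a loop, which makes it easier to add more classifiers and to pick a winner programmatically. The following is a minimal sketch, not the only way to do it: the helper name `compare_models`, the use of `make_pipeline` to bundle scaling with each classifier, and the choice of macro F1 as the selection criterion are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(models, X_train, X_test, y_train, y_test):
    """Hypothetical helper: fit each model, return {name: {metric: value}}."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, average="macro"),
            "recall": recall_score(y_test, y_pred, average="macro"),
            "f1": f1_score(y_test, y_pred, average="macro"),
        }
    return results


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Each pipeline standardizes the features, then applies the classifier.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0)),
}

results = compare_models(models, X_train, X_test, y_train, y_test)
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})

# Assumption: macro F1 is used as the single selection criterion.
best_name = max(results, key=lambda n: results[n]["f1"])
print("Best by macro F1:", best_name)
```

On a dataset this small the test set has only 45 samples, so differences of one or two misclassified flowers should not be over-interpreted.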
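Comparing accuracy, precision, recall, and F1 score then shows which algorithm performs best. Note that each algorithm's performance depends on its hyperparameters (for example `n_neighbors` for KNN, or `kernel` and `C` for SVM), so these should be tuned before drawing a final conclusion.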
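One common way to do this tuning is a cross-validated grid search on the training set. The sketch below uses scikit-learn's `GridSearchCV`; the candidate values in each grid, the 5-fold setting, and the `f1_macro` scoring choice are illustrative assumptions rather than recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scaling lives inside the pipeline so it is re-fitted on each CV fold.
# The grids below are assumed example values, not tuned recommendations.
searches = {
    "KNN": (
        Pipeline([("scaler", StandardScaler()), ("clf", KNeighborsClassifier())]),
        {"clf__n_neighbors": [1, 3, 5, 7, 9]},
    ),
    "SVM": (
        Pipeline([("scaler", StandardScaler()), ("clf", SVC())]),
        {"clf__kernel": ["linear", "rbf"], "clf__C": [0.1, 1.0, 10.0]},
    ),
}

tuned = {}
for name, (pipeline, grid) in searches.items():
    search = GridSearchCV(pipeline, grid, cv=5, scoring="f1_macro")
    search.fit(X_train, y_train)
    tuned[name] = search
    print(name, "best params:", search.best_params_)
    print(name, "cross-validated macro F1:", round(search.best_score_, 3))
    # score() uses the same scoring as the search, i.e. macro F1 here.
    print(name, "test macro F1:", round(search.score(X_test, y_test), 3))
```

The test set is touched only once per model, after the search has finished, so the reported test score is not biased by the tuning itself.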