Training several sklearn classification models on the Iris dataset (iris.csv) and comparing their classification performance
Date: 2024-12-16 19:25:24
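The Iris dataset is a classic introductory machine learning example, commonly used to demonstrate classification algorithms. It contains 150 samples, each with four continuous features (petal length, petal width, sepal length, and sepal width) and one class label identifying one of three iris species (setosa, versicolor, virginica; 50 samples each). Several common sklearn classifiers can be trained on this dataset, including decision trees (Decision Tree), K-nearest neighbors (K-Nearest Neighbors, KNN), naive Bayes (Naive Bayes), and support vector machines (SVM). Note that linear regression (Linear Regression) is a regression method that predicts continuous values, so it is not suitable for classification; logistic regression (Logistic Regression) is the linear model to use for a classification task.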
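The title refers to iris.csv, while the walkthrough loads sklearn's bundled copy with load_iris. If the data really does come from a CSV file, loading it might look like the minimal sketch below. No actual file is assumed here, so the sketch first writes a stand-in iris.csv to a temporary directory. The label column name `species` is an assumption and may differ in a real file.

```python
import os
import tempfile

import pandas as pd
from sklearn.datasets import load_iris

# Build a stand-in iris.csv from sklearn's bundled copy so the sketch is
# self-contained. The column name "species" is an assumption.
iris = load_iris(as_frame=True)
frame = iris.frame.rename(columns={"target": "species"})
frame["species"] = frame["species"].map(dict(enumerate(iris.target_names)))

csv_path = os.path.join(tempfile.mkdtemp(), "iris.csv")
frame.to_csv(csv_path, index=False)

# Read the file back the way a real iris.csv would be read.
df = pd.read_csv(csv_path)
X = df.drop(columns="species")  # the four numeric feature columns
y = df["species"]               # string class labels

print(X.shape)
print(y.value_counts().to_dict())
```

sklearn classifiers accept string labels directly, so `X` and `y` can be passed to train_test_split and the models without encoding the species names first.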
The steps are outlined below:
Import the required libraries:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

Load and preprocess the data:

# Load sklearn's bundled copy of the dataset into a DataFrame
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Hold out 20% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(
    df.drop('target', axis=1), df['target'], test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Train and evaluate the models:

# LogisticRegression serves as the linear model: LinearRegression predicts
# continuous values, which accuracy_score cannot score against class labels
models = {
    'Logistic Regression': LogisticRegression(),
    'Decision Tree': DecisionTreeClassifier(),
    'KNN': KNeighborsClassifier(n_neighbors=3),
    'Naive Bayes': GaussianNB(),
    'SVM': SVC(kernel='linear')
}

# Train each model and report its performance on the held-out test set
for name, model in models.items():
    model.fit(X_train_scaled, y_train)
    predictions = model.predict(X_test_scaled)
    print(f"{name} - Accuracy: {accuracy_score(y_test, predictions)}\n{classification_report(y_test, predictions)}\n")
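The output shows each model's accuracy on the test set together with a detailed classification report (precision, recall, and F1-score for each class).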
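With test_size=0.2 the test set holds only 30 samples, so a single split can make the models look closer together or further apart than they really are. One possible way to make the comparison steadier is stratified k-fold cross-validation. The sketch below is an illustrative addition to the steps above, not part of them. It assumes 5 folds and swaps in LogisticRegression as the linear classifier. It wraps each model in a pipeline so the scaler is refit on the training portion of every fold.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="linear"),
}

# 5 folds is an assumption; stratification keeps the class balance per fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_results = {}
for name, model in models.items():
    # The pipeline refits the scaler inside each fold, so no statistics
    # from the validation portion leak into training.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())

# Print the models from highest to lowest mean accuracy.
for name, (mean, std) in sorted(
    cv_results.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{name:20s} {mean:.3f} +/- {std:.3f}")
```

The mean shows how well each model does on average, and the standard deviation shows how much that figure moves from fold to fold. On a dataset this small, differences between models that are smaller than the spread should be read with caution.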