
Python code to import the iris dataset from an external source, perform binary classification, and visualize it

Time: 2023-09-08 22:09:25 Views: 116

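Sure, here is the Python code:

```
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the dataset
iris = load_iris()
X = iris.data[:, :2]  # keep the first two features (sepal length, sepal width)
y = (iris.target != 0) * 1  # binary labels: class 0 (setosa) -> 0, all others -> 1

# Visualize the data
plt.scatter(X[y == 0][:, 0], X[y == 0][:, 1], color='red', label='0')
plt.scatter(X[y == 1][:, 0], X[y == 1][:, 1], color='blue', label='1')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.legend()
plt.show()  # display the figure when run as a script

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Evaluate the model
print("Train accuracy:", clf.score(X_train, y_train))
print("Test accuracy:", clf.score(X_test, y_test))
```

This code performs the following steps:

1. Loads the iris dataset from the `sklearn` library.
2. Keeps the first two features of the dataset and turns the task into a binary classification problem (class 0 versus all other classes).
3. Visualizes the dataset as a scatter plot of the two features, colored by binary label.
4. Splits the dataset into a training set and a test set.
5. Trains a logistic regression model and prints its accuracy on the training set and the test set.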
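The question asks for the dataset to be imported from an external source, while the answer loads the copy bundled with `sklearn`. A minimal sketch of one way to read the data from a CSV file instead is given here. The column names (`sepal_length`, `sepal_width`, `species`), the helper `load_binary_iris`, and the inline sample rows are assumptions made for illustration; a real file may use a different layout.

```python
import csv
import io

# Hypothetical inline sample standing in for an external CSV file; the column
# names are assumptions. With a real file, replace the StringIO object with
# open("iris.csv", newline="") (hypothetical path).
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""


def load_binary_iris(file_obj):
    """Return (X, y): the two sepal features and a 0/1 label for each row."""
    X, y = [], []
    for row in csv.DictReader(file_obj):
        X.append([float(row["sepal_length"]), float(row["sepal_width"])])
        # Same labelling as the answer's code: setosa -> 0, any other species -> 1
        y.append(0 if row["species"] == "setosa" else 1)
    return X, y


X, y = load_binary_iris(io.StringIO(SAMPLE_CSV))
print(len(X), y)  # → 6 [0, 0, 1, 1, 1, 1]
```

The resulting `X` and `y` can be passed to `train_test_split` and `LogisticRegression.fit` in place of the arrays built from `load_iris()`. To reuse the scatter-plot code, which relies on boolean indexing such as `X[y == 0]`, convert them with `np.asarray` first.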
