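1. Read the specified offline Iris dataset and complete the following tasks. 1.1 Build a logistic regression model with degree-3 polynomial features; 1.2 Validate on the full training set and print the accuracy; 1.3 Plot the final classification figure.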
Posted: 2024-10-19 20:03:02
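In Python, you can use the scikit-learn library for this. It bundles the Iris dataset (`load_iris` reads a copy shipped with the package, so no download is needed) along with a range of machine learning tools. First, import the required libraries and load the data: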
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score  # needed for the accuracy calculations below
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
# 1. Load the dataset
iris = load_iris()
X = iris.data
y = iris.target
# 1.1 Build degree-3 polynomial features
poly_features = PolynomialFeatures(degree=3)
X_poly = poly_features.fit_transform(X)
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_poly, y, test_size=0.2, random_state=42)
# Fit a logistic regression model; max_iter is raised because the unscaled
# polynomial features often fail to converge within the default 100 iterations
logreg = LogisticRegression(max_iter=10000)
logreg.fit(X_train, y_train)
# 1.2 Evaluate on the training set and print the accuracy
y_pred = logreg.predict(X_train)
train_accuracy = accuracy_score(y_train, y_pred)
print(f"Training accuracy: {train_accuracy * 100:.2f}%")
# Also report the accuracy on the held-out test set
y_pred_test = logreg.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred_test)
print(f"Test accuracy: {test_accuracy * 100:.2f}%")
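# 1.3 Plot the classification regions over the first two features (sepal length, sepal width).
# The model was trained on all four features, so the two petal features are held at
# their mean values here; the figure is a 2-D slice of the decision regions.
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200), np.linspace(y_min, y_max, 200))
grid = np.column_stack([
    xx.ravel(),
    yy.ravel(),
    np.full(xx.size, X[:, 2].mean()),
    np.full(xx.size, X[:, 3].mean()),
])
# Expand the grid with the same polynomial transformer before predicting
Z = logreg.predict(poly_features.transform(grid)).reshape(xx.shape)
plt.figure(figsize=(8, 6))
plt.contourf(xx, yy, Z, levels=[-0.5, 0.5, 1.5, 2.5], colors=['blue', 'green', 'red'], alpha=0.2)
plt.scatter(X[y == 0, 0], X[y == 0, 1], color='blue', label="Setosa")
plt.scatter(X[y == 1, 0], X[y == 1, 1], color='green', label="Versicolor")
plt.scatter(X[y == 2, 0], X[y == 2, 1], color='red', label="Virginica")
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.legend(loc='upper left')
plt.show()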
```
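This code first builds degree-3 polynomial features, then splits the data into training and test sets, trains a logistic regression model, computes and prints the training and test accuracy, and finally plots the classification figure.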
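Task 1.2 asks for accuracy over the whole training set, so one possible variant is to fit on all 150 samples and score the same data. The sketch below is an assumption about how that could look, not part of the original answer: it chains the polynomial expansion, a `StandardScaler` (added here only to help the solver converge), and the classifier in a `Pipeline`. The step names `poly`, `scale`, and `clf` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_iris(return_X_y=True)

# Degree-3 expansion, then scaling, then multiclass logistic regression.
# include_bias=False drops the constant column because the classifier fits its own intercept.
model = Pipeline([
    ("poly", PolynomialFeatures(degree=3, include_bias=False)),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)

# Number of columns produced by the polynomial step for the 4 input features
n_poly_features = model.named_steps["poly"].n_output_features_
# Accuracy measured on the same samples the model was fitted on
full_accuracy = accuracy_score(y, model.predict(X))

print(f"Expanded features: {n_poly_features}")
print(f"Accuracy on all {len(y)} samples: {full_accuracy:.2%}")
```

Accuracy measured on the fitting data tends to be optimistic, which is why the main script also reports a held-out test score.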
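The question mentions a specified offline dataset, which may mean a local file rather than the copy bundled with scikit-learn. A minimal sketch for that case follows, assuming a CSV with four numeric feature columns followed by one label column. The helper `load_iris_csv` and the file name `iris.csv` are hypothetical, and the file is generated here from the bundled data only so that the example runs on its own.

```python
import os
import tempfile

import pandas as pd
from sklearn.datasets import load_iris


def load_iris_csv(path):
    """Read a local Iris CSV (assumed layout: 4 feature columns, then a label column)."""
    df = pd.read_csv(path)
    X = df.iloc[:, :4].to_numpy(dtype=float)
    # Convert string labels to integer codes in order of first appearance
    y, class_names = pd.factorize(df.iloc[:, 4])
    return X, y, [str(name) for name in class_names]


# Stand-in for the assignment's file: write the bundled data to a temporary CSV
iris = load_iris(as_frame=True)
frame = iris.frame.copy()
label_map = {i: str(name) for i, name in enumerate(iris.target_names)}
frame["target"] = frame["target"].map(label_map)
csv_path = os.path.join(tempfile.mkdtemp(), "iris.csv")
frame.to_csv(csv_path, index=False)

X, y, class_names = load_iris_csv(csv_path)
print(X.shape, y.shape, class_names)
```

With a real assignment file, only the path and possibly the column positions would need to change; the resulting `X` and `y` can be passed to the same polynomial and logistic regression steps.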