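1. Read the specified offline iris dataset and complete the following tasks as required. 1.1 Build a logistic regression model with degree-3 polynomial features; 1.2 Validate on the entire training set and output the accuracy; 1.3 Plot the final classification figure; 1.4 Comment the key parts of the code. Write the Python code.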
Time: 2024-10-20 11:09:42 Views: 31
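To help you complete this task, we write the Python code step by step and explain the key parts. The code relies on `numpy`, `matplotlib`, and `sklearn` (including the `datasets`, `preprocessing`, `model_selection`, and `linear_model` modules); `pandas` is needed only if you read the data from your own CSV file. Note that the code loads the copy of the dataset that ships with scikit-learn through `datasets.load_iris()`, which is read from the installed package and needs no download. If your assignment supplies the dataset as a local CSV file, load that file instead.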
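If the offline dataset is supplied as a CSV file, reading it could look like the minimal sketch below. The function name `load_iris_csv`, the default label column name `species`, and the file layout (one header row, four numeric feature columns, one text label column) are all assumptions made for illustration, since the question does not specify the file format.

```python
import pandas as pd


def load_iris_csv(path, label_column="species"):
    """Read a local iris CSV and return (X, y, class_names).

    Assumed layout (hypothetical): one header row, four numeric feature
    columns, and one text label column named by `label_column`.
    """
    df = pd.read_csv(path)
    # Encode the text labels as integers 0..k-1 in alphabetical order
    codes, class_names = pd.factorize(df[label_column], sort=True)
    # Everything except the label column is treated as a feature
    X = df.drop(columns=[label_column]).to_numpy(dtype=float)
    return X, codes, list(class_names)
```

The returned `X` and `y` arrays have the same shape and meaning as `iris.data` and `iris.target`, so the rest of the walkthrough works on them regardless of where the data comes from.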
```python
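import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def make_model():
    # 1.1 Degree-3 polynomial features feeding a logistic regression.
    # include_bias=False because LogisticRegression fits its own intercept;
    # scaling keeps the cubic terms from dominating and helps the solver converge.
    return make_pipeline(
        PolynomialFeatures(degree=3, include_bias=False),
        StandardScaler(),
        LogisticRegression(max_iter=1000),
    )


# 1. Load the iris dataset (bundled with scikit-learn, so no download is needed)
iris = datasets.load_iris()
X = iris.data
y = iris.target

# 1.2 Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 1.2 Fit the model, then validate on the entire training set
logreg = make_model()
logreg.fit(X_train, y_train)
train_accuracy = np.mean(logreg.predict(X_train) == y_train)
test_accuracy = np.mean(logreg.predict(X_test) == y_test)
print(f"Training-set accuracy: {train_accuracy * 100:.2f}%")
print(f"Test-set accuracy: {test_accuracy * 100:.2f}%")


# 1.3 Plot the final classification figure
def plot_decision_regions(X, y, model, resolution=0.02):
    """Plot the decision regions of a model fitted on exactly two features."""
    markers = ('o', 's', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # Evaluate the model on a grid that covers the data range
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                         np.arange(y_min, y_max, resolution))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.8, cmap=cmap)

    # Overlay the samples, one marker and colour per class
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(X[y == cl, 0], X[y == cl, 1],
                    marker=markers[idx], c=colors[idx], label=f'Class {cl}',
                    edgecolor='black', s=40)
    plt.legend(loc='upper left', fontsize='medium')


# A 2-D plot needs a model with two inputs, so the same pipeline is refitted
# on petal length and petal width (columns 2 and 3) for the figure only.
X_plot = X[:, 2:4]
plot_model = make_model().fit(X_plot, y)
plot_decision_regions(X_plot, y, plot_model)
plt.xlabel(iris.feature_names[2])
plt.ylabel(iris.feature_names[3])
plt.title('Logistic Regression Decision Boundaries with 3rd Degree Polynomial Features')
plt.show()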
```