Classifying the wine dataset with logistic regression, with visualization code
Date: 2023-07-26 10:49:58 Views: 158
Sure, here is code that classifies the wine dataset with logistic regression and visualizes the data:
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the wine dataset
wine = load_wine()

# Convert it to a pandas DataFrame and attach the class labels
wine_df = pd.DataFrame(wine.data, columns=wine.feature_names)
wine_df['target'] = wine.target

# Inspect the dataset (info() prints its summary itself and returns None)
wine_df.info()

# Visualize the data: pairwise plots of four features, colored by class
sns.pairplot(wine_df, vars=['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash'], hue='target')
plt.show()

# Split into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(
    wine_df.drop('target', axis=1), wine_df['target'],
    test_size=0.3, random_state=42)

# Train the model; the features are unscaled, so raise max_iter above the
# default of 100 to give the solver room to converge
lr = LogisticRegression(max_iter=10000)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Evaluate accuracy on the test set
acc_score = accuracy_score(y_test, y_pred)
print('Accuracy:', acc_score)
```
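This code first loads the wine dataset and converts it to a pandas DataFrame. It then uses seaborn to visualize the data: `pairplot` draws pairwise scatter plots of four features (alcohol, malic_acid, ash, alcalinity_of_ash), with points colored by the target variable (target). Next, `train_test_split` splits the data into a training set and a test set (30% held out for testing), a `LogisticRegression` model is fitted on the training set, and predictions are made on the test set. Finally, `accuracy_score` computes the accuracy on the test set as a measure of the model's performance.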
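Accuracy alone does not show which classes get confused with each other, and logistic regression tends to converge more reliably when the features are standardized. The sketch below is one possible variation on the steps described above, not part of the original answer: it assumes a `StandardScaler` + `LogisticRegression` pipeline, a stratified split (`stratify=y`), and `max_iter=1000`, all of which are illustrative choices.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels directly as NumPy arrays
X, y = load_wine(return_X_y=True)

# Same 70/30 split as above; stratify=y (an assumption added here)
# keeps the class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Standardize the features, then fit logistic regression.
# The scaler is fitted on the training data only, inside the pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Overall accuracy plus a per-class breakdown:
# rows of the confusion matrix are true classes, columns are predictions
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Accuracy:", acc)
print(cm)
```

To show the result graphically, the matrix `cm` could be passed to `sns.heatmap(cm, annot=True, fmt="d")` followed by `plt.show()`, which would give a visualization of the classification result rather than only of the raw data.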