Classifying the Wine Dataset with Three Ensemble Algorithms (with Code)
Posted: 2024-12-18 21:28:46
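In Python, the ensemble learning algorithms in scikit-learn can be used to classify the wine dataset. The example below uses three common ensemble methods: random forest, gradient boosting, and AdaBoost.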
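First, install the required libraries: `scikit-learn`, `pandas`, and `numpy`. The package is published on PyPI as `scikit-learn`, although it is imported as `sklearn`. If any of them are missing, install them with pip:
```bash
pip install pandas scikit-learn numpy
```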
Then follow these steps:
1. Import the libraries and load the dataset:
```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.metrics import accuracy_score
# Load the wine dataset
wine = load_wine()
df = pd.DataFrame(data=wine.data, columns=wine.feature_names)
target = wine.target
```
2. Split the data into training and test sets:
```python
X_train, X_test, y_train, y_test = train_test_split(df, target, test_size=0.2, random_state=42)
```
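The wine dataset is small (178 samples in three classes), so a 20% test split holds only 36 samples. As an optional variation that is not part of the original steps, passing `stratify` to `train_test_split` keeps the class proportions similar in both subsets. A minimal sketch, with variable names chosen here for illustration:

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Load features and labels directly as NumPy arrays
X, y = load_wine(return_X_y=True)

# stratify=y asks the splitter to preserve the class proportions
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count the samples per class in each subset
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print("Train class counts:", train_counts)
print("Test class counts:", test_counts)
```

Whether stratification changes the reported accuracy depends on the random split, so treat it as a safeguard rather than a guaranteed improvement.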
3. Define and train the models:
```python
# Random forest
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train, y_train)
# Gradient boosting
gb_clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
gb_clf.fit(X_train, y_train)
# AdaBoost
ab_clf = AdaBoostClassifier(n_estimators=100, random_state=42)
ab_clf.fit(X_train, y_train)
```
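Because the three models are created and fitted in the same way, they can also be kept in a dictionary and trained in a loop. The sketch below is one possible arrangement, not the only one; the dictionary name `models` and its keys are chosen here for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same hyperparameters as in the step above, keyed by a display name
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, random_state=42
    ),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=42),
}

# Fit every model on the same training split
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name} trained on {len(y_train)} samples")
```

Any later step can then iterate over the same dictionary instead of repeating one line per model.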
4. Predict and evaluate:
```python
y_pred_rf = rf_clf.predict(X_test)
y_pred_gb = gb_clf.predict(X_test)
y_pred_ab = ab_clf.predict(X_test)
accuracy_rf = accuracy_score(y_test, y_pred_rf)
accuracy_gb = accuracy_score(y_test, y_pred_gb)
accuracy_ab = accuracy_score(y_test, y_pred_ab)
print(f"Random Forest Accuracy: {accuracy_rf}")
print(f"Gradient Boosting Accuracy: {accuracy_gb}")
print(f"AdaBoost Accuracy: {accuracy_ab}")
```
This prints the accuracy of each model on the held-out test set.
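A single split of such a small dataset gives an accuracy figure that can shift noticeably with `random_state`. As a hedged complement to the step above, the sketch below uses `cross_val_score` with 5 folds to report a mean and standard deviation per model. The fold count and the `cv_results` dictionary are assumptions made for this example.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, random_state=42
    ),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=42),
}

cv_results = {}
for name, model in models.items():
    # With an integer cv and a classifier, scikit-learn uses stratified folds
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The exact numbers depend on the installed scikit-learn version, so no expected output is shown here.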