Implementing a strong classifier in Python with the Bagging algorithm, using ID3 decision trees as base classifiers
The Bagging algorithm can be used to build a strong classifier, with ID3 decision trees serving as the base classifiers. The steps of a Python implementation are as follows:
1. Import the required libraries
```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier
```
2. Define the Bagging class
```python
class Bagging:
    def __init__(self, base_estimator=DecisionTreeClassifier(), n_estimators=10):
        self.base_estimator = base_estimator
        self.n_estimators = n_estimators
        self.estimators = [None] * n_estimators

    def fit(self, X, y):
        for i in range(self.n_estimators):
            # Draw a bootstrap sample (sampling with replacement) from the training set
            indices = np.random.choice(X.shape[0], size=X.shape[0], replace=True)
            X_subset, y_subset = X[indices], y[indices]
            # Train an independent copy of the base classifier on the bootstrap sample;
            # without clone(), every round would refit and overwrite the same object
            estimator = clone(self.base_estimator).fit(X_subset, y_subset)
            self.estimators[i] = estimator

    def predict(self, X):
        # Collect each base classifier's prediction for every test sample, then majority-vote
        predictions = np.zeros((X.shape[0], self.n_estimators), dtype=int)
        for i in range(self.n_estimators):
            predictions[:, i] = self.estimators[i].predict(X)
        # bincount tallies the votes per class label; argmax picks the most frequent one
        return np.apply_along_axis(lambda x: np.bincount(x).argmax(), axis=1, arr=predictions)
```
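The prediction step uses `np.bincount(...).argmax()` to implement majority voting across the base classifiers. A minimal sketch of that mechanism on a single row of votes (the vote values below are made up purely for illustration):
```python
import numpy as np

# Hypothetical class labels predicted for one test sample by 10 base classifiers
votes = np.array([2, 0, 2, 2, 1, 2, 0, 2, 1, 2])
counts = np.bincount(votes)   # votes per class label: [2, 2, 6]
print(counts.argmax())        # 2 -- the majority class wins
```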
3. Use the Bagging class to classify data
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create the Bagging classifier with an entropy-based decision tree as the base classifier
bagging = Bagging(base_estimator=DecisionTreeClassifier(criterion='entropy', max_depth=3), n_estimators=10)
# Train the ensemble
bagging.fit(X_train, y_train)
# Predict on the test set
y_pred = bagging.predict(X_test)
# Compute the accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```
In the code above, we use the iris dataset as an example. We first split the data into training and test sets, then create a Bagging classifier whose base classifier is a decision tree with criterion='entropy'. Note that scikit-learn's DecisionTreeClassifier is a CART-style tree; setting criterion='entropy' makes it split on information gain like ID3, so it approximates ID3 rather than implementing it exactly. We then train the model, predict on the test set, and finally compute the accuracy.
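For comparison, scikit-learn also ships a ready-made bagging ensemble. A minimal sketch of the equivalent setup with `sklearn.ensemble.BaggingClassifier`, reusing the train/test split from above (note: scikit-learn ≥ 1.2 names the base-learner parameter `estimator`; older versions use `base_estimator`):
```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Same idea as the hand-written class: 10 bootstrap-trained entropy trees with majority voting
clf = BaggingClassifier(
    estimator=DecisionTreeClassifier(criterion='entropy', max_depth=3),
    n_estimators=10,
    random_state=42,
)
clf.fit(X_train, y_train)                                        # split defined above
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```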