Python ID3 Decision Tree Code
Here is a sample Python implementation of the ID3 decision tree:
```python
import numpy as np

def entropy(y):
    """Compute the entropy of the label array y."""
    _, counts = np.unique(y, return_counts=True)
    probs = counts / len(y)
    return np.sum(-probs * np.log2(probs))

def information_gain(X, y, feature_index):
    """Compute the information gain of splitting on the given feature."""
    parent_entropy = entropy(y)
    weighted_entropy = 0.0
    for value in np.unique(X[:, feature_index]):
        subset = y[X[:, feature_index] == value]
        weighted_entropy += (len(subset) / len(y)) * entropy(subset)
    return parent_entropy - weighted_entropy

def id3(X, y, features):
    """Build a decision tree with the ID3 algorithm.

    `features` is a list of column indices of X still available for
    splitting; the tree is returned as a nested dict of the form
    {feature_index: {feature_value: subtree_or_label}}.
    """
    # If all instances belong to one class, return that class.
    if len(np.unique(y)) == 1:
        return y[0]
    # If no features are left to split on, return the majority class.
    if len(features) == 0:
        unique, counts = np.unique(y, return_counts=True)
        return unique[np.argmax(counts)]
    # Compute the information gain of every remaining feature and
    # pick the one with the largest gain.  (Note: gains are computed
    # over the actual column indices in `features`, so the indices
    # stay valid in recursive calls even as `features` shrinks.)
    gains = [information_gain(X, y, f) for f in features]
    best_feature = features[int(np.argmax(gains))]
    # Create a new decision-tree node for the chosen feature.
    tree = {best_feature: {}}
    # Remove the chosen feature from the candidate list.
    remaining = [f for f in features if f != best_feature]
    # Recursively build a subtree for each value of the feature.
    for value in np.unique(X[:, best_feature]):
        mask = X[:, best_feature] == value
        tree[best_feature][value] = id3(X[mask], y[mask], remaining)
    return tree
```
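As a quick self-contained sanity check of the entropy formula used above: a perfectly balanced binary class split carries exactly 1 bit of information, while a pure set carries 0.

```python
import numpy as np

# a 50/50 label split -> entropy should be exactly 1 bit
y = np.array(['yes', 'yes', 'no', 'no'])
_, counts = np.unique(y, return_counts=True)
probs = counts / len(y)
h = np.sum(-probs * np.log2(probs))
print(h)  # 1.0
```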
This snippet implements a simple ID3 decision-tree classifier: it computes entropy, computes information gain, and recursively builds the tree as a nested dict. You can adapt and extend it to fit your own needs. Hope it helps!
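The tree is a nested dict whose single key at each node is the split feature's column index. A small `predict` helper (not part of the original snippet; a hypothetical addition) can walk it to classify a new sample. The tree below is hand-written in the shape `id3` produces, so the example is self-contained:

```python
def predict(tree, x):
    """Walk the nested-dict tree until a leaf label is reached."""
    while isinstance(tree, dict):
        feature = next(iter(tree))        # the node's single key is the split feature index
        tree = tree[feature][x[feature]]  # descend along the branch matching x's value
    return tree

# example tree: split on feature 0; leaves are class labels
tree = {0: {0: 'no', 1: 'yes'}}
print(predict(tree, [1, 0]))  # -> 'yes'
```

Note that this helper raises a `KeyError` for a feature value never seen during training, which is an inherent limitation of plain ID3.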