ID3 Decision Tree in Python
Posted: 2024-03-31 20:31:04
ID3 is a classic machine learning algorithm for building decision trees on classification problems. It grows the tree greedily: at each node it computes the entropy of the labels, picks the feature with the highest information gain, and recurses on each value of that feature. Below is a simple Python implementation:
```python
import numpy as np

def entropy(y):
    """Shannon entropy of the label array y."""
    _, counts = np.unique(y, return_counts=True)
    probabilities = counts / len(y)
    return -np.sum(probabilities * np.log2(probabilities))

def information_gain(X, y, feature_index):
    """Reduction in entropy from splitting on the given feature."""
    entropy_before_split = entropy(y)
    weighted_entropy_after_split = 0.0
    for value in np.unique(X[:, feature_index]):
        subset_y = y[X[:, feature_index] == value]
        weighted_entropy_after_split += len(subset_y) / len(y) * entropy(subset_y)
    return entropy_before_split - weighted_entropy_after_split

def id3(X, y, features):
    # All samples share one class: return it as a leaf.
    if len(np.unique(y)) == 1:
        return y[0]
    # No features left to split on: return the majority class.
    if len(features) == 0:
        return np.bincount(y).argmax()
    information_gains = [information_gain(X, y, i) for i in range(X.shape[1])]
    best_feature_index = int(np.argmax(information_gains))
    best_feature = features[best_feature_index]
    tree = {best_feature: {}}
    remaining_features = [f for i, f in enumerate(features) if i != best_feature_index]
    for value in np.unique(X[:, best_feature_index]):
        subset_mask = X[:, best_feature_index] == value
        # Drop the used feature's column so that column indices stay
        # aligned with the shrinking feature-name list on recursion.
        subset_X = np.delete(X[subset_mask], best_feature_index, axis=1)
        subset_y = y[subset_mask]
        tree[best_feature][value] = id3(subset_X, subset_y, remaining_features)
    return tree

# Example usage
X = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 0]])
y = np.array([1, 1, 0, 0])
features = ['feature1', 'feature2', 'feature3']
decision_tree = id3(X, y, features)
print(decision_tree)
```
This code implements a simple ID3 decision tree: computing entropy, computing information gain, and recursively building the tree as a nested dictionary of the form `{feature: {value: subtree_or_label}}`. You can adapt it to your own data and features.
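To classify a new sample, you walk the nested dictionary from the root until you hit a leaf label. The original answer does not include such a helper, so the `predict` function below is an illustrative sketch that assumes the `{feature: {value: subtree_or_label}}` tree format produced above and a sample given as a full feature vector:

```python
def predict(tree, feature_names, sample):
    """Walk the nested dict tree until a leaf label is reached.

    tree          -- nested dict {feature: {value: subtree_or_label}} or a leaf label
    feature_names -- names of all features, in the order of the sample vector
    sample        -- full feature vector for one example
    """
    while isinstance(tree, dict):
        feature = next(iter(tree))                    # feature this node splits on
        value = sample[feature_names.index(feature)]  # look up the sample's value
        tree = tree[feature][value]                   # descend into that branch
    return tree

# Hand-built tree: split on feature1; value 1 -> class 1, value 0 -> class 0.
tree = {'feature1': {1: 1, 0: 0}}
names = ['feature1', 'feature2', 'feature3']
print(predict(tree, names, [1, 0, 1]))  # 1
print(predict(tree, names, [0, 1, 1]))  # 0
```

Note that `predict` raises a `KeyError` if the sample contains a feature value that never appeared in the training data; a production implementation would fall back to a default class in that case.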