Python Implementation of a C4.5 Decision Tree
Below is a Python implementation of a C4.5 decision tree. Unlike ID3, which splits on raw information gain, C4.5 selects the split feature by information gain ratio (gain divided by split information), which corrects ID3's bias toward features with many distinct values.
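For reference, with $p_k$ the fraction of samples in class $k$ and $D_v$ the subset of $D$ where feature $A$ takes value $v$, the quantities the code computes are:

$$
H(D) = -\sum_k p_k \log_2 p_k, \qquad
\mathrm{Gain}(D, A) = H(D) - \sum_v \frac{|D_v|}{|D|} H(D_v)
$$

$$
\mathrm{SplitInfo}(D, A) = -\sum_v \frac{|D_v|}{|D|} \log_2 \frac{|D_v|}{|D|}, \qquad
\mathrm{GainRatio}(D, A) = \frac{\mathrm{Gain}(D, A)}{\mathrm{SplitInfo}(D, A)}
$$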
```python
import pandas as pd
import math

class C45DecisionTree:
    def __init__(self, epsilon=0.1):
        # pre-pruning threshold: stop splitting when the best
        # information gain ratio falls below epsilon
        self.epsilon = epsilon
        self.tree = {}
        self.feature_names = []

    def calc_entropy(self, data):
        """Shannon entropy of the class labels (last column) of `data`."""
        n = len(data)
        label_counts = {}
        for row in data:
            label = row[-1]
            label_counts[label] = label_counts.get(label, 0) + 1
        entropy = 0.0
        for count in label_counts.values():
            prob = float(count) / n
            entropy -= prob * math.log(prob, 2)
        return entropy

    def split_data(self, data, axis, value):
        ret_data = []
        for row in data:
            if row[axis] == value:
                reduced_row = row[:axis]
                reduced_row.extend(row[axis+1:])
                ret_data.append(reduced_row)
        return ret_data

    def choose_best_feature(self, data):
        """Return (index, gain ratio) of the feature with the highest
        information gain ratio, or (-1, 0.0) if no feature helps."""
        num_features = len(data[0]) - 1
        base_entropy = self.calc_entropy(data)
        best_info_gain_ratio = 0.0
        best_feature = -1
        for i in range(num_features):
            unique_vals = set(row[i] for row in data)
            new_entropy = 0.0
            split_info = 0.0
            for value in unique_vals:
                sub_data = self.split_data(data, i, value)
                prob = len(sub_data) / float(len(data))
                new_entropy += prob * self.calc_entropy(sub_data)
                split_info -= prob * math.log(prob, 2)
            # a feature with a single value carries no split information
            if split_info == 0:
                continue
            info_gain = base_entropy - new_entropy
            info_gain_ratio = info_gain / split_info
            if info_gain_ratio > best_info_gain_ratio:
                best_info_gain_ratio = info_gain_ratio
                best_feature = i
        return best_feature, best_info_gain_ratio

    def majority_cnt(self, label_list):
        label_counts = {}
        for vote in label_list:
            if vote not in label_counts:
                label_counts[vote] = 0
            label_counts[vote] += 1
        sorted_label_counts = sorted(label_counts.items(), key=lambda x: x[1], reverse=True)
        return sorted_label_counts[0][0]

    def create_tree(self, data, labels):
        class_list = [row[-1] for row in data]
        # all samples share one class: return a leaf
        if class_list.count(class_list[0]) == len(class_list):
            return class_list[0]
        # no features left to split on: return the majority class
        if len(data[0]) == 1:
            return self.majority_cnt(class_list)
        best_feat, best_ratio = self.choose_best_feature(data)
        # pre-prune: stop if no feature reaches the gain-ratio threshold
        if best_feat == -1 or best_ratio < self.epsilon:
            return self.majority_cnt(class_list)
        best_feat_label = labels[best_feat]
        my_tree = {best_feat_label: {}}
        del labels[best_feat]
        feat_values = [row[best_feat] for row in data]
        for value in set(feat_values):
            sub_labels = labels[:]
            my_tree[best_feat_label][value] = self.create_tree(
                self.split_data(data, best_feat, value), sub_labels)
        return my_tree

    def fit(self, X, y):
        # keep the original column order for lookups during prediction
        self.feature_names = list(X.columns)
        data = pd.concat([X, y], axis=1).values.tolist()
        labels = list(X.columns) + ['label']
        self.tree = self.create_tree(data, labels)

    def predict(self, X):
        return [self.predict_single(x) for x in X.values.tolist()]

    def predict_single(self, x):
        node = self.tree
        # walk down until we reach a leaf (anything that is not a dict)
        while isinstance(node, dict):
            feature, branches = next(iter(node.items()))
            index = self.feature_names.index(feature)
            node = branches.get(x[index])
            if node is None:
                # feature value unseen during training: fall back to
                # an arbitrary branch rather than raising KeyError
                node = next(iter(branches.values()))
        return node

# Demo on the iris dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
# the tree branches on exact feature values, so discretize the
# continuous iris features into three bins per column first
X = X.apply(lambda col: pd.cut(col, bins=3, labels=False))
y = pd.Series(iris.target, name='label')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = C45DecisionTree()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print('Accuracy:', accuracy_score(y_test, y_pred))
```
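The class above treats every feature as categorical, creating one branch per distinct value, which is why the demo bins the iris data first. Full C4.5 also handles continuous attributes directly with binary threshold splits. Below is a minimal standalone sketch of that step; the helpers `entropy` and `best_threshold` are illustrative names, not part of the class above:

```python
import math

def entropy(rows):
    """Shannon entropy of the class labels (last column) in `rows`."""
    n = len(rows)
    counts = {}
    for row in rows:
        counts[row[-1]] = counts.get(row[-1], 0) + 1
    return -sum(c / n * math.log(c / n, 2) for c in counts.values())

def best_threshold(data, axis):
    """Find the binary threshold on column `axis` that maximizes
    information gain, trying midpoints between adjacent distinct
    values as C4.5 does for numeric attributes."""
    base = entropy(data)
    values = sorted(set(row[axis] for row in data))
    best_t, best_gain = None, 0.0
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [row for row in data if row[axis] <= t]
        right = [row for row in data if row[axis] > t]
        gain = base - (len(left) / len(data) * entropy(left)
                       + len(right) / len(data) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

A node built this way would branch on `x[axis] <= t` versus `x[axis] > t` instead of one branch per value.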