Classifying and visualizing the Iris dataset with C4.5
Date: 2024-06-13 18:06:27
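The steps below classify the Iris dataset with a C4.5-style decision tree and visualize the result. Note that scikit-learn's `DecisionTreeClassifier` implements an optimized version of CART rather than C4.5. Setting `criterion='entropy'` makes it choose splits by information gain, which is close to C4.5 in spirit, but it does not use C4.5's gain ratio and always produces binary splits.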
1. Import the required libraries and load the dataset
```python
import numpy as np
import graphviz
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Load the Iris dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
X = iris.data    # feature matrix
y = iris.target  # class labels (0, 1, 2)
```
2. Split the dataset into a training set and a test set
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```
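The split above is purely random, so the three species may not be equally represented in the test set. As an optional variation (not part of the original steps), the sketch below passes `stratify=y` to `train_test_split` so that each class keeps the same proportion in both subsets. The variable names `X_tr`, `X_te`, `y_tr`, and `y_te` are chosen here only to avoid clashing with the ones above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# stratify=iris.target keeps the class proportions identical in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)

# Count how many samples of each class ended up in each subset
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print("train:", train_counts)
print("test: ", test_counts)
```

Because Iris has 50 samples per class, a stratified 70/30 split puts the same number of samples from each species into the test set.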
3. Build the decision tree model with the entropy criterion (the closest scikit-learn setting to C4.5)
```python
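# criterion='entropy' selects splits by information gain;
# random_state fixes tie-breaking so the tree is reproducible
clf = DecisionTreeClassifier(criterion='entropy', random_state=42)
clf.fit(X_train, y_train)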
```
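For comparison, C4.5 ranks candidate splits by gain ratio, that is, information gain divided by the split information of the partition. The following is a simplified sketch of that criterion for a single binary threshold split on one numeric feature. It is not a full C4.5 implementation, the helper names `entropy` and `gain_ratio` are hypothetical, and the toy data is made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: parent entropy minus the weighted child entropies
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    # Split information: entropy of the partition sizes themselves
    split_info = -sum(w * log2(w) for w in weights)
    return info_gain / split_info


# Toy data (made up for illustration): petal lengths and species labels
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
species = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "virginica"]
ratio = gain_ratio(petal_length, species, threshold=2.0)
print(ratio)
```

Dividing by the split information penalizes splits that break the data into many small partitions, which plain information gain tends to favor.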
4. Visualize the decision tree model
```python
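# Export the fitted tree in Graphviz DOT format
dot_data = export_graphviz(clf, out_file=None,
                           feature_names=iris.feature_names,
                           class_names=iris.target_names,
                           filled=True, rounded=True,
                           special_characters=True)
graph = graphviz.Source(dot_data)
# In a Jupyter notebook the last expression renders the tree inline;
# in a script, call graph.render("iris_tree", format="png") instead
graph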
```
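Rendering with `graphviz` requires the Graphviz binaries to be installed on the system. If they are not available, one alternative is scikit-learn's `export_text`, which prints the same tree as indented text rules. The sketch below refits the model so that it runs on its own. The `random_state=42` on the classifier is an assumption added here for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(X_train, y_train)

# Render the fitted tree as indented text rules, one line per node
tree_rules = export_text(clf, feature_names=list(iris.feature_names))
print(tree_rules)
```

Each indented line is either a threshold test on a feature or a leaf with its predicted class index.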
5. Predict on the test set and compute the accuracy
```python
y_pred = clf.predict(X_test)
accuracy = np.mean(y_pred == y_test)
print('Accuracy:', accuracy)
```
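A single accuracy figure does not show which species get confused with each other. As an optional extension, the sketch below computes a confusion matrix with `sklearn.metrics.confusion_matrix` alongside `accuracy_score`. It refits the model so that it runs on its own, and the `random_state=42` on the classifier is an assumption added for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
acc = accuracy_score(y_test, y_pred)
print(cm)
print("Accuracy:", acc)
```

The diagonal of the matrix counts the correct predictions, so the accuracy equals the trace divided by the total number of test samples.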