I have already trained a decision tree and a random forest on my data. Now I need to set up a set of data to test the decision tree and the random forest separately, and predict whether the patient described by that data has heart disease. Please give me the test code. My training process is below:

    import pandas as pd
    from sklearn import tree
    from sklearn.ensemble import RandomForestClassifier
    import matplotlib.pyplot as plt
    import numpy as np

    # Raw string so the backslashes in the Windows path are not treated as escapes
    heart = pd.read_csv(r"D:\Anaconda1\heart.csv")
    heart.head()
    heart.info()

    clf_tree = tree.DecisionTreeClassifier()
    from sklearn.model_selection import train_test_split
    x = heart.drop('target', axis=1)
    y = heart['target']
    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)
    clf_tree.fit(x_train, y_train)
    print(clf_tree.score(x_train, y_train))
    print(clf_tree.score(x_test, y_test))
    # The fitted model is named clf_tree (the original passed an undefined `clf`)
    tree.export_graphviz(clf_tree, out_file='tree.dot')

    score_list = []
    for i in range(10, 100, 10):
        clf_forest = RandomForestClassifier(n_estimators=i, random_state=0)
        clf_forest.fit(x_train, y_train)
        score_list.append(clf_forest.score(x_test, y_test))
    plt.plot(range(10, 100, 10), score_list)
    plt.show()
    print(np.argmax(score_list))
    print(np.max(score_list))
Posted: 2023-12-24 22:22:45
Sure. You can test the decision tree and the random forest with the following code:
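```
import pandas as pd

# Load the test data (raw string, otherwise "\t" in the path is read as a tab)
test_data = pd.read_csv(r"D:\Anaconda1\test_data.csv")
# Separate the features from the labels
X_test = test_data.drop('target', axis=1)
y_test = test_data['target']
# Evaluate the decision tree
tree_score = clf_tree.score(X_test, y_test)
print("Decision tree accuracy:", tree_score)
# Evaluate the random forest
# (clf_forest is the last model fitted in the training loop, n_estimators=90)
forest_score = clf_forest.score(X_test, y_test)
print("Random forest accuracy:", forest_score)
```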
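Here, `test_data` is the dataset you prepared for testing, `X_test` holds its features, and `y_test` holds its labels. `tree_score` and `forest_score` are the accuracies of the decision tree and the random forest on that data. The test file needs the same feature columns as the training data, plus a `target` column, because `score` compares the predictions against known labels. Note that `clf_forest` refers to the last model fitted in your loop (`n_estimators=90`), which is not necessarily the one with the highest score.
You can append this code to the end of your training code to run the test.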
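If the goal is to predict for one patient whose true label is unknown, `score` does not apply because it needs a `target` column. A minimal sketch using `predict` and `predict_proba` is shown here. To keep it runnable on its own it trains on a small synthetic table: the column names `age`, `chol`, `thalach`, the labelling rule, and the patient values are all assumptions made for illustration, not the contents of `heart.csv`. In your script you would skip the synthetic part and reuse your own `x`, `clf_tree`, and `clf_forest`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for heart.csv so the sketch runs on its own.
# The column names are hypothetical; use the real columns of your file.
rng = np.random.default_rng(0)
n = 200
heart = pd.DataFrame({
    "age": rng.integers(30, 80, n),
    "chol": rng.integers(150, 350, n),
    "thalach": rng.integers(90, 200, n),
})
# Made-up labelling rule, only to give the models something to learn.
heart["target"] = (heart["thalach"] > 150).astype(int)

x = heart.drop("target", axis=1)
y = heart["target"]

clf_tree = DecisionTreeClassifier(random_state=0).fit(x, y)
clf_forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(x, y)

# One hand-written patient record (illustrative values). Building it with
# columns=x.columns keeps the feature names and their order identical to
# the training data, which the fitted models expect.
new_patient = pd.DataFrame([[54, 240, 165]], columns=x.columns)

tree_pred = clf_tree.predict(new_patient)[0]
forest_pred = clf_forest.predict(new_patient)[0]
# Class probabilities, ordered as in clf_forest.classes_
forest_proba = clf_forest.predict_proba(new_patient)[0]

print("Decision tree prediction:", tree_pred)
print("Random forest prediction:", forest_pred)
print("Random forest class probabilities:", forest_proba)
```

The predicted value is whatever encoding your `target` column uses, so check how the labels are defined in your dataset before reading a prediction as "has heart disease".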