
Using a random forest to find the most important variables among thousands of input variables: Python implementation

Posted: 2024-02-17 07:02:35

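To find the most important variables with a random forest in Python, you can use the RandomForestRegressor or RandomForestClassifier class from the scikit-learn library. Here is an example:

```
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# X is the input variable data (n_samples x n_features), y is the corresponding output
rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X, y)

# Importance score of each variable, and the variable indices sorted from most to least important
importances = rf.feature_importances_
indices = np.argsort(importances)[::-1]

# Print the importance ranking
for f in range(X.shape[1]):
    print("%d. feature %d (%f)" % (f + 1, indices[f], importances[indices[f]]))
```

Here, the n_estimators parameter sets the number of decision trees in the forest, and the random_state parameter sets the random seed so that the results are reproducible. After training, the feature_importances_ attribute holds one importance score per variable (the scores sum to 1). Sorting those scores in descending order with np.argsort gives the indices array, which lists the variable column indices from most to least important, i.e. the ranking.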
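With thousands of input variables you usually want only the top few rather than the full printout. The following is a minimal, self-contained sketch of that idea. The helper name `top_k_features`, the parameter `k`, and the synthetic data from `make_regression` are assumptions made for illustration and are not part of the original answer; replace the synthetic `X` and `y` with your own data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor


def top_k_features(X, y, k=10, n_estimators=100, random_state=42):
    """Hypothetical helper: fit a random forest and return the column
    indices and importance scores of the k most important features."""
    rf = RandomForestRegressor(n_estimators=n_estimators, random_state=random_state)
    rf.fit(X, y)
    importances = rf.feature_importances_
    # argsort is ascending, so reverse it and keep the first k entries
    order = np.argsort(importances)[::-1][:k]
    return order, importances[order]


# Synthetic data (assumption for illustration): 200 samples, 1000 features.
# With shuffle=False the 5 informative features are columns 0..4.
X, y = make_regression(
    n_samples=200,
    n_features=1000,
    n_informative=5,
    noise=0.1,
    shuffle=False,
    random_state=0,
)

top_idx, top_scores = top_k_features(X, y, k=10)

for rank, (idx, score) in enumerate(zip(top_idx, top_scores), start=1):
    print("%d. feature %d (%f)" % (rank, idx, score))
```

To reduce the dataset to the selected columns, index it with the returned array, for example `X_reduced = X[:, top_idx]`. Note that these impurity-based importances can be inflated for high-cardinality features, so treat the ranking as a first screening step rather than a final answer.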
