Feature selection code using Extra Trees and Random Forest
Date: 2024-05-10 17:20:38
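Below is example Python code for feature selection with Extra Trees and Random Forest. It assumes a dataset containing features and a target variable, from which we want to select the most important features.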
```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# Load the dataset and split it into features (X) and target (y)
data = pd.read_csv('dataset.csv')
X = data.drop(columns=['target'])
y = data['target']

# Create the Extra Trees classifier and fit it (fixed seed so the scores are reproducible)
et_clf = ExtraTreesClassifier(random_state=42)
et_clf.fit(X, y)

# Create the Random Forest classifier and fit it
rf_clf = RandomForestClassifier(random_state=42)
rf_clf.fit(X, y)

# Print the importance score of each feature, labeled by column name
print('Extra Trees feature importance scores:')
print(pd.Series(et_clf.feature_importances_, index=X.columns))
print('Random Forest feature importance scores:')
print(pd.Series(rf_clf.feature_importances_, index=X.columns))

# Rank feature indices from most to least important
et_indices = np.argsort(et_clf.feature_importances_)[::-1]
rf_indices = np.argsort(rf_clf.feature_importances_)[::-1]

# Keep the top 10 features (or all of them if the dataset has fewer than 10)
num_features = min(10, X.shape[1])
et_selected_indices = et_indices[:num_features]
rf_selected_indices = rf_indices[:num_features]
et_selected_features = X.columns[et_selected_indices]
rf_selected_features = X.columns[rf_selected_indices]
print('Selected features using Extra Trees:', list(et_selected_features))
print('Selected features using Random Forest:', list(rf_selected_features))
```
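The code above loads the dataset into a pandas DataFrame, fits an `ExtraTreesClassifier` and a `RandomForestClassifier` to it, and prints each feature's importance score. It then selects the 10 most important features and prints their names.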
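For readers without a `dataset.csv` at hand, the same top-k selection can be tried on synthetic data. The following is a minimal sketch, not part of the original answer: the helper `top_k_features`, the column names `f0` to `f19`, and the data generated by `make_classification` are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier


def top_k_features(model, X, y, k):
    """Fit `model` and return the names of the k features with the highest importance.

    Hypothetical helper for illustration; it wraps the argsort-and-slice
    steps used in the code above.
    """
    model.fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]  # most important first
    return list(X.columns[order[:k]])


# Synthetic stand-in for dataset.csv (assumption): 20 features, 5 informative.
X_arr, y_demo = make_classification(
    n_samples=500, n_features=20, n_informative=5, n_redundant=0,
    random_state=0,
)
X_demo = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(20)])

et_top = top_k_features(
    ExtraTreesClassifier(n_estimators=200, random_state=0), X_demo, y_demo, k=5
)
rf_top = top_k_features(
    RandomForestClassifier(n_estimators=200, random_state=0), X_demo, y_demo, k=5
)

print("Extra Trees top 5:", et_top)
print("Random Forest top 5:", rf_top)
# Features ranked in the top 5 by both models
print("Selected by both:", sorted(set(et_top) & set(rf_top)))
```

Comparing the two lists is a quick sanity check: features that both models rank highly are usually safer picks than features favored by only one. Note that `feature_importances_` is an impurity-based score computed on the training data, so it can overstate the importance of high-cardinality features.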