A Detailed Walkthrough of Random Forest Code
Posted: 2023-06-30 20:04:00
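Random forest is an ensemble learning method that combines the predictions of multiple decision trees to achieve better predictive performance than a single tree typically would. The following example implements a random forest classifier in Python with scikit-learn.
First, import the required libraries: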
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
```
Next, generate a synthetic dataset:
```python
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
```
Then split the dataset into a training set and a test set:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)
```
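As a quick sanity check on the split, here is a minimal sketch that prints the resulting shapes. It assumes the same `make_classification` and `train_test_split` settings as above; with 1000 samples and `test_size=0.25`, 250 samples are held out for testing.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same settings as in the walkthrough
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 75% of the rows go to training, 25% to testing
print(X_train.shape, X_test.shape)  # (750, 4) (250, 4)
```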
Next, create a random forest classifier:
```python
clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)
```
Here, `n_estimators` is the number of decision trees in the forest, and `max_depth` is the maximum depth of each tree.
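To make these two parameters concrete, here is a minimal sketch that fits a forest and inspects the fitted trees. It fits on the full synthetic dataset purely for inspection, and relies on the `estimators_` attribute of a fitted forest and the `get_depth()` method of each tree; the variable name `depths` is only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)

clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)
clf.fit(X, y)

# estimators_ holds one fitted decision tree per n_estimators
print(len(clf.estimators_))  # 100

# No tree grows deeper than max_depth
depths = [tree.get_depth() for tree in clf.estimators_]
print(max(depths) <= 2)  # True
```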
Then fit the classifier on the training set:
```python
clf.fit(X_train, y_train)
```
Finally, evaluate on the test set. For a classifier, `score` predicts the test labels and returns the mean accuracy:
```python
accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
```
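If the predictions themselves are needed, the same accuracy can be computed in two explicit steps. The following is a minimal sketch, assuming the setup from the walkthrough; `y_pred` and `manual_accuracy` are illustrative names, and `accuracy_score` comes from `sklearn.metrics`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)
clf.fit(X_train, y_train)

# Step 1: predict a class label for every test sample
y_pred = clf.predict(X_test)

# Step 2: fraction of test samples whose predicted label is correct
manual_accuracy = accuracy_score(y_test, y_pred)

# Matches the value returned by clf.score
print(manual_accuracy == clf.score(X_test, y_test))  # True
```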
The complete random forest example:
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# Create the random forest classifier
clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)

# Fit on the training set
clf.fit(X_train, y_train)

# Evaluate on the test set: score() predicts and returns the mean accuracy
accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
```
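The introduction describes a random forest as a combination of decision trees. As a hedged illustration of what "combination" means in scikit-learn, the sketch below averages the class probabilities predicted by the individual trees and compares the result with the forest's own `predict_proba`. The names `forest_proba` and `tree_proba` are illustrative, and the comparison uses `np.allclose` to tolerate floating-point rounding.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=2,
                             random_state=0)
clf.fit(X_train, y_train)

# Class probabilities reported by the forest as a whole
forest_proba = clf.predict_proba(X_test)

# Mean of the class probabilities predicted by each individual tree
tree_proba = np.mean(
    [tree.predict_proba(X_test) for tree in clf.estimators_], axis=0)

print(np.allclose(forest_proba, tree_proba))  # True
```

Averaging probabilities rather than counting hard votes means that a tree that is confident about a sample contributes more than one that is nearly undecided.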