Random forest classification of the Iris dataset with sklearn
Date: 2023-09-04 11:13:30
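To classify the Iris dataset with the random forest model from sklearn, follow these steps:
1. Import the datasets module and, from the ensemble module, the RandomForestClassifier class: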
```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
```
2. Load the Iris dataset:
```python
iris = datasets.load_iris()
```
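Before splitting, it can help to look at what load_iris() returns. The following is a minimal sketch that only inspects attributes of the returned object; nothing in it changes the later steps.

```python
from sklearn import datasets

iris = datasets.load_iris()

# 150 samples, each described by 4 numeric features
print(iris.data.shape)      # (150, 4)

# Names of the four features (sepal/petal length and width, in cm)
print(iris.feature_names)

# The three species, encoded in iris.target as the labels 0, 1, 2
print(iris.target_names)
```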
3. Split the dataset into a training set and a test set (test_size=0.3 holds out 30% of the samples for testing):
```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=0)
```
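The split above is purely random, so the three classes may not be equally represented in the test set. As an optional variant (the stratify argument is an addition here, not part of the original steps), the sketch below keeps the class proportions the same in both subsets.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# stratify=iris.target is an assumption added for illustration:
# it preserves the 50/50/50 class balance in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

print(X_train.shape, X_test.shape)  # (105, 4) (45, 4)

# Number of test samples per class
test_counts = np.bincount(y_test)
print(test_counts)                  # [15 15 15]
```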
4. Create the random forest model:
```python
rfc = RandomForestClassifier(n_estimators=100, random_state=0)
```
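Here, n_estimators is the number of decision trees in the forest (100 is also the default since scikit-learn 0.22), and random_state is the random seed, which makes the result reproducible.
5. Train the model on the training set: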
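To make the role of the two parameters concrete, here is a small sketch. It fits two forests with identical settings (the names rfc_a and rfc_b are only illustrative) and checks that the tree count matches n_estimators and that a fixed seed gives identical predictions.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# Two forests built with the same seed on the same data
rfc_a = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
rfc_b = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# n_estimators controls how many fitted trees end up in estimators_
n_trees = len(rfc_a.estimators_)
print(n_trees)            # 100

# With an identical seed, both forests make identical predictions
same_predictions = np.array_equal(rfc_a.predict(X_test), rfc_b.predict(X_test))
print(same_predictions)   # True
```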
```python
rfc.fit(X_train, y_train)
```
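Once the model is fitted, it exposes a feature_importances_ attribute. As an optional aside, the sketch below prints how much each of the four features contributed to the forest's splits; the values are normalized so they sum to 1.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

rfc = RandomForestClassifier(n_estimators=100, random_state=0)
rfc.fit(X_train, y_train)

# Impurity-based importance of each feature, one value per column of X
importances = rfc.feature_importances_
for name, importance in zip(iris.feature_names, importances):
    print(f'{name}: {importance:.3f}')
```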
6. Predict on the test set:
```python
y_pred = rfc.predict(X_test)
```
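predict returns only the winning label. If class probabilities are also of interest, predict_proba can be used as in the sketch below; the predicted label is the class with the highest averaged probability across the trees.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

rfc = RandomForestClassifier(n_estimators=100, random_state=0)
rfc.fit(X_train, y_train)

y_pred = rfc.predict(X_test)

# One row per test sample, one column per class; each row sums to 1
proba = rfc.predict_proba(X_test)
print(proba.shape)   # (45, 3)
print(proba[:5])

# The predicted label is the class with the largest probability
labels_from_proba = rfc.classes_[proba.argmax(axis=1)]
```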
7. Print the evaluation results:
```python
from sklearn.metrics import accuracy_score, classification_report
print('Accuracy:', accuracy_score(y_test, y_pred))
print('Classification Report:\n', classification_report(y_test, y_pred))
```
Here, accuracy_score computes the accuracy, and classification_report lists the precision, recall, F1-score, and support for each class.
The complete code is:
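The same predictions can also be summarized as a confusion matrix, which shows which classes get confused with each other. This is an optional sketch; confusion_matrix is not used in the original steps.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

rfc = RandomForestClassifier(n_estimators=100, random_state=0)
rfc.fit(X_train, y_train)
y_pred = rfc.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# The diagonal holds the correct predictions, so its share equals the accuracy
acc_from_cm = cm.trace() / cm.sum()
print(acc_from_cm)
```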
```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset (150 samples, 4 features, 3 classes)
iris = datasets.load_iris()

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=0)

# Train a random forest of 100 trees with a fixed seed
rfc = RandomForestClassifier(n_estimators=100, random_state=0)
rfc.fit(X_train, y_train)

# Predict on the held-out set and print the metrics
y_pred = rfc.predict(X_test)
print('Accuracy:', accuracy_score(y_test, y_pred))
print('Classification Report:\n', classification_report(y_test, y_pred))
```
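With only 150 samples, the accuracy from a single train/test split depends noticeably on which samples land in the test set. One possible complement, sketched here with 5-fold cross-validation (the choice of cv=5 is an assumption, not part of the original), is to average the score over several splits.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
rfc = RandomForestClassifier(n_estimators=100, random_state=0)

# For a classifier, an integer cv uses stratified folds;
# the model is refitted on each of the 5 training folds
scores = cross_val_score(rfc, iris.data, iris.target, cv=5)

print('Fold accuracies:', scores)
print('Mean accuracy:', scores.mean())
print('Std deviation:', scores.std())
```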