Predicting the Iris dataset with xgboost
Date: 2023-09-04 21:08:05
A prediction model built on the Iris data
First, we import the required libraries and load the Iris dataset:
```python
from sklearn.datasets import load_iris
import xgboost as xgb
iris = load_iris()
X = iris.data
y = iris.target
```
Next, we split the dataset into a training set and a test set, holding out 20% of the samples for testing:
```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
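For intuition, a hold-out split of this kind can be approximated by hand: shuffle the row indices with a fixed seed, then set aside 20% of them. The sketch below is only a stand-in for the idea, not scikit-learn's implementation. `simple_split` is a hypothetical helper, and it will not select the same rows that `train_test_split` picks with `random_state=42`.

```python
import random

def simple_split(n_samples, test_size=0.2, seed=42):
    """Hypothetical helper: return (train_idx, test_idx) for a shuffled hold-out split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed keeps the split repeatable
    n_test = round(n_samples * test_size)  # number of held-out samples
    return indices[n_test:], indices[:n_test]

# The Iris dataset has 150 samples.
train_idx, test_idx = simple_split(150)
print(len(train_idx), len(test_idx))  # 120 30
```

With 150 samples and `test_size=0.2`, 120 rows go to training and 30 to testing, which matches the sizes produced by the call above.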
Then we convert both sets to the DMatrix format that xgboost's native training API expects:
```python
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
```
Next, we set the model parameters:
```python
params = {
    'max_depth': 3,
    'eta': 0.1,
    'objective': 'multi:softmax',
    'num_class': 3,
}
```
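Here the maximum tree depth is 3, the learning rate (`eta`) is 0.1, the objective is `multi:softmax` for multiclass classification (so the model returns a class label directly), and the number of classes (`num_class`) is 3.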
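To make the `multi:softmax` objective more concrete: the model produces one raw score per class, softmax turns those scores into probabilities, and the class with the highest probability is returned. The sketch below illustrates that idea in plain Python. The raw scores are made up for illustration, and `softmax` is a helper written here, not an xgboost function.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for one flower, one score per class (num_class = 3).
raw_scores = [0.5, 2.0, -1.0]
probs = softmax(raw_scores)

# The predicted label is the index of the largest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

If the probabilities themselves are needed rather than the label, xgboost also offers the `multi:softprob` objective, which returns one probability per class for each sample.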
Now we can train the model:
```python
num_round = 50
bst = xgb.train(params, dtrain, num_round)
```
Training runs for 50 boosting rounds, and the trained model is stored in the variable `bst`.
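The interplay between the learning rate and the number of rounds can be seen in a toy calculation. The sketch below is not xgboost: it assumes an idealized weak learner that predicts the current residual exactly, and a made-up target value, so that only the effect of shrinking each step by `eta` is visible.

```python
eta = 0.1          # learning rate: fraction of each correction that is kept
num_round = 50     # number of boosting rounds
target = 1.0       # made-up value the model should reach
prediction = 0.0   # boosting starts from a constant base prediction

for _ in range(num_round):
    residual = target - prediction  # what is still unexplained
    prediction += eta * residual    # add only a fraction eta of the correction

remaining = target - prediction
print(remaining)  # equals 0.9 ** 50 up to rounding, roughly 0.005
```

Under these assumptions each round removes 10% of the remaining error, so a smaller `eta` generally needs more rounds to reach the same fit.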
Next, we use the trained model to predict the classes of the test set:
```python
preds = bst.predict(dtest)
```
Finally, we compute the model's accuracy on the test set:
```python
from sklearn.metrics import accuracy_score
acc = accuracy_score(y_test, preds)
print("Accuracy: {:.2f}%".format(acc * 100))
```
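Accuracy is simply the fraction of predictions that match the true labels. The sketch below computes it by hand on made-up labels. With `multi:softmax`, `Booster.predict` returns the class labels as floats (for example `1.0`), so the example casts them to integers before comparing; this cast is shown for clarity only.

```python
# Made-up labels for illustration; real values come from y_test and preds.
y_true = [0, 1, 2, 2, 1]
y_pred = [0.0, 1.0, 2.0, 1.0, 1.0]  # float class labels, as multi:softmax returns

y_pred_int = [int(p) for p in y_pred]
correct = sum(t == p for t, p in zip(y_true, y_pred_int))
accuracy = correct / len(y_true)
print("Accuracy: {:.2f}%".format(accuracy * 100))  # Accuracy: 80.00%
```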
That is the whole process of predicting the Iris dataset with xgboost. The complete code is as follows:
```python
import xgboost as xgb
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the Iris features and class labels.
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Wrap the arrays in xgboost's DMatrix format.
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Model parameters: shallow trees, learning rate 0.1, 3-class softmax.
params = {
    'max_depth': 3,
    'eta': 0.1,
    'objective': 'multi:softmax',
    'num_class': 3,
}

# Train for 50 boosting rounds.
num_round = 50
bst = xgb.train(params, dtrain, num_round)

# Predict class labels for the test set and report accuracy.
preds = bst.predict(dtest)
acc = accuracy_score(y_test, preds)
print("Accuracy: {:.2f}%".format(acc * 100))
```