Write an XGBoost model in Python
Time: 2024-05-05 14:22:44
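Sure. Below is a simple example of training an XGBoost model in Python. First, we import the libraries we need: numpy, pandas, sklearn, and xgboost. (numpy and pandas are not called directly in this example, so those two imports are optional.) I will use the iris dataset as the example data.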
```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import xgboost as xgb
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
```
Next, we load the dataset and split it into a training set (80%) and a test set (20%).
```python
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
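Then we can convert the data to DMatrix format, which is the data structure xgboost uses internally to store data. This step is only required for the native xgboost API (for example xgb.train()); the XGBClassifier used later accepts NumPy arrays directly, so dtrain and dtest are not needed for the rest of this example.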
```python
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
```
Next, we define the model parameters.
```python
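params = {
    'max_depth': 3,                  # maximum depth of each tree
    'eta': 0.3,                      # learning rate
    'objective': 'multi:softmax',    # multi-class objective that predicts class labels
    'num_class': 3                   # number of classes in the iris data
}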
```
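These parameters set the maximum depth of each tree (max_depth), the learning rate (eta), the objective function (objective), and the number of classes (num_class). With the multi:softmax objective, the model outputs the predicted class label directly.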
We then create the classifier by unpacking these parameters into XGBClassifier() and train it with fit(), passing the NumPy arrays directly.
```python
model = XGBClassifier(**params)
model.fit(X_train, y_train)
```
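Finally, we use predict() to make predictions on the test set and compute the model's accuracy, that is, the fraction of test samples classified correctly.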
```python
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
The complete code is as follows:
```python
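# Imports (numpy and pandas are not used directly below)
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import xgboost as xgb
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

# Load the iris data and hold out 20% for testing
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# DMatrix objects for the native xgboost API (not used by XGBClassifier below)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Model parameters
params = {
    'max_depth': 3,                  # maximum depth of each tree
    'eta': 0.3,                      # learning rate
    'objective': 'multi:softmax',    # multi-class objective that predicts class labels
    'num_class': 3                   # number of classes in the iris data
}

# Train the classifier on the NumPy arrays
model = XGBClassifier(**params)
model.fit(X_train, y_train)

# Predict on the test set and report accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')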
```