XGBoost code implementation
Date: 2023-07-23 22:04:41
XGBoost is a decision-tree-based ensemble machine learning algorithm that is widely used in data science competitions such as Kaggle. Below is a simple code implementation of an XGBoost classifier.
First, we import the necessary libraries:
```python
import xgboost as xgb
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
```
Next, we load the data:
```python
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.3, random_state=42)
```
Then we convert the data to XGBoost's DMatrix format:
```python
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
```
Next, we set the training parameters:
```python
params = {
    'booster': 'gbtree',
    'objective': 'multi:softmax',
    'num_class': 10,
    'gamma': 0.1,
    'max_depth': 8,
    'lambda': 2,
    'subsample': 0.7,
    'colsample_bytree': 0.7,
    'min_child_weight': 3,
    'verbosity': 0,  # 'silent' was deprecated and removed in XGBoost 1.0
    'eta': 0.1,
    'seed': 1000,
    'nthread': 4,
}
```
Here we use the softmax objective for multiclass classification (`multi:softmax` with `num_class` set to 10), cap the tree depth at 8, apply L2 regularization via `lambda`, and add further regularization through row and column subsampling (`subsample`, `colsample_bytree`) and `min_child_weight`.
Next, we can train the model:
```python
num_rounds = 100
model = xgb.train(params, dtrain, num_rounds)
```
Finally, we evaluate the model on the test set:
```python
y_pred = model.predict(dtest)  # multi:softmax returns predicted class indices (as floats)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy: %.2f%%' % (accuracy * 100.0))
```
The complete code:
```python
import xgboost as xgb
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.3, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
params = {
    'booster': 'gbtree',
    'objective': 'multi:softmax',
    'num_class': 10,
    'gamma': 0.1,
    'max_depth': 8,
    'lambda': 2,
    'subsample': 0.7,
    'colsample_bytree': 0.7,
    'min_child_weight': 3,
    'verbosity': 0,  # 'silent' was deprecated and removed in XGBoost 1.0
    'eta': 0.1,
    'seed': 1000,
    'nthread': 4,
}
num_rounds = 100
model = xgb.train(params, dtrain, num_rounds)
y_pred = model.predict(dtest)  # multi:softmax returns predicted class indices (as floats)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy: %.2f%%' % (accuracy * 100.0))
```