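Machine learning: code that uses the preprocessed abalone dataset to tune the gamma and C parameters of a support vector machine and observe how the model's R² score changes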
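In Python, and in scikit-learn in particular, you can combine the `GridSearchCV` class with `SVR` (the support vector regressor) to tune the `gamma` and `C` parameters and compute the cross-validated R² score for each parameter combination. Here is a simple example: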
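```python
# Import the required libraries
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Load the abalone dataset (assumed to be prepared already and named 'bivalves_data').
# The object must expose `.data` (features) and `.target` (values to predict).
# Note: scikit-learn has no built-in abalone loader, and load_boston() is a
# different dataset that was removed in scikit-learn 1.2.
bivalves_data = ...  # load the dataset here

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    bivalves_data.data, bivalves_data.target, test_size=0.2, random_state=42
)

# Standardize the numeric features. Fit the scaler on the training set only,
# so that no test-set statistics leak into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Define the parameter grid to search.
# C is the regularization parameter: a larger C means weaker regularization.
# gamma strongly affects the RBF kernel and is ignored by the linear kernel,
# so it is searched only for 'rbf'.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1]},
]

# Create the SVM regressor and run the grid search
svm_reg = SVR()
grid_search = GridSearchCV(svm_reg, param_grid, scoring="r2", cv=5)  # R² via 5-fold cross-validation
grid_search.fit(X_train, y_train)

# Get the best parameters and the corresponding mean cross-validated R² score
best_params = grid_search.best_params_
best_r2_score = grid_search.best_score_

# Evaluate the refitted best model on the held-out test set.
# With scoring="r2", GridSearchCV.score returns the test-set R².
test_r2 = grid_search.score(X_test, y_test)

print(f"Best parameters: {best_params}")
print(f"Best mean cross-validated R² on training data: {best_r2_score:.3f}")
print(f"R² on test data: {test_r2:.3f}")
```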
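The example above reports only the best combination. To actually observe how R² changes with `gamma` and `C`, one option is to read the per-combination scores from `cv_results_`. The following is a minimal sketch, not a definitive implementation. Because the abalone loading step is not shown here, it assumes synthetic stand-in data from `make_regression`; the names `C_values`, `gamma_values`, and `r2_table` are illustrative. It also puts the scaler in a pipeline so that standardization is refitted inside each cross-validation fold.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# ASSUMPTION: synthetic stand-in data. Replace X, y with the preprocessed
# abalone features and targets.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=42)

C_values = [0.1, 1, 10]
gamma_values = [1e-3, 1e-2, 1]

# make_pipeline names each step after its lowercased class name ("svr"),
# so SVR parameters are addressed as "svr__<name>".
pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
param_grid = {"svr__C": C_values, "svr__gamma": gamma_values}

search = GridSearchCV(pipe, param_grid, scoring="r2", cv=5)
search.fit(X, y)

# r2_table[i, j] holds the mean cross-validated R² for
# C_values[i] and gamma_values[j].
r2_table = np.empty((len(C_values), len(gamma_values)))
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    i = C_values.index(params["svr__C"])
    j = gamma_values.index(params["svr__gamma"])
    r2_table[i, j] = score

# Print the grid: one row per C, one column per gamma
print("gamma".rjust(8) + "".join(f"{g:>10g}" for g in gamma_values))
for C, row in zip(C_values, r2_table):
    print(f"C={C:<6g}" + "".join(f"{s:>10.3f}" for s in row))
```

Reading across a row shows the effect of `gamma` at a fixed `C`, and reading down a column shows the effect of `C` at a fixed `gamma`. The largest entry in the table is the value that `search.best_score_` reports.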