Using Boston housing price prediction as an example, write a program that builds price prediction models with ridge regression, a neural network, and a support vector machine, covering the full data modeling workflow. Compare the predictive performance of the models using prediction error (MSE and MAE) and the coefficient of determination (R2), and plot each model's predictions against the true values. Trying additional prediction models and discussing how model parameters affect predictive performance is encouraged.
For the Boston housing price prediction task, we will build prediction models using three methods: ridge regression, a neural network, and a support vector machine. One possible implementation is as follows:
1. Import the required libraries and load the dataset:
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import matplotlib.pyplot as plt
# Load the Boston housing dataset from its original source
# (load_boston was removed from scikit-learn in version 1.2)
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])  # 13 features
y = raw_df.values[1::2, 2]  # target: median house value (MEDV)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features (zero mean, unit variance)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
2. Ridge regression model:
```python
# Train the ridge regression model and predict on the test set
ridge = Ridge(alpha=0.5)
ridge.fit(X_train, y_train)
ridge_pred = ridge.predict(X_test)
# Compute MSE, MAE, and the coefficient of determination
ridge_mse = mean_squared_error(y_test, ridge_pred)
ridge_mae = mean_absolute_error(y_test, ridge_pred)
ridge_r2 = r2_score(y_test, ridge_pred)
# Plot predicted values against true values
plt.scatter(y_test, ridge_pred)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Ridge Regression')
plt.show()
```
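The regularization strength alpha is the main hyperparameter of ridge regression: larger values shrink the coefficients more strongly. A minimal sketch of how its effect could be examined, reusing the training/test split from step 1 (the alpha grid below is an arbitrary illustrative choice, not a tuned setting):
```python
# Sweep the regularization strength alpha and compare test-set errors.
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    model = Ridge(alpha=alpha)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"alpha={alpha:<6} MSE={mean_squared_error(y_test, pred):.3f} "
          f"R2={r2_score(y_test, pred):.3f}")
```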
3. Neural network model:
```python
# Train the neural network model and predict on the test set
mlp = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)
mlp_pred = mlp.predict(X_test)
# Compute MSE, MAE, and the coefficient of determination
mlp_mse = mean_squared_error(y_test, mlp_pred)
mlp_mae = mean_absolute_error(y_test, mlp_pred)
mlp_r2 = r2_score(y_test, mlp_pred)
# Plot predicted values against true values
plt.scatter(y_test, mlp_pred)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Neural Network')
plt.show()
```
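For the neural network, hidden_layer_sizes (the architecture) and max_iter are the parameters most likely to change the results. A hedged sketch that compares a few architectures on the test set (the candidate list is illustrative, not a tuned configuration):
```python
# Compare several hidden-layer configurations on the test set.
for sizes in [(50,), (100,), (100, 50), (200, 100)]:
    model = MLPRegressor(hidden_layer_sizes=sizes, max_iter=2000, random_state=42)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"hidden_layer_sizes={sizes}: "
          f"MSE={mean_squared_error(y_test, pred):.3f}, "
          f"R2={r2_score(y_test, pred):.3f}")
```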
4. Support vector machine model:
```python
# Train the support vector machine model and predict on the test set
svm = SVR(kernel='linear')
svm.fit(X_train, y_train)
svm_pred = svm.predict(X_test)
# Compute MSE, MAE, and the coefficient of determination
svm_mse = mean_squared_error(y_test, svm_pred)
svm_mae = mean_absolute_error(y_test, svm_pred)
svm_r2 = r2_score(y_test, svm_pred)
# Plot predicted values against true values
plt.scatter(y_test, svm_pred)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Support Vector Machine')
plt.show()
```
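For the support vector machine, the kernel, the penalty C, and (for the RBF kernel) gamma drive the fit. One way to study their effect is a small cross-validated grid search; the grid below is a rough example, not a recommended setting:
```python
from sklearn.model_selection import GridSearchCV

# Small illustrative grid over kernel, C, and gamma (gamma only matters for RBF).
param_grid = {
    'kernel': ['linear', 'rbf'],
    'C': [1, 10, 100],
    'gamma': ['scale', 0.01, 0.1],
}
grid = GridSearchCV(SVR(), param_grid, cv=5, scoring='r2')
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Best CV R2:", grid.best_score_)
print("Test R2:", r2_score(y_test, grid.best_estimator_.predict(X_test)))
```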
5. Print the evaluation results for each model:
```python
print("岭回归模型的预测误差:")
print("MSE:", ridge_mse)
print("MAE:", ridge_mae)
print("决定系数 R2:", ridge_r2)
print("神经网络模型的预测误差:")
print("MSE:", mlp_mse)
print("MAE:", mlp_mae)
print("决定系数 R2:", mlp_r2)
print("支持向量机模型的预测误差:")
print("MSE:", svm_mse)
print("MAE:", svm_mae)
print("决定系数 R2:", svm_r2)
```
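To put all three models in a single comparison figure against the true values, one option (a sketch that reuses the predictions computed above) is to plot the true prices and each model's predictions over the test samples, sorted by the true price:
```python
# Plot true values and the three models' predictions over the test samples,
# sorted by the true price so the curves are easier to read.
order = np.argsort(y_test)
plt.figure(figsize=(10, 5))
plt.plot(y_test[order], 'k-', label='True values')
plt.plot(ridge_pred[order], '--', label='Ridge regression')
plt.plot(mlp_pred[order], '--', label='Neural network')
plt.plot(svm_pred[order], '--', label='SVM')
plt.xlabel('Test sample (sorted by true price)')
plt.ylabel('House price')
plt.legend()
plt.title('Model predictions vs. true values')
plt.show()
```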
With this, we have built and evaluated models using ridge regression, a neural network, and a support vector machine, and plotted the predicted values against the true values for each model. You can adjust the parameters, try other models, and discuss how the parameters affect predictive performance as needed.
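As one example of an additional model, a gradient boosting regressor could be evaluated in the same way; the hyperparameters below are rough choices, not tuned values:
```python
from sklearn.ensemble import GradientBoostingRegressor

# Gradient boosting as an additional model; parameters are rough defaults.
gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=3, random_state=42)
gbr.fit(X_train, y_train)
gbr_pred = gbr.predict(X_test)
print("Gradient boosting MSE:", mean_squared_error(y_test, gbr_pred))
print("Gradient boosting MAE:", mean_absolute_error(y_test, gbr_pred))
print("Gradient boosting R2:", r2_score(y_test, gbr_pred))
```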