Use linear regression to predict house prices (dataset: 题目1数据.csv). The first column is the sample ID, columns 2-6 are five features, and column 7 is the house price. Use the first 350 rows as training data and rows 351-414 as test data. Predict the price from the five features and report the R² and MSE metrics on the test data. Implement it with gradient descent and post the code.
Posted: 2023-06-19 11:04:00
Here is code that implements linear regression with gradient descent:
```python
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_squared_error

# Load the dataset
data = pd.read_csv("题目1数据.csv")

# Split into training (rows 1-350) and test (rows 351-414) sets
train_data = data.iloc[:350, :]
test_data = data.iloc[350:414, :]

# Features (columns 2-6) and target (column 7)
X_train = train_data.iloc[:, 1:6].values
y_train = train_data.iloc[:, 6].values
X_test = test_data.iloc[:, 1:6].values
y_test = test_data.iloc[:, 6].values

# Standardize the features using training-set statistics only
mean = np.mean(X_train, axis=0)
std_dev = np.std(X_train, axis=0)
X_train = (X_train - mean) / std_dev
X_test = (X_test - mean) / std_dev

# Add an intercept column of ones
X_train = np.insert(X_train, 0, 1, axis=1)
X_test = np.insert(X_test, 0, 1, axis=1)

# Learning rate, number of iterations, and initial weights
alpha = 0.01
num_iters = 1000
theta = np.zeros(6)

# Cost function (half the mean squared error) and its gradient
def cost_function(X, y, theta):
    m = len(y)
    J = np.sum((X.dot(theta) - y) ** 2) / (2 * m)
    return J

def gradient(X, y, theta):
    m = len(y)
    grad = X.T.dot(X.dot(theta) - y) / m
    return grad

# Batch gradient descent
def gradient_descent(X, y, theta, alpha, num_iters):
    J_history = []
    for i in range(num_iters):
        theta -= alpha * gradient(X, y, theta)
        J_history.append(cost_function(X, y, theta))
    return theta, J_history

# Train the model
theta, J_history = gradient_descent(X_train, y_train, theta, alpha, num_iters)

# Predict on the test set
y_pred = X_test.dot(theta)

# Evaluation metrics
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print("r2 score:", r2)
print("mean squared error:", mse)
```
Here, `cost_function` and `gradient` compute the cost function and its gradient, `gradient_descent` implements the gradient-descent loop, and scikit-learn's `r2_score` and `mean_squared_error` compute the evaluation metrics on the test set.
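As a quick sanity check (a minimal sketch on synthetic data, since 题目1数据.csv is not available here), the same gradient-descent routine should converge to essentially the closed-form least-squares solution when the learning rate and iteration count are adequate:

```python
import numpy as np

def gradient(X, y, theta):
    # Gradient of half-MSE cost, as in the answer above
    m = len(y)
    return X.T.dot(X.dot(theta) - y) / m

def gradient_descent(X, y, theta, alpha, num_iters):
    for _ in range(num_iters):
        theta -= alpha * gradient(X, y, theta)
    return theta

# Synthetic data: y = 3 + 2*x1 - 1*x2 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3 + 2 * X[:, 0] - 1 * X[:, 1] + 0.01 * rng.normal(size=200)

Xb = np.insert(X, 0, 1, axis=1)  # add intercept column, as above
theta_gd = gradient_descent(Xb, y, np.zeros(3), alpha=0.1, num_iters=2000)

# Closed-form least-squares fit for comparison
theta_ls, *_ = np.linalg.lstsq(Xb, y, rcond=None)

print(theta_gd)                                    # close to [3, 2, -1]
print(np.allclose(theta_gd, theta_ls, atol=1e-3))
```

If the two solutions disagree, the usual culprits are a learning rate that is too large (divergence) or too few iterations; standardizing the features, as done in the answer, keeps a single learning rate workable across all coefficients.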