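Python code for a house-price prediction workflow using gradient descent
Posted: 2023-12-01 09:00:39
The following is a Python code example that predicts house prices using gradient descent: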
```python
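import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset (assumed: every column is numeric and the last column is the price)
data = pd.read_csv('house_prices.csv')

# Feature scaling: standardize every column (label included) to zero mean, unit variance.
# Note: the statistics come from the full dataset; compute them on the training
# split only if you need a strictly leak-free evaluation.
data = (data - data.mean()) / data.std()

# Split the dataset into a training set (80%) and a test set (20%)
train_data = data.sample(frac=0.8, random_state=200)
test_data = data.drop(train_data.index)

# Extract features and labels; .copy() avoids pandas' SettingWithCopyWarning
# when the intercept column is added below
train_features = train_data.iloc[:, :-1].copy()
train_labels = train_data.iloc[:, -1]
test_features = test_data.iloc[:, :-1].copy()
test_labels = test_data.iloc[:, -1]

# Build the model: batch gradient descent for linear regression
def gradient_descent(X, y, theta, alpha, m, num_iterations):
    X_transpose = X.transpose()
    for i in range(num_iterations):
        hypothesis = np.dot(X, theta)             # predictions
        loss = hypothesis - y                     # residuals
        cost = np.sum(loss ** 2) / (2 * m)        # half mean squared error
        gradient = np.dot(X_transpose, loss) / m  # gradient of the cost w.r.t. theta
        theta -= alpha * gradient                 # update step (modifies theta in place)
        if i % 1000 == 0:
            print("Iteration %d | Cost: %f" % (i, cost))
    return theta

# Add a column of ones so that the model can learn an intercept
train_features['intercept'] = 1
test_features['intercept'] = 1

# Convert features and labels to NumPy arrays (labels as a column vector)
X = np.array(train_features)
y = np.array(train_labels).reshape(-1, 1)

# Initialize the parameters
theta = np.zeros((X.shape[1], 1))

# Set the hyperparameters
alpha = 0.01
num_iterations = 10000

# Run gradient descent
theta = gradient_descent(X, y, theta, alpha, len(X), num_iterations)

# Compute the prediction error on the test set
X_test = np.array(test_features)
y_test = np.array(test_labels).reshape(-1, 1)
y_pred = np.dot(X_test, theta)
test_cost = np.sum((y_pred - y_test) ** 2) / (2 * len(y_test))
print("Test Cost:", test_cost)

# Scatter plot of predictions against true values
plt.scatter(y_test, y_pred)
plt.xlabel("True Values")
plt.ylabel("Predictions")
plt.show()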
```
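For reference, the quantities computed in each iteration of `gradient_descent` can be written in standard notation. Here $m$ is the number of training rows, $X$ is the feature matrix including the intercept column, $y$ is the label vector, and $\alpha$ is the learning rate. This only restates the code as the usual squared-error formulation of linear regression.

$$
\begin{aligned}
h &= X\theta \\
J(\theta) &= \frac{1}{2m}\sum_{i=1}^{m}\left(h_i - y_i\right)^2 \\
\nabla_\theta J &= \frac{1}{m}\,X^{\top}\left(X\theta - y\right) \\
\theta &\leftarrow \theta - \alpha\,\nabla_\theta J
\end{aligned}
$$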
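In the code above, we first load the dataset, then apply feature scaling, split the data into a training set and a test set, and extract the features and labels of each. Next, we define a gradient descent function that trains the model. In each iteration, the function computes the hypothesis (the predictions), then the loss and the cost, and finally the gradient, which is used to update the parameters. After gradient descent has finished, we compute the prediction error on the test set and draw a scatter plot of the predictions against the true values.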
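One way to check that the training loop behaves as described is to run it on data where the answer is known. The sketch below is illustrative only: the synthetic data, the coefficients in `true_theta`, and the chosen `alpha` are all assumptions, not part of the original example. It compares the gradient descent result with NumPy's closed-form least-squares solution.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iterations):
    """Batch gradient descent for linear regression (same update rule as above)."""
    m = len(X)
    for _ in range(num_iterations):
        loss = X @ theta - y                       # residuals
        theta = theta - alpha * (X.T @ loss) / m   # gradient step
    return theta

# Synthetic data (hypothetical): y = 3*x1 - 2*x2 + 0.5 + small noise
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
true_theta = np.array([[3.0], [-2.0], [0.5]])
X = np.hstack([features, np.ones((200, 1))])       # last column is the intercept
y = X @ true_theta + 0.01 * rng.normal(size=(200, 1))

theta_gd = gradient_descent(X, y, np.zeros((3, 1)), alpha=0.1, num_iterations=2000)
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # closed-form reference solution

max_diff = float(np.max(np.abs(theta_gd - theta_ls)))
print("gradient descent:", theta_gd.ravel())
print("least squares:   ", theta_ls.ravel())
print("max difference:  ", max_diff)
```

If the two parameter vectors agree closely, the loop has converged. A large gap usually suggests that the learning rate is too small or that there are too few iterations.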
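Note that the code above is only an example. In practice it needs to be adjusted and tuned for the specific situation, for instance the learning rate `alpha`, the iteration count `num_iterations`, and the column layout of your data.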
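As an illustration of the kind of adjustment that may be worthwhile, the sketch below makes two changes. It scales features using statistics from the training split only, and it stops once the cost no longer improves by more than a tolerance. The helper names `standardize` and `gradient_descent_tol`, the tolerance value, and the synthetic area/rooms data are all hypothetical and chosen just for this example.

```python
import numpy as np

def standardize(train, test):
    """Scale both splits with the training split's mean and standard deviation."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (train - mean) / std, (test - mean) / std

def gradient_descent_tol(X, y, alpha=0.1, max_iterations=10000, tol=1e-10):
    """Stop once the cost improves by less than tol between iterations."""
    m = len(X)
    theta = np.zeros((X.shape[1], 1))
    previous_cost = np.inf
    for i in range(max_iterations):
        loss = X @ theta - y
        cost = float(np.sum(loss ** 2) / (2 * m))
        if previous_cost - cost < tol:
            return theta, i
        previous_cost = cost
        theta = theta - alpha * (X.T @ loss) / m
    return theta, max_iterations

# Hypothetical data: price depends linearly on area and number of rooms
rng = np.random.default_rng(1)
area = rng.uniform(50, 200, size=(300, 1))
rooms = rng.integers(1, 6, size=(300, 1)).astype(float)
raw = np.hstack([area, rooms])
price = 2.0 * area + 10.0 * rooms + 5.0 + rng.normal(scale=1.0, size=(300, 1))

# Split first, then scale with training statistics only
train_raw, test_raw = raw[:240], raw[240:]
y_train, y_test = price[:240], price[240:]
train_scaled, test_scaled = standardize(train_raw, test_raw)
X_train = np.hstack([train_scaled, np.ones((240, 1))])  # add intercept column
X_test = np.hstack([test_scaled, np.ones((60, 1))])

theta, iterations_used = gradient_descent_tol(X_train, y_train)
test_cost = float(np.sum((X_test @ theta - y_test) ** 2) / (2 * len(y_test)))
print("iterations used:", iterations_used)
print("test cost:", test_cost)
```

Here the label is left unscaled, so the test cost is in squared price units. Whether to scale the label, as the original example does, depends on how the result will be reported.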