House Price Prediction (Normal Equation Implemented with NumPy)
Date: 2023-11-27 09:03:32
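This is example code that uses NumPy to implement the normal equation method for predicting house prices. We use the Boston housing dataset as the example dataset. Note that `load_boston` was deprecated in scikit-learn 1.0 and removed in 1.2, so the code as written needs an older scikit-learn release.
First, we load the dataset and split it into a training set and a test set.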
```python
import numpy as np
from sklearn.datasets import load_boston  # requires scikit-learn < 1.2
from sklearn.model_selection import train_test_split
# Load the dataset
data = load_boston()
X, y = data['data'], data['target']
# Split the data into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
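To make the splitting step concrete, here is a minimal NumPy-only sketch of what a shuffled hold-out split does. The helper name `train_test_split_np` and the toy arrays are assumptions for illustration; it will not reproduce the exact rows that scikit-learn's `train_test_split` selects for the same seed.

```python
import numpy as np

def train_test_split_np(X, y, test_size=0.2, seed=42):
    """Hypothetical helper: shuffle row indices, then hold out a fraction as the test set."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(X.shape[0])          # random ordering of the row indices
    n_test = int(round(X.shape[0] * test_size))    # number of held-out rows
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data: 10 samples, 3 features; row i is [3i, 3i+1, 3i+2] and its target is i
X_demo = np.arange(30, dtype=float).reshape(10, 3)
y_demo = np.arange(10, dtype=float)

X_tr, X_te, y_tr, y_te = train_test_split_np(X_demo, y_demo)
print(X_tr.shape, X_te.shape)
```

The same index arrays are applied to `X` and `y`, so each feature row stays paired with its target after shuffling.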
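Next, we preprocess the data. We standardize each feature to zero mean and unit variance, using the mean and standard deviation of the training set for both the training and test data. The normal equation is not iterative, so there is no convergence to speed up, but putting the features on a comparable scale generally improves the conditioning of $X^T X$. We then prepend a column of ones so that the model learns an intercept (bias) term.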
```python
# Standardize the features using training-set statistics only
mean = np.mean(X_train, axis=0)
std = np.std(X_train, axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std
# Prepend the bias column (a column of ones)
X_train = np.hstack((np.ones((X_train.shape[0], 1)), X_train))
X_test = np.hstack((np.ones((X_test.shape[0], 1)), X_test))
```
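The preprocessing step can be sketched on a tiny array to show what each column looks like afterwards. The helper names `standardize_fit` and `standardize_apply`, the toy data, and the zero-variance guard are assumptions added for illustration; the original code divides by `std` directly.

```python
import numpy as np

def standardize_fit(X_train):
    """Hypothetical helper: per-feature mean and std computed on the training set only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    # Assumption: replace a zero std with 1 so constant columns do not divide by zero
    std_safe = np.where(std == 0, 1.0, std)
    return mean, std_safe

def standardize_apply(X, mean, std):
    """Hypothetical helper: scale X with the given statistics and prepend a bias column."""
    scaled = (X - mean) / std
    return np.hstack((np.ones((X.shape[0], 1)), scaled))

# Toy data: the third feature is constant on purpose
X_tr_demo = np.array([[1.0, 10.0, 5.0],
                      [2.0, 20.0, 5.0],
                      [3.0, 30.0, 5.0]])
X_te_demo = np.array([[4.0, 40.0, 5.0]])

mean_demo, std_demo = standardize_fit(X_tr_demo)
A_tr = standardize_apply(X_tr_demo, mean_demo, std_demo)
A_te = standardize_apply(X_te_demo, mean_demo, std_demo)
print(A_tr)
print(A_te)
```

The guard only avoids the division by zero. A constant feature still becomes an all-zero column, which makes $X^T X$ singular, so dropping such a column before solving is usually the cleaner choice.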
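Now we can train the model with the normal equation. The normal equation is a closed-form solution: it gives the least-squares optimum in a single computation, with no iteration. The optimal parameters are given by: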
$$\theta = (X^T X)^{-1} X^T y$$
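For context, this formula follows from minimizing the squared-error cost. A brief derivation, assuming $X^T X$ is invertible and writing $m$ for the number of training samples (a symbol introduced here, not in the original), sets the gradient of the cost to zero:

$$J(\theta) = \frac{1}{2m} \lVert X\theta - y \rVert^2$$

$$\nabla_\theta J(\theta) = \frac{1}{m} X^T (X\theta - y) = 0 \;\Longrightarrow\; X^T X \theta = X^T y \;\Longrightarrow\; \theta = (X^T X)^{-1} X^T y$$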
```python
# Compute the optimal parameters (assumes X^T X is invertible)
theta = np.linalg.inv(X_train.T.dot(X_train)).dot(X_train.T).dot(y_train)
```
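Forming the explicit inverse matches the formula, but it is not the only way to obtain $\theta$. As a sketch on synthetic, noise-free data (the arrays and `true_theta` below are made up for illustration), the same parameters can be computed with `np.linalg.solve` on the normal equations or with `np.linalg.lstsq` on the design matrix directly, both of which avoid computing the inverse:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic design matrix: bias column plus 3 random features (illustrative data only)
X_demo = np.hstack((np.ones((50, 1)), rng.normal(size=(50, 3))))
true_theta = np.array([2.0, -1.0, 0.5, 3.0])
y_demo = X_demo @ true_theta  # noise-free targets, so the exact parameters are recoverable

# 1) Explicit inverse, as in the formula above
theta_inv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo

# 2) Solve the linear system (X^T X) theta = X^T y without forming the inverse
theta_solve = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

# 3) Least-squares solver applied to X directly; also handles rank-deficient X
theta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print(np.allclose(theta_inv, theta_solve), np.allclose(theta_solve, theta_lstsq))
```

On a well-conditioned problem like this one, all three agree to floating-point precision. The difference matters mainly when $X^T X$ is close to singular, for example with strongly correlated features.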
Finally, we use the trained model to make predictions on the test set and compute its performance metrics.
```python
# Make predictions on the test set
y_pred = X_test.dot(theta)
# Compute performance metrics
mse = np.mean((y_pred - y_test) ** 2)
mae = np.mean(np.abs(y_pred - y_test))
print('Mean Squared Error: {:.2f}'.format(mse))
print('Mean Absolute Error: {:.2f}'.format(mae))
```
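MSE and MAE are reported in squared and raw target units respectively. As an optional extension that is not part of the original code, the sketch below also computes RMSE and the coefficient of determination $R^2$ with plain NumPy. The helper name `regression_report` and the toy arrays are assumptions for illustration.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Hypothetical helper: return MSE, RMSE, MAE and R^2 as a dict."""
    residuals = y_pred - y_true
    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)   # total sum of squares
    return {
        "mse": mse,
        "rmse": np.sqrt(mse),
        "mae": np.mean(np.abs(residuals)),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Toy values chosen so the arithmetic is easy to check by hand
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.0, 5.0, 8.0, 9.0])

report = regression_report(y_true_demo, y_pred_demo)
for name, value in report.items():
    print('{}: {:.4f}'.format(name, value))
```

With the arrays from the article, the equivalent call would be `regression_report(y_test, y_pred)`.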
The complete code is as follows:
```python
import numpy as np
from sklearn.datasets import load_boston  # requires scikit-learn < 1.2
from sklearn.model_selection import train_test_split
# Load the dataset
data = load_boston()
X, y = data['data'], data['target']
# Split the data into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features using training-set statistics only
mean = np.mean(X_train, axis=0)
std = np.std(X_train, axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std
# Prepend the bias column (a column of ones)
X_train = np.hstack((np.ones((X_train.shape[0], 1)), X_train))
X_test = np.hstack((np.ones((X_test.shape[0], 1)), X_test))
# Compute the optimal parameters (assumes X^T X is invertible)
theta = np.linalg.inv(X_train.T.dot(X_train)).dot(X_train.T).dot(y_train)
# Make predictions on the test set
y_pred = X_test.dot(theta)
# Compute performance metrics
mse = np.mean((y_pred - y_test) ** 2)
mae = np.mean(np.abs(y_pred - y_test))
print('Mean Squared Error: {:.2f}'.format(mse))
print('Mean Absolute Error: {:.2f}'.format(mae))
```