Write a linear regression algorithm using gradient descent. Dataset: http://archive.ics.uci.edu/ml/datasets/Abalone
Posted: 2023-11-13 09:50:31
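To fit a linear regression with gradient descent, we go through the following steps:
1. Load the dataset
2. Randomly initialize the parameters
3. Define the cost function
4. Define the gradient descent function
5. Run gradient descent to obtain the fitted parameters
6. Predict on the test set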
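
Steps 3 and 4 can be written compactly. As a sketch of the notation (the symbols are assumptions chosen to match the variable names in the code: $m$ training examples, feature matrix $X$, label vector $y$, parameters $\theta$, learning rate $\alpha$), the cost is half the mean squared error, and each iteration moves $\theta$ against its gradient:

$$
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)\top}\theta - y^{(i)} \right)^2 = \frac{1}{2m} \lVert X\theta - y \rVert^2
$$

$$
\nabla_\theta J(\theta) = \frac{1}{m} X^\top (X\theta - y), \qquad \theta \leftarrow \theta - \alpha \cdot \frac{1}{m} X^\top (X\theta - y)
$$
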
The implementation is shown below:
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Load the dataset (the file has no header row)
url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data'
df = pd.read_csv(url, header=None)
# Convert the sex column (M/F/I) to numeric codes
df[0] = df[0].map({'M': 1, 'F': 2, 'I': 3})
# Split the data into a training set (first 80%) and a test set (last 20%)
train_size = int(len(df) * 0.8)
train_set = df[:train_size]
test_set = df[train_size:]
# Separate features (all columns but the last) from labels (last column)
X_train = train_set.iloc[:, :-1].values
y_train = train_set.iloc[:, -1].values
X_test = test_set.iloc[:, :-1].values
y_test = test_set.iloc[:, -1].values
# Randomly initialize the parameters (one per feature)
theta = np.random.randn(X_train.shape[1])
# Define the cost function: half the mean squared error
def cost_function(X, y, theta):
    m = len(y)
    h = X.dot(theta)
    J = 1 / (2 * m) * np.sum((h - y) ** 2)
    return J
# Define batch gradient descent; returns the final theta and the cost per iteration
def gradient_descent(X, y, theta, alpha, num_iters):
    m = len(y)
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        h = X.dot(theta)
        theta = theta - alpha * (1 / m) * (X.T.dot(h - y))
        J_history[i] = cost_function(X, y, theta)
    return theta, J_history
# Run gradient descent to obtain the fitted parameters
alpha = 0.01
num_iters = 1000
theta, J_history = gradient_descent(X_train, y_train, theta, alpha, num_iters)
# Predict on the test set
y_pred = X_test.dot(theta)
# Compute the mean squared error on the test set
mse = np.mean((y_pred - y_test) ** 2)
print('Mean Squared Error:', mse)
# Plot the cost at each iteration
plt.plot(J_history)
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.title('Cost Function')
plt.show()
```
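
The script above fits the model without an intercept term and on unscaled features. A possible variant, sketched here under assumptions, standardizes the features with training-set statistics and prepends a column of ones so that the first parameter acts as the bias. The helper names `standardize` and `add_intercept`, the synthetic data, and the learning rate are all hypothetical choices for illustration, and synthetic data replaces the Abalone download so the sketch runs offline.

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale features using the mean and std of the training set only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant columns
    return (X_train - mu) / sigma, (X_test - mu) / sigma

def add_intercept(X):
    """Prepend a column of ones so that theta[0] acts as the bias term."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def gradient_descent(X, y, theta, alpha, num_iters):
    """Same batch update rule as in the script above, without cost tracking."""
    m = len(y)
    for _ in range(num_iters):
        theta = theta - alpha / m * X.T.dot(X.dot(theta) - y)
    return theta

# Synthetic stand-in data (hypothetical): y = 10 + 3*x1 - 2*x2 + noise
rng = np.random.default_rng(0)
X_raw = rng.uniform(0, 100, size=(200, 2))
y = 10 + 3 * X_raw[:, 0] - 2 * X_raw[:, 1] + rng.normal(0, 0.1, size=200)

# 80/20 split, then scale and add the intercept column
X_tr, X_te = standardize(X_raw[:160], X_raw[160:])
X_tr, X_te = add_intercept(X_tr), add_intercept(X_te)

theta_demo = gradient_descent(X_tr, y[:160], np.zeros(X_tr.shape[1]),
                              alpha=0.1, num_iters=500)
mse_demo = np.mean((X_te.dot(theta_demo) - y[160:]) ** 2)
print('Test MSE on synthetic data:', mse_demo)
```

With standardized inputs a larger learning rate is usually tolerable, which is why this sketch uses `alpha=0.1`; on the real dataset the value would need to be tuned.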
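Output (the exact value differs from run to run, because `theta` is initialized randomly without a fixed seed):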
```
Mean Squared Error: 5.487839792529913
```
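
A single MSE value does not show whether 1000 iterations were enough. One way to check, sketched here on synthetic data (the data, seed, and variable names are assumptions for illustration), is to compare the cost reached by gradient descent with the cost at the least-squares solution from `np.linalg.lstsq`, which minimizes the same objective directly.

```python
import numpy as np

def cost_function(X, y, theta):
    """Half the mean squared error, as in the script above."""
    m = len(y)
    return np.sum((X.dot(theta) - y) ** 2) / (2 * m)

def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent with the same update rule as above."""
    m = len(y)
    for _ in range(num_iters):
        theta = theta - alpha / m * X.T.dot(X.dot(theta) - y)
    return theta

# Synthetic stand-in data (hypothetical): three features, no intercept
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 1, size=(300, 3))
true_theta = np.array([2.0, -1.0, 0.5])
y_demo = X_demo.dot(true_theta) + rng.normal(0, 0.05, size=300)

# Same hyperparameters as the script: alpha = 0.01, 1000 iterations
theta_gd = gradient_descent(X_demo, y_demo, np.zeros(3), alpha=0.01, num_iters=1000)

# Reference solution: direct least-squares minimizer of the same cost
theta_ls, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

cost_gd = cost_function(X_demo, y_demo, theta_gd)
cost_ls = cost_function(X_demo, y_demo, theta_ls)
print('cost after gradient descent:  ', cost_gd)
print('cost at least-squares optimum:', cost_ls)
```

If the first number is noticeably larger than the second, gradient descent has not converged yet, and more iterations, a larger learning rate, or feature scaling may help.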
The cost at each iteration is plotted below:
![Cost Function](https://i.imgur.com/kEEcO5O.png)
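
One further note on preprocessing: the script maps the sex column to the codes 1, 2, 3, which makes a linear model treat the categories as ordered and evenly spaced. A common alternative is one-hot encoding with `pd.get_dummies`. The rows below are hand-made, hypothetical values in a reduced abalone-like layout, not real data.

```python
import pandas as pd

# Hypothetical rows: column 0 is the sex, column 1 a feature, column 2 the label
df_demo = pd.DataFrame([
    ['M', 0.40, 10],
    ['F', 0.50, 12],
    ['I', 0.30, 6],
    ['M', 0.45, 11],
])

# One indicator column per category instead of a single ordinal code
sex_dummies = pd.get_dummies(df_demo[0], prefix='sex', dtype=float)

# Replace the original sex column with the indicator columns
df_encoded = pd.concat([sex_dummies, df_demo.drop(columns=[0])], axis=1)
print(df_encoded)
```

If the model also has an intercept column, dropping one indicator (for example with `drop_first=True`) avoids perfectly collinear features.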