Can you implement training and classification on the MNIST dataset using both gradient descent and Newton's method? Please give a Python implementation.
Sure! Below is Python code that trains and classifies the MNIST dataset using gradient descent and Newton's method.
First, we load the MNIST dataset using the `keras` API bundled with `tensorflow`:
```python
import tensorflow as tf
from tensorflow import keras
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Scale pixel values to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0
```
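As a quick sanity check, the loaded arrays should have the standard MNIST shapes (60,000 training and 10,000 test images of 28×28 pixels):
```python
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)
```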
Next, we train with gradient descent. Since plain logistic regression is a binary classifier, we cast the problem as distinguishing the digit 0 from all other digits:
```python
import numpy as np

# Flatten each 28x28 image into a 784-dimensional row vector
def flatten(X):
    m = X.shape[0]
    return X.reshape(m, -1)

# Prepend a bias column of ones, reshape labels, and initialize parameters
def init(X, y):
    m, n = X.shape
    X = np.hstack((np.ones((m, 1)), X))
    y = y.reshape(-1, 1)
    theta = np.zeros((n + 1, 1))
    return X, y, theta

# Sigmoid activation
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Cross-entropy loss; h is clipped away from 0 and 1 to avoid log(0)
def cost(theta, X, y):
    m = len(y)
    h = np.clip(sigmoid(X @ theta), 1e-10, 1 - 1e-10)
    J = -1/m * (y.T @ np.log(h) + (1 - y).T @ np.log(1 - h))
    return J.item()

# Gradient of the loss
def gradient(theta, X, y):
    m = len(y)
    h = sigmoid(X @ theta)
    grad = 1/m * X.T @ (h - y)
    return grad

# Batch gradient descent
def gradient_descent(X, y, theta, alpha, num_iters):
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        grad = gradient(theta, X, y)
        theta -= alpha * grad
        J_history[i] = cost(theta, X, y)
    return J_history, theta

# Plain logistic regression is binary, so relabel: digit 0 -> 1, others -> 0
y_train_bin = (y_train == 0).astype(np.float64)
y_test_bin = (y_test == 0).astype(np.float64)

# Initialization
X_train_flatten = flatten(x_train)
X_test_flatten = flatten(x_test)
X_train, Y_train, theta = init(X_train_flatten, y_train_bin)

# Train with gradient descent
alpha = 0.1
num_iters = 1000
J_history, theta = gradient_descent(X_train, Y_train, theta, alpha, num_iters)

# Predict on the test set
X_test, Y_test, _ = init(X_test_flatten, y_test_bin)
y_pred = np.round(sigmoid(X_test @ theta))
accuracy = np.mean(y_pred == Y_test) * 100
print("Accuracy:", accuracy)
```
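To check that training actually converged, one option (assuming `matplotlib` is installed; it is not used elsewhere in this answer) is to plot the recorded loss history:
```python
import matplotlib.pyplot as plt

# The loss should decrease monotonically for a suitable learning rate
plt.plot(J_history)
plt.xlabel("Iteration")
plt.ylabel("Cross-entropy loss")
plt.title("Gradient descent convergence")
plt.show()
```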
Next, we train the same binary classifier with Newton's method, which uses second-order (Hessian) information and typically needs far fewer iterations than gradient descent:
```python
# flatten, init, sigmoid, cost, and gradient are reused from above

# Hessian of the loss: X^T diag(h*(1-h)) X / m, computed without
# materializing the huge m x m diagonal matrix
def hessian(theta, X):
    m = X.shape[0]
    h = sigmoid(X @ theta)
    w = (h * (1 - h)).ravel()
    H = 1/m * (X.T * w) @ X
    return H

# Newton's method; a small ridge term keeps H invertible, because the
# always-zero border pixels of MNIST make the raw Hessian singular
def newton(theta, X, y, num_iters=10, lam=1e-4):
    J_history = []
    n = X.shape[1]
    for i in range(num_iters):
        grad = gradient(theta, X, y)
        H = hessian(theta, X) + lam * np.eye(n)
        theta -= np.linalg.solve(H, grad)  # solve H @ step = grad
        J_history.append(cost(theta, X, y))
    return J_history, theta

# Initialization (same binary "digit 0 vs. rest" task as above)
X_train_flatten = flatten(x_train)
X_test_flatten = flatten(x_test)
X_train, Y_train, theta = init(X_train_flatten, y_train_bin)

# Train with Newton's method
J_history, theta = newton(theta, X_train, Y_train)

# Predict on the test set
X_test, Y_test, _ = init(X_test_flatten, y_test_bin)
y_pred = np.round(sigmoid(X_test @ theta))
accuracy = np.mean(y_pred == Y_test) * 100
print("Accuracy:", accuracy)
```
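Each Newton iteration solves the linear system H·Δθ = ∇J(θ) and updates θ ← θ − Δθ; for logistic regression this is exactly the iteratively reweighted least squares (IRLS) update, which is why about ten iterations usually suffice where gradient descent needs hundreds.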
That covers training and classifying the MNIST dataset with gradient descent and Newton's method in Python. Note that we only used a simple binary logistic regression model here; in practice, a multiclass model such as softmax regression, or a more sophisticated architecture, is needed to classify all ten digits and reach higher accuracy.
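For reference, a minimal sketch of the multiclass extension (softmax regression trained with gradient descent) might look like the following; the helper names `one_hot`, `softmax`, and `train_softmax` are illustrative, not part of the code above:
```python
import numpy as np

# One-hot encode integer labels into an (m, num_classes) matrix
def one_hot(y, num_classes=10):
    Y = np.zeros((y.shape[0], num_classes))
    Y[np.arange(y.shape[0]), y] = 1
    return Y

# Row-wise softmax, shifted by the row max for numerical stability
def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    e = np.exp(Z)
    return e / e.sum(axis=1, keepdims=True)

# Gradient descent on the multiclass cross-entropy loss
def train_softmax(X, y, alpha=0.1, num_iters=500, num_classes=10):
    m, n = X.shape
    Y = one_hot(y, num_classes)
    W = np.zeros((n, num_classes))
    for _ in range(num_iters):
        P = softmax(X @ W)           # (m, 10) class probabilities
        grad = X.T @ (P - Y) / m     # gradient of the cross-entropy loss
        W -= alpha * grad
    return W

# Usage with the bias-augmented matrices from init() and the original
# 0-9 labels (not the binarized ones):
# W = train_softmax(X_train, y_train)
# accuracy = np.mean(np.argmax(X_test @ W, axis=1) == y_test) * 100
```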