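Implementing a BP neural network in NumPy, training it on the MNIST dataset, and saving the model
Date: 2023-12-30 14:03:38
The following code implements a BP (backpropagation) neural network in NumPy and trains it on the MNIST dataset: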
```python
import numpy as np
from keras.datasets import mnist

# Load the dataset and preprocess it: flatten each 28x28 image into a
# 784-dimensional vector scaled to [0, 1], and one-hot encode the labels
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], -1) / 255.
x_test = x_test.reshape(x_test.shape[0], -1) / 255.
y_train = np.eye(10)[y_train]
y_test = np.eye(10)[y_test]

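# Define the neural network model
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Scale the initial weights by 1/sqrt(fan_in)
        self.W1 = np.random.randn(input_size, hidden_size) / np.sqrt(input_size)
        self.b1 = np.zeros(hidden_size)
        self.W2 = np.random.randn(hidden_size, output_size) / np.sqrt(hidden_size)
        self.b2 = np.zeros(output_size)

    def forward(self, X):
        # Hidden layer with tanh activation
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = np.tanh(self.z1)
        # Output layer with softmax; subtracting the row-wise maximum
        # before exponentiating prevents overflow in np.exp
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        exp_scores = np.exp(self.z2 - np.max(self.z2, axis=1, keepdims=True))
        self.probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
        return self.probs

    def backward(self, X, y, learning_rate):
        # Gradient of the mean cross-entropy loss with respect to z2.
        # Dividing by the number of samples keeps the step size
        # independent of how many samples are passed in.
        delta3 = (self.probs - y) / X.shape[0]
        dW2 = np.dot(self.a1.T, delta3)
        db2 = np.sum(delta3, axis=0)
        # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2
        delta2 = np.dot(delta3, self.W2.T) * (1 - np.power(self.a1, 2))
        dW1 = np.dot(X.T, delta2)
        db1 = np.sum(delta2, axis=0)
        # Gradient descent update
        self.W1 -= learning_rate * dW1
        self.b1 -= learning_rate * db1
        self.W2 -= learning_rate * dW2
        self.b2 -= learning_rate * db2

    def train(self, X, y, learning_rate=0.01, num_epochs=10000, print_loss=False):
        # Full-batch gradient descent: each epoch uses all samples in X
        for epoch in range(num_epochs):
            self.forward(X)
            self.backward(X, y, learning_rate)
            if print_loss and epoch % 1000 == 0:
                loss = self.calculate_loss(X, y)
                print("Epoch {0}: loss = {1}".format(epoch, loss))

    def predict(self, X):
        self.forward(X)
        return np.argmax(self.probs, axis=1)

    def calculate_loss(self, X, y):
        # Mean cross-entropy; the small constant guards against log(0)
        self.forward(X)
        return -np.sum(y * np.log(self.probs + 1e-12)) / X.shape[0]
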
# Create the network and train it
nn = NeuralNetwork(784, 128, 10)
nn.train(x_train, y_train, learning_rate=0.1, num_epochs=10000, print_loss=True)

# Evaluate the model on the test set
pred = nn.predict(x_test)
accuracy = np.mean(pred == np.argmax(y_test, axis=1))
print("Test accuracy: {0}".format(accuracy))

# Save the model
np.savez("model.npz", W1=nn.W1, b1=nn.b1, W2=nn.W2, b2=nn.b2)

# Load the model
model = np.load("model.npz")
nn = NeuralNetwork(784, 128, 10)
nn.W1 = model["W1"]
nn.b1 = model["b1"]
nn.W2 = model["W2"]
nn.b2 = model["b2"]

# Evaluate the loaded model on the test set
pred = nn.predict(x_test)
accuracy = np.mean(pred == np.argmax(y_test, axis=1))
print("Loaded model test accuracy: {0}".format(accuracy))
```
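As a sanity check on the gradient formulas used in `backward` above, the following sketch compares them with central finite differences on a tiny random problem. It is an illustration and not part of the original code: the helper names `loss_and_grads` and `numerical_grad`, the network sizes, and the choice to average the loss over the batch are all assumptions made for this example.

```python
import numpy as np

def loss_and_grads(params, X, y):
    """Mean cross-entropy loss and analytic gradients for a tanh/softmax network."""
    W1, b1, W2, b2 = params
    a1 = np.tanh(X @ W1 + b1)
    z2 = a1 @ W2 + b2
    z2 = z2 - z2.max(axis=1, keepdims=True)  # stabilise the exponentials
    probs = np.exp(z2)
    probs /= probs.sum(axis=1, keepdims=True)
    n = X.shape[0]
    loss = -np.sum(y * np.log(probs)) / n
    # Same formulas as in backward, with the 1/n factor of the mean loss
    delta3 = (probs - y) / n
    delta2 = (delta3 @ W2.T) * (1 - a1 ** 2)
    grads = [X.T @ delta2, delta2.sum(axis=0), a1.T @ delta3, delta3.sum(axis=0)]
    return loss, grads

def numerical_grad(params, X, y, index, eps=1e-5):
    """Central-difference gradient of the loss with respect to params[index]."""
    p = params[index]
    grad = np.zeros_like(p)
    for idx in np.ndindex(p.shape):
        old = p[idx]
        p[idx] = old + eps
        loss_plus, _ = loss_and_grads(params, X, y)
        p[idx] = old - eps
        loss_minus, _ = loss_and_grads(params, X, y)
        p[idx] = old  # restore the original value
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Tiny random problem (hypothetical sizes): 5 samples, 4 inputs, 6 hidden units, 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
y = np.eye(3)[rng.integers(0, 3, size=5)]
params = [rng.normal(size=(4, 6)) * 0.5, np.zeros(6),
          rng.normal(size=(6, 3)) * 0.5, np.zeros(3)]

_, analytic = loss_and_grads(params, X, y)
max_errors = [np.max(np.abs(analytic[i] - numerical_grad(params, X, y, i)))
              for i in range(4)]
print(max_errors)  # one maximum absolute difference per parameter array
```

If the analytic and numerical gradients agree to within a small tolerance for every parameter array, the backpropagation formulas are very likely implemented correctly.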
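In this code, we first load the MNIST dataset through Keras and preprocess it: each 28×28 image is flattened into a 784-dimensional feature vector scaled to [0, 1], and each label is one-hot encoded over the 10 digit classes. We then define a NeuralNetwork class that implements a network with a single hidden layer, using tanh as the hidden-layer activation and softmax as the output-layer activation. During training, the model parameters are updated with the backpropagation algorithm. Finally, we save the model to an npz file and load that file again to evaluate the model on the test set.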
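The save-and-load step can also be checked in isolation. The sketch below writes a set of randomly generated arrays with `np.savez`, reads them back with `np.load`, and confirms that they are unchanged. The random weights stand in for a trained model, and writing to a temporary directory is an assumption made so that the example leaves no files behind.

```python
import os
import tempfile
import numpy as np

# Stand-in weights with the same shapes as a 784-128-10 network
rng = np.random.default_rng(0)
weights = {
    "W1": rng.normal(size=(784, 128)),
    "b1": np.zeros(128),
    "W2": rng.normal(size=(128, 10)),
    "b2": np.zeros(10),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.npz")
    # Each keyword argument becomes a named array inside the archive
    np.savez(path, **weights)
    # Read every array into memory before the archive is closed
    with np.load(path) as data:
        restored = {name: data[name] for name in data.files}

roundtrip_ok = all(np.array_equal(weights[k], restored[k]) for k in weights)
print(sorted(restored), roundtrip_ok)
```

Storing the arrays under explicit names, as the original code does, means the loading side can look them up by key instead of relying on their order.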
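Note that because this model is implemented in NumPy and every epoch runs over the entire training set, training may take quite a long time. If you want faster training, you can implement the model with a deep learning framework such as TensorFlow or PyTorch.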
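Staying within NumPy, one common way to shorten training is mini-batch gradient descent, where the parameters are updated after each small batch and not once per pass over the whole dataset. The sketch below is not the original author's method. The helpers `iterate_minibatches`, `forward`, `mean_loss`, and `sgd_step`, the batch size of 32, and the small synthetic dataset used in place of MNIST are all assumptions chosen so that the example runs quickly on its own.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs that cover the data once."""
    order = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

def forward(params, X):
    """Return the tanh hidden activations and the softmax probabilities."""
    W1, b1, W2, b2 = params
    a1 = np.tanh(X @ W1 + b1)
    z2 = a1 @ W2 + b2
    z2 = z2 - z2.max(axis=1, keepdims=True)  # avoid overflow in exp
    exp_scores = np.exp(z2)
    return a1, exp_scores / exp_scores.sum(axis=1, keepdims=True)

def mean_loss(params, X, y):
    """Mean cross-entropy loss over all rows of X."""
    _, probs = forward(params, X)
    return -np.sum(y * np.log(probs + 1e-12)) / X.shape[0]

def sgd_step(params, X, y, learning_rate):
    """One gradient step on a single mini-batch, updating params in place."""
    W2 = params[2]
    a1, probs = forward(params, X)
    delta3 = (probs - y) / X.shape[0]
    delta2 = (delta3 @ W2.T) * (1 - a1 ** 2)
    grads = [X.T @ delta2, delta2.sum(axis=0), a1.T @ delta3, delta3.sum(axis=0)]
    for p, g in zip(params, grads):
        p -= learning_rate * g

# Synthetic 3-class data (a stand-in for MNIST): noisy points around three centres
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
X = 3.0 * np.eye(3, 4)[labels] + 0.5 * rng.normal(size=(300, 4))
y = np.eye(3)[labels]

params = [rng.normal(size=(4, 8)) / np.sqrt(4), np.zeros(8),
          rng.normal(size=(8, 3)) / np.sqrt(8), np.zeros(3)]

initial_loss = mean_loss(params, X, y)
for epoch in range(20):
    for X_batch, y_batch in iterate_minibatches(X, y, batch_size=32, rng=rng):
        sgd_step(params, X_batch, y_batch, learning_rate=0.1)
final_loss = mean_loss(params, X, y)
print(initial_loss, final_loss)
```

With mini-batches, a single pass over the data performs many parameter updates, so a model of this kind typically needs far fewer passes than full-batch gradient descent. The batch size and learning rate would need tuning for real data.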