Build LeNet-5 with TensorFlow to recognize handwritten digits on the MNIST dataset, and augment the dataset so that the model generalizes well
Posted: 2024-03-08 12:47:35
Sure, here is one way to do it.
First, import the required libraries: TensorFlow, NumPy, and Matplotlib. Then prepare the dataset, using the MNIST dataset that ships with TensorFlow.
```
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Show a few training samples
fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for i, ax in enumerate(axes):
    ax.imshow(train_images[i], cmap='gray')
    ax.set_title(f"label: {train_labels[i]}")
    ax.axis('off')
plt.show()
```
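The walkthrough feeds raw 0-255 pixel values to the network. A common optional step, not part of the original answer, is to scale pixels to [0, 1] and add the channel axis up front. The sketch below uses a hypothetical helper named `preprocess` and a synthetic batch in place of the real MNIST arrays.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixels to float32 in [0, 1] and append a channel axis."""
    images = images.astype(np.float32) / 255.0
    return images[..., np.newaxis]  # (N, 28, 28) -> (N, 28, 28, 1)

# Synthetic stand-in for a small batch of MNIST images (assumption: real
# MNIST arrays have the same shape and dtype: (N, 28, 28), uint8).
batch = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
prepared = preprocess(batch)
print(prepared.shape, prepared.dtype)
```

If this step is used for training, the same preprocessing has to be applied to any image passed to the model at evaluation or prediction time.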
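Next, we augment the dataset to improve the model's ability to generalize. Common augmentation operations include:
- Random flipping
- Random rotation
- Random cropping
- Adding Gaussian noise
Here we use random flipping and random rotation as examples (keep in mind that a flip can change a digit's identity: a 6 flipped in both directions reads as a 9):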
```
# Random flip. Caution: flipping can change a digit's identity
# (flipping in both directions turns a 6 into a 9).
def random_flip(image):
    # tf.image ops need a channel axis, so image has shape (28, 28, 1)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    return image.numpy()

# Random rotation by an angle in [-15, 15] degrees
def random_rotate(image):
    angle = np.random.uniform(-15, 15)
    # apply_affine_transform takes a 3D NumPy array and a plain float angle.
    # It requires SciPy and may be missing from Keras 3 based TensorFlow releases.
    return tf.keras.preprocessing.image.apply_affine_transform(
        image, theta=angle, row_axis=0, col_axis=1, channel_axis=2)

# Apply both augmentations to every image in an (N, 28, 28) array
def augment(images):
    augmented = []
    for image in images:
        image = image[..., np.newaxis]   # (28, 28) -> (28, 28, 1)
        image = random_flip(image)
        image = random_rotate(image)
        augmented.append(image[..., 0])  # back to (28, 28)
    return np.array(augmented)

# Augment the training data
augmented_train_images = augment(train_images)
# Merge the augmented data with the original data
train_images = np.concatenate([train_images, augmented_train_images], axis=0)
train_labels = np.concatenate([train_labels, train_labels], axis=0)
# Augment the test data in the same way. This measures robustness to the
# augmentations; scores on the unmodified test set are the usual benchmark.
augmented_test_images = augment(test_images)
test_images = np.concatenate([test_images, augmented_test_images], axis=0)
test_labels = np.concatenate([test_labels, test_labels], axis=0)
```
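The list of augmentations above also names random cropping and Gaussian noise, which the example does not cover. The following is a minimal NumPy sketch of both, not the original answer's method. The helper names `add_gaussian_noise` and `random_crop`, the noise level `stddev=10.0`, and the padding `pad=2` are illustrative assumptions, and a synthetic image stands in for an MNIST digit.

```python
import numpy as np

def add_gaussian_noise(image, stddev=10.0, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image and clip to [0, 255]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(np.float32) + rng.normal(0.0, stddev, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def random_crop(image, pad=2, rng=None):
    """Zero-pad by `pad` pixels per side, then crop a random window of the original size."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape
    padded = np.pad(image, pad, mode="constant", constant_values=0)
    top = rng.integers(0, 2 * pad + 1)   # vertical offset in [0, 2 * pad]
    left = rng.integers(0, 2 * pad + 1)  # horizontal offset in [0, 2 * pad]
    return padded[top:top + height, left:left + width]

# Synthetic 28x28 image used as a stand-in for one MNIST digit
rng = np.random.default_rng(0)
sample = np.zeros((28, 28), dtype=np.uint8)
sample[10:18, 12:16] = 255
augmented = random_crop(add_gaussian_noise(sample, rng=rng), rng=rng)
print(augmented.shape, augmented.dtype)
```

Padding before cropping keeps the output at 28×28, so the result can be fed to the same network without resizing; the effect is a small random shift of the digit.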
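Now we can build the LeNet-5 model. LeNet-5 is a classic convolutional neural network made up of two convolutional layers, two pooling layers, and three fully connected layers. The version used here is a common adaptation with ReLU activations, max pooling, and 28×28 inputs, rather than the 32×32 inputs of the original design.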
```
# Build the LeNet-5 model
model = tf.keras.Sequential([
    # Conv1
    tf.keras.layers.Conv2D(filters=6, kernel_size=(5, 5), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Conv2
    tf.keras.layers.Conv2D(filters=16, kernel_size=(5, 5), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten
    tf.keras.layers.Flatten(),
    # FC1
    tf.keras.layers.Dense(units=120, activation='relu'),
    # FC2
    tf.keras.layers.Dense(units=84, activation='relu'),
    # FC3
    tf.keras.layers.Dense(units=10, activation='softmax')
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model
model.fit(train_images.reshape(-1, 28, 28, 1), train_labels, epochs=10, batch_size=128)
# Evaluate the model
test_loss, test_acc = model.evaluate(test_images.reshape(-1, 28, 28, 1), test_labels)
print(f"Test accuracy: {test_acc}")
# Save the model so it can be reloaded later for inference
model.save("lenet5.h5")
```
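To see how the layer sizes in the model above fit together, the sketch below works through the feature-map sizes and parameter counts by hand. It assumes the Keras defaults that the model relies on: `padding='valid'` and stride 1 for `Conv2D`, and a pooling stride equal to `pool_size` for `MaxPooling2D`. The helper names are illustrative.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

size = 28
size = window_out(size, kernel=5)            # Conv1: 28 -> 24
size = window_out(size, kernel=2, stride=2)  # Pool1: 24 -> 12
size = window_out(size, kernel=5)            # Conv2: 12 -> 8
size = window_out(size, kernel=2, stride=2)  # Pool2: 8 -> 4
flattened = size * size * 16                 # 4 * 4 * 16 = 256

total_params = (
    conv_params(5, 1, 6)            # Conv1: 156
    + conv_params(5, 6, 16)         # Conv2: 2416
    + dense_params(flattened, 120)  # FC1: 30840
    + dense_params(120, 84)         # FC2: 10164
    + dense_params(84, 10)          # FC3: 850
)
print(size, flattened, total_params)  # 4 256 44426
```

The total can be compared against the output of `model.summary()` as a quick sanity check on the architecture.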
Finally, we can use the trained model to recognize handwritten digits. Here is an example:
```
# Load the model (assumes it was saved as "lenet5.h5" after training)
model = tf.keras.models.load_model("lenet5.h5")
# Load the test data
(_, _), (test_images, test_labels) = mnist.load_data()
# Predict a single image
image_index = 0
image = test_images[image_index]
label = test_labels[image_index]
plt.imshow(image, cmap='gray')
plt.show()
prediction = model.predict(image.reshape(-1, 28, 28, 1))
predicted_label = np.argmax(prediction[0])
print(f"Ground truth: {label}, Predicted label: {predicted_label}")
```
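The single-image prediction above extends naturally to a whole batch: `model.predict` returns one row of class probabilities per image, and `np.argmax` along axis 1 picks the label for each row. The sketch below shows that post-processing on a small hand-written `probabilities` array with three classes, which is a stand-in for real model output; the `confusion_matrix` helper is a hypothetical name introduced for illustration.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

# Stand-in for model.predict(...) output: 4 samples, 3 classes
probabilities = np.array([
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])
true_labels = np.array([1, 0, 2, 1])

predicted_labels = np.argmax(probabilities, axis=1)  # one label per row
accuracy = np.mean(predicted_labels == true_labels)
matrix = confusion_matrix(true_labels, predicted_labels, num_classes=3)
print(predicted_labels, accuracy)
print(matrix)
```

For MNIST, the off-diagonal entries of the 10×10 matrix show which digit pairs the model confuses most often.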
I hope this example helps!