Reading the MNIST dataset in Python
Date: 2023-08-13 10:07:49
To read the MNIST dataset in Python, you can use the following code:
```python
import gzip

import numpy as np


def load_data():
    # Training images: skip the 16-byte header, one flattened 28*28 image per row
    with gzip.open('train-images-idx3-ubyte.gz', 'rb') as f:
        train_images = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28*28)
    # Training labels: skip the 8-byte header
    with gzip.open('train-labels-idx1-ubyte.gz', 'rb') as f:
        train_labels = np.frombuffer(f.read(), np.uint8, offset=8)
    # Test images
    with gzip.open('t10k-images-idx3-ubyte.gz', 'rb') as f:
        test_images = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28*28)
    # Test labels
    with gzip.open('t10k-labels-idx1-ubyte.gz', 'rb') as f:
        test_labels = np.frombuffer(f.read(), np.uint8, offset=8)
    return (train_images, train_labels), (test_images, test_labels)


# Call load_data to load the dataset
(train_images, train_labels), (test_images, test_labels) = load_data()
```
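In this code, we use the `gzip` module to open and decompress the MNIST dataset files. The `offset` argument skips the header at the start of each file, so that only the pixel data and label data are read: image files have a 16-byte header (magic number, image count, rows, columns) and label files have an 8-byte header (magic number, label count). Each image is flattened into a row of 28*28 = 784 values. Finally, the training and test sets are stored in `train_images`, `train_labels`, `test_images`, and `test_labels`. Make sure the MNIST files are named exactly `train-images-idx3-ubyte.gz`, `train-labels-idx1-ubyte.gz`, `t10k-images-idx3-ubyte.gz`, and `t10k-labels-idx1-ubyte.gz`, and that they are in the directory you run the script from, since the relative paths are resolved against the current working directory.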
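The fixed offsets of 16 and 8 follow from the IDX header layout, so they can also be derived from the file instead of hard-coded. The following is a minimal sketch of that idea, using only the standard library. The helper name `parse_idx_header` is hypothetical and not part of any library, and the demo uses a small synthetic buffer in place of a real MNIST file.

```python
import math
import struct


def parse_idx_header(data: bytes):
    """Return (shape, offset) for a decompressed IDX file held in memory.

    Hypothetical helper for illustration; it only accepts unsigned-byte
    data (type code 0x08), which is what the MNIST files use.
    """
    # Bytes 0-1 are zero, byte 2 is the data type code,
    # byte 3 is the number of dimensions.
    zero, dtype_code, ndim = struct.unpack_from(">HBB", data, 0)
    if zero != 0:
        raise ValueError("not an IDX file: first two bytes must be zero")
    if dtype_code != 0x08:
        raise ValueError(f"expected unsigned-byte data (0x08), got {dtype_code:#04x}")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack_from(f">{ndim}I", data, 4)
    offset = 4 + 4 * ndim
    # The payload must contain exactly one byte per element.
    if len(data) - offset != math.prod(shape):
        raise ValueError("payload size does not match the header dimensions")
    return shape, offset


# Synthetic stand-in for an image file: 2 images of 28x28 zero pixels.
fake_images = struct.pack(">HBB3I", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
shape, offset = parse_idx_header(fake_images)
print(shape, offset)  # (2, 28, 28) 16
```

With a real file, the bytes returned by `f.read()` could be passed to this helper, and the resulting `offset` and `shape` used in `np.frombuffer(data, np.uint8, offset=offset).reshape(shape)`. This catches a misnamed or truncated file early instead of silently producing a wrongly shaped array.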