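Write code that, in the style of the MNIST dataset, turns my own dataset of 6,000 image files into MNIST-like training and test sets, plus txt files that map each image file name to its class label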
Posted: 2024-02-06 16:11:00 Views: 64
Code for building your own dataset modeled on MNIST
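Sure. You can do this with Python's PIL (Pillow) and NumPy libraries. Note that the code below fills the dataset with randomly generated placeholder images and labels, so you will need to swap in your own images. Here is the code: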
```python
import os

import numpy as np
from PIL import Image

# Dataset directory and output file names
dataset_path = "dataset/"
train_images_file = "train-images.npy"
train_labels_file = "train-labels.npy"
test_images_file = "test-images.npy"
test_labels_file = "test-labels.npy"
train_txt_file = "train.txt"
test_txt_file = "test.txt"

# Image side length in pixels and total number of samples
image_size = 28
dataset_size = 6000

# 80/20 split between training and test sets
train_size = int(dataset_size * 0.8)
test_size = dataset_size - train_size

# Create the output directory if it does not exist yet
os.makedirs(dataset_path, exist_ok=True)

# Build the dataset
images = np.zeros((dataset_size, image_size, image_size), dtype=np.uint8)
labels = np.zeros(dataset_size, dtype=np.uint8)
for i in range(dataset_size):
    # Placeholder data: a random grayscale image and a random label in 0-9.
    # Replace these two statements with code that loads your own image and label.
    pixels = np.random.randint(0, 256, size=(image_size, image_size), dtype=np.uint8)
    image = Image.fromarray(pixels)
    label = np.random.randint(0, 10)
    # Save the image as <index>.png and record its pixels and label
    image.save(os.path.join(dataset_path, f"{i}.png"))
    images[i] = np.array(image)
    labels[i] = label

# Split into training and test sets
train_images = images[:train_size]
train_labels = labels[:train_size]
test_images = images[train_size:]
test_labels = labels[train_size:]

# Save both sets as NumPy arrays
np.save(os.path.join(dataset_path, train_images_file), train_images)
np.save(os.path.join(dataset_path, train_labels_file), train_labels)
np.save(os.path.join(dataset_path, test_images_file), test_images)
np.save(os.path.join(dataset_path, test_labels_file), test_labels)

# Write the txt files: one "<file name> <label>" pair per line
with open(os.path.join(dataset_path, train_txt_file), "w") as f:
    for i in range(train_size):
        f.write(f"{i}.png {train_labels[i]}\n")
with open(os.path.join(dataset_path, test_txt_file), "w") as f:
    for i in range(test_size):
        # Test images continue the file numbering after the training images
        f.write(f"{i + train_size}.png {test_labels[i]}\n")

print("Dataset created!")
```
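This code generates a dataset of 6,000 random 28×28 grayscale images with random labels from 0 to 9, splits it 80/20 into a training set (4,800 images) and a test set (1,200 images), and saves both as NumPy arrays together with txt files that list each image file name and its label. To adapt it to your own data, set the dataset_path variable to your own dataset folder and change the file names, dataset size, and other parameters to match your dataset.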
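Since the question is about images you already have, the random generation step can be replaced by a loader. The following is only a minimal sketch under stated assumptions, not a definitive implementation: it assumes a hypothetical layout with one sub-folder per class, named after the integer label (for example `my_images/0/a.png`), and the helper names `load_image_folder` and `split_and_save` are made up for illustration. It also shuffles before splitting, which the code above does not do.

```python
import os

import numpy as np
from PIL import Image

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")


def load_image_folder(root, image_size=28):
    """Load root/<label>/<file> images as 8-bit grayscale arrays.

    Assumption: each class lives in a sub-folder whose name is its
    integer label (0-255, since labels are stored as uint8).
    """
    names, images, labels = [], [], []
    for label_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label_name)
        if not os.path.isdir(class_dir) or not label_name.isdigit():
            continue  # skip anything that is not a numeric class folder
        for file_name in sorted(os.listdir(class_dir)):
            if not file_name.lower().endswith(IMAGE_EXTENSIONS):
                continue  # skip non-image files
            with Image.open(os.path.join(class_dir, file_name)) as img:
                # Convert to grayscale and resize to image_size x image_size
                resized = img.convert("L").resize((image_size, image_size))
                images.append(np.array(resized, dtype=np.uint8))
            labels.append(int(label_name))
            names.append(f"{label_name}/{file_name}")
    return names, np.stack(images), np.array(labels, dtype=np.uint8)


def split_and_save(names, images, labels, out_dir, train_ratio=0.8, seed=0):
    """Shuffle, split, and write .npy arrays plus '<name> <label>' txt files."""
    os.makedirs(out_dir, exist_ok=True)
    order = np.random.default_rng(seed).permutation(len(names))
    train_size = int(len(names) * train_ratio)
    parts = {"train": order[:train_size], "test": order[train_size:]}
    for split, idx in parts.items():
        np.save(os.path.join(out_dir, f"{split}-images.npy"), images[idx])
        np.save(os.path.join(out_dir, f"{split}-labels.npy"), labels[idx])
        with open(os.path.join(out_dir, f"{split}.txt"), "w") as f:
            for i in idx:
                f.write(f"{names[i]} {labels[i]}\n")
    return train_size, len(names) - train_size
```

A typical call would be `names, images, labels = load_image_folder("my_images")` followed by `split_and_save(names, images, labels, "dataset")`, where both folder names are placeholders. If your labels come from somewhere else, such as a CSV file or a prefix in the file name, only the part of `load_image_folder` that derives `labels` needs to change.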