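Generate code that builds a training image set, a test image set, a training label set, and a test label set from a provided phone photo album classification dataset, and saves each of the four parts as a NumPy file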
Date: 2024-05-16 08:14:04
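Below is an example that turns the provided phone photo album classification dataset into training images, test images, training labels, and test labels, and saves each of the four parts as a NumPy file: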
```python
import os
import numpy as np
from PIL import Image

# Dataset root: one sub-folder per class, each holding that class's images
data_path = '/path/to/dataset'

# Fraction of samples used for training; the remainder becomes the test set
train_ratio = 0.8

# Collect every image path together with its label (the folder name)
image_paths = []
labels = []
for label_name in sorted(os.listdir(data_path)):
    label_path = os.path.join(data_path, label_name)
    if os.path.isdir(label_path):
        for image_name in sorted(os.listdir(label_path)):
            image_path = os.path.join(label_path, image_name)
            image_paths.append(image_path)
            labels.append(label_name)

# Build the label-to-index mapping once, from all labels, so that the
# training and test sets use the same index for the same class
label_to_index = {label: index for index, label in enumerate(sorted(set(labels)))}

# Shuffle the dataset
indices = np.arange(len(image_paths))
np.random.shuffle(indices)
image_paths = np.array(image_paths)[indices]
labels = np.array(labels)[indices]

# Split the dataset; everything after the training slice goes to the test
# set, so no sample is lost to rounding
train_size = int(len(image_paths) * train_ratio)
train_image_paths = image_paths[:train_size]
train_labels = labels[:train_size]
test_image_paths = image_paths[train_size:]
test_labels = labels[train_size:]


# Load the images and save them as a single NumPy array
def load_and_save_images(image_paths, save_path):
    images = []
    for image_path in image_paths:
        with Image.open(image_path) as image:
            # Convert to RGB so grayscale or RGBA photos produce the same
            # (224, 224, 3) shape as every other image
            image = image.convert('RGB').resize((224, 224))
            images.append(np.array(image))
    np.save(save_path, np.array(images))


# Convert the labels to integer indices and save them as a NumPy array
def load_and_save_labels(labels, save_path):
    indices = np.array([label_to_index[label] for label in labels])
    np.save(save_path, indices)


# Save the images and labels of the training and test sets
train_image_save_path = '/path/to/save/train_images.npy'
test_image_save_path = '/path/to/save/test_images.npy'
train_label_save_path = '/path/to/save/train_labels.npy'
test_label_save_path = '/path/to/save/test_labels.npy'
load_and_save_images(train_image_paths, train_image_save_path)
load_and_save_images(test_image_paths, test_image_save_path)
load_and_save_labels(train_labels, train_label_save_path)
load_and_save_labels(test_labels, test_label_save_path)
```
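The code above expects the dataset directory to contain one sub-folder per class. It shuffles the samples randomly, splits them into a training set and a test set according to the specified ratio, and saves the images and labels as separate NumPy array files. The `load_and_save_images` function loads the images, resizes them to 224×224, and saves them as a NumPy array; the `load_and_save_labels` function converts the class names to integer indices and saves them as a NumPy array. Before running the code, replace `/path/to/dataset`, `/path/to/save/train_images.npy`, `/path/to/save/test_images.npy`, `/path/to/save/train_labels.npy`, and `/path/to/save/test_labels.npy` with your actual paths.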
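Once the four files are written, it can be worth loading them back to confirm that the images and labels line up. The following is only a minimal sketch of such a check: the helper name `summarize_split` is hypothetical (it is not part of the answer above), and the demo uses synthetic arrays in a temporary directory instead of a real dataset, assuming the labels were saved as non-negative integer indices.

```python
import os
import tempfile

import numpy as np


def summarize_split(image_file, label_file):
    """Load one saved split and return basic sanity-check statistics.

    Hypothetical helper for illustration only.
    """
    images = np.load(image_file)
    labels = np.load(label_file)
    # Every image needs exactly one label.
    if len(images) != len(labels):
        raise ValueError(f"{len(images)} images but {len(labels)} labels")
    return {
        "image_shape": images.shape,
        "image_dtype": str(images.dtype),
        # Number of samples per class index (assumes integer labels >= 0).
        "class_counts": np.bincount(labels).tolist(),
    }


# Demo on synthetic data so the sketch runs without a real dataset.
with tempfile.TemporaryDirectory() as tmp_dir:
    image_file = os.path.join(tmp_dir, "train_images.npy")
    label_file = os.path.join(tmp_dir, "train_labels.npy")
    np.save(image_file, np.zeros((4, 224, 224, 3), dtype=np.uint8))
    np.save(label_file, np.array([0, 1, 1, 2]))
    summary = summarize_split(image_file, label_file)

print(summary)
```

To check real output, call `summarize_split` with the paths of the saved training or test files. A class whose count is zero or very small in one split suggests the random split left that class underrepresented.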