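Using Python to split an image dataset into 10 classes, with the same number of training and test images in each class and an image size of 32×32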
Posted: 2024-05-11 20:14:17
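Below is a simple code example that uses the Keras library to load the CIFAR-10 dataset (10 classes of 32x32-pixel images), preprocess it, and split it class by class into training and test sets: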
```python
import numpy as np
from keras.datasets import cifar10
from keras.utils import to_categorical
# load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# convert labels to one-hot encoding
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
# preprocess data by scaling pixel values to [0, 1] and converting to float32
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
# group sample indices by class (y_train is one-hot encoded at this point)
class_indices = [np.where(y_train[:, i] == 1)[0] for i in range(10)]
train_indices = []
test_indices = []
for indices in class_indices:
    # shuffle within the class, then keep 80% for training and 20% for testing
    np.random.shuffle(indices)
    split_point = int(0.8 * len(indices))
    train_indices += indices[:split_point].tolist()
    test_indices += indices[split_point:].tolist()
# create the new training and testing datasets; both index into the original
# training arrays, because the indices were computed from y_train
x_train_new = x_train[train_indices]
y_train_new = y_train[train_indices]
x_test_new = x_train[test_indices]
y_test_new = y_train[test_indices]
```
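In this example, we first load the CIFAR-10 dataset and convert the labels to one-hot encoding. We then preprocess the image data by converting it to float32 and scaling the pixel values to the range [0, 1]. Next, we group the training images into 10 subsets by class, shuffle each subset, and use 80% of it as training data and 20% as test data. Because CIFAR-10 has 5,000 training images per class, every class contributes 4,000 images to the new training set and 1,000 to the new test set. Finally, we build the new training and test datasets from the selected indices, so both are drawn from the original CIFAR-10 training images.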
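If the goal is for every class to have exactly as many training images as test images, the same per-class shuffling idea can be sketched with plain NumPy. This is a minimal sketch, not part of the original answer: the helper name `split_per_class`, its parameters `n_train`, `n_test` and `seed`, and the synthetic stand-in data are all assumptions made for illustration, and it expects integer labels rather than one-hot vectors.

```python
import numpy as np

def split_per_class(x, y, n_train, n_test, num_classes=10, seed=0):
    """Hypothetical helper: take n_train + n_test samples from every class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y).reshape(-1)  # accept labels shaped (N,) or (N, 1)
    train_idx, test_idx = [], []
    for c in range(num_classes):
        # shuffle the indices of class c, then slice off the two parts
        idx = rng.permutation(np.flatnonzero(y == c))
        if len(idx) < n_train + n_test:
            raise ValueError(f"class {c} has only {len(idx)} samples")
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:n_train + n_test])
    train_idx = np.array(train_idx)
    test_idx = np.array(test_idx)
    return (x[train_idx], y[train_idx]), (x[test_idx], y[test_idx])

# synthetic stand-in for a 10-class set of 32x32 RGB images (50 per class)
x = np.zeros((500, 32, 32, 3), dtype=np.float32)
y = np.repeat(np.arange(10), 50)

(x_tr, y_tr), (x_te, y_te) = split_per_class(x, y, n_train=25, n_test=25)

# per-class counts in each split, useful as a sanity check
train_counts = np.bincount(y_tr, minlength=10)
test_counts = np.bincount(y_te, minlength=10)
```

With real CIFAR-10 data, passing the integer `y_train` (before `to_categorical`) should work the same way, since the helper flattens labels of shape (N, 1). Setting `n_train` equal to `n_test` gives the balanced split described in the question, and one-hot encoding can be applied to the returned labels afterwards.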