How to batch-save images and bboxes with albumentations
Posted: 2023-08-09 11:05:38
The Albumentations library makes it straightforward to augment images in batches and save the results. Here is an example:
```python
import os
import xml.etree.ElementTree as ET

import cv2
import albumentations as A

# Define the augmentation pipeline; bbox_params tells Albumentations how to
# interpret the boxes and which extra field carries their labels
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
    A.Resize(height=512, width=512, p=1),
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['category_ids']))

# Input folders for images and annotations
img_folder = 'images/'
ann_folder = 'annotations/'

# Output folder
output_folder = 'output/'
os.makedirs(output_folder, exist_ok=True)

# Loop over every file in the image folder
for img_file in os.listdir(img_folder):
    # Build the image path and the matching annotation path
    stem = os.path.splitext(img_file)[0]
    img_path = os.path.join(img_folder, img_file)
    ann_file = os.path.join(ann_folder, stem + '.xml')
    img = cv2.imread(img_path)
    # Skip files that are not readable images or have no annotation
    if img is None or not os.path.exists(ann_file):
        continue

    # Read the annotation: one box and one label per <object>
    bboxes = []
    categories = []
    for obj in ET.parse(ann_file).getroot().iter('object'):
        box = obj.find('bndbox')
        # float() also accepts coordinates written as "12.0"
        bboxes.append([float(box.find(tag).text)
                       for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
        categories.append(obj.find('name').text)

    # Augment the image and transform its boxes together
    transformed = transform(image=img, bboxes=bboxes, category_ids=categories)
    transformed_img = transformed['image']
    transformed_bboxes = transformed['bboxes']
    transformed_categories = transformed['category_ids']

    # Save the augmented image and its annotation file
    output_img_file = os.path.join(output_folder, stem + '_aug.jpg')
    cv2.imwrite(output_img_file, transformed_img)
    output_ann_file = os.path.join(output_folder, stem + '_aug.xml')
    with open(output_ann_file, 'w', encoding='utf-8') as f:
        f.write('<annotation>\n')
        f.write('  <filename>{}</filename>\n'.format(stem + '_aug.jpg'))
        f.write('  <size>\n')
        f.write('    <width>{}</width>\n'.format(transformed_img.shape[1]))
        f.write('    <height>{}</height>\n'.format(transformed_img.shape[0]))
        f.write('    <depth>{}</depth>\n'.format(transformed_img.shape[2]))
        f.write('  </size>\n')
        for bbox, category in zip(transformed_bboxes, transformed_categories):
            # Transformed coordinates are floats; round them to whole pixels
            xmin, ymin, xmax, ymax = (int(round(v)) for v in bbox[:4])
            f.write('  <object>\n')
            f.write('    <name>{}</name>\n'.format(category))
            f.write('    <bndbox>\n')
            f.write('      <xmin>{}</xmin>\n'.format(xmin))
            f.write('      <ymin>{}</ymin>\n'.format(ymin))
            f.write('      <xmax>{}</xmax>\n'.format(xmax))
            f.write('      <ymax>{}</ymax>\n'.format(ymax))
            f.write('    </bndbox>\n')
            f.write('  </object>\n')
        f.write('</annotation>\n')
```
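In the code above, `A.Compose` defines the augmentation pipeline. The loop then walks through every file in the image folder, reads the image and its annotation, applies the augmentation to the image and its bounding boxes together, and saves the augmented image and annotation file to the output folder. OpenCV handles reading and writing the images, and the annotations are stored as XML.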
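The XML reading and writing can also be factored into two small helpers built on Python's standard `xml.etree.ElementTree`. The following is a minimal sketch, not part of Albumentations: the helper names `read_voc_boxes` and `write_voc_boxes` are hypothetical, and it assumes Pascal VOC style files in which each `<object>` holds one `<name>` and one `<bndbox>`.

```python
import xml.etree.ElementTree as ET

BOX_TAGS = ('xmin', 'ymin', 'xmax', 'ymax')


def read_voc_boxes(xml_path):
    """Hypothetical helper: return (bboxes, labels) from a VOC style file."""
    bboxes, labels = [], []
    for obj in ET.parse(xml_path).getroot().iter('object'):
        box = obj.find('bndbox')
        bboxes.append([float(box.find(tag).text) for tag in BOX_TAGS])
        labels.append(obj.find('name').text)
    return bboxes, labels


def write_voc_boxes(xml_path, filename, shape, bboxes, labels):
    """Hypothetical helper: write a minimal VOC style annotation file.

    shape is (height, width, depth), as in a NumPy image's .shape.
    """
    height, width, depth = shape
    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = filename
    size = ET.SubElement(root, 'size')
    for tag, value in (('width', width), ('height', height), ('depth', depth)):
        ET.SubElement(size, tag).text = str(value)
    for bbox, label in zip(bboxes, labels):
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = str(label)
        bndbox = ET.SubElement(obj, 'bndbox')
        for tag, value in zip(BOX_TAGS, bbox):
            # Round float coordinates to whole pixels before saving
            ET.SubElement(bndbox, tag).text = str(int(round(value)))
    # On Python 3.9+, ET.indent(root) could be called here for pretty output
    ET.ElementTree(root).write(xml_path, encoding='utf-8')
```

Inside the loop, `bboxes, categories = read_voc_boxes(ann_file)` would stand in for the parsing step, and `write_voc_boxes(output_ann_file, os.path.basename(output_img_file), transformed_img.shape, transformed_bboxes, transformed_categories)` for the writing step. Building the tree with `ElementTree` also escapes characters such as `&` in label names, which plain string formatting does not.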