Implement a VAE for CIFAR data generation in PyTorch: report the loss on the training and validation sets, and show generated images from several different epochs (torch code)
In PyTorch, we can use a Variational Autoencoder (VAE) to generate images resembling the CIFAR-10 dataset. First, we need to install the necessary libraries (torch, torchvision, matplotlib) and import them:
```python
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torchvision.utils import save_image
import matplotlib.pyplot as plt

# Data preprocessing: ToTensor() scales pixels to [0, 1], which matches the
# Sigmoid decoder output and the binary cross-entropy reconstruction loss below
# (normalizing to [-1, 1] would make BCE targets invalid)
transform = transforms.ToTensor()

train_dataset = dsets.CIFAR10(root='./data', train=True, download=True, transform=transform)
test_dataset = dsets.CIFAR10(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)
```
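As an optional sanity check (not part of the original recipe), you can inspect one batch before training to confirm the tensor shape and pixel range:

```python
# Optional sanity check: confirm batch shape and pixel range before training
images, _ = next(iter(train_loader))
print(images.shape)                              # torch.Size([64, 3, 32, 32])
print(images.min().item(), images.max().item())  # values lie in [0, 1] after ToTensor()
```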
Next, we build a simple convolutional VAE. Two details matter here: the encoder's final linear layer outputs 2 * latent_dim values (the mean and log-variance of the approximate posterior, split apart in forward), and the decoder uses nn.Unflatten to reshape the flat vector back into 8x8 feature maps before the transposed convolutions:
```python
latent_dim = 16  # dimensionality of the latent vector z

class VAE(nn.Module):
    def __init__(self, latent_dim):
        super().__init__()
        self.latent_dim = latent_dim
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=4, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(True),
            nn.Conv2d(16, 32, kernel_size=4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(True),
            nn.Flatten(start_dim=1),
            nn.Linear(32 * 8 * 8, 256),
            nn.ReLU(True),
            nn.Linear(256, 128),
            nn.ReLU(True),
            nn.Linear(128, 2 * latent_dim)  # concatenated [mean, logvar]
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(True),
            nn.Linear(128, 256),
            nn.ReLU(True),
            nn.Linear(256, 32 * 8 * 8),
            nn.ReLU(True),
            nn.Unflatten(1, (32, 8, 8)),  # reshape flat vector back to feature maps
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(True),
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.Sigmoid()  # pixel values in [0, 1], matching the ToTensor() inputs
        )

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps with eps ~ N(0, I), keeping gradients w.r.t. mu/logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        encoding = self.encoder(x)
        mean, logvar = encoding.split([self.latent_dim, self.latent_dim], dim=-1)
        z = self.reparameterize(mean, logvar)
        reconstruction = self.decoder(z)
        return reconstruction, mean, logvar

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = VAE(latent_dim).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```
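Before training, it can help to verify the tensor shapes with a dummy forward pass (an optional check, not part of the original recipe):

```python
# Optional: verify output shapes with a dummy batch of two images
with torch.no_grad():
    dummy = torch.zeros(2, 3, 32, 32, device=device)
    recon, mu, logvar = model(dummy)
    print(recon.shape, mu.shape, logvar.shape)
    # Expected: torch.Size([2, 3, 32, 32]) torch.Size([2, 16]) torch.Size([2, 16])
```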
Then we define the loss function (reconstruction binary cross-entropy plus the KL divergence term), the training loop with a per-epoch validation pass, and the generation and saving of sample images:
```python
def loss_function(recon_x, x, mean, logvar):
    # Reconstruction term: pixel-wise binary cross-entropy
    # (the decoder already applies Sigmoid, so plain BCE is used, not the logits version)
    BCE = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # KL divergence between q(z|x) = N(mean, exp(logvar)) and the prior N(0, I)
    KLD = -0.5 * torch.sum(1 + logvar - mean.pow(2) - logvar.exp())
    return BCE + KLD

# Training loop with a validation pass after every epoch
epochs = 10
train_losses, val_losses = [], []
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for images, _ in train_loader:
        images = images.to(device)
        optimizer.zero_grad()
        recon_batch, mu, logvar = model(images)
        loss = loss_function(recon_batch, images, mu, logvar)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    train_losses.append(train_loss / len(train_loader.dataset))

    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        # Validation loss on the held-out test set
        for images, _ in test_loader:
            images = images.to(device)
            recon_batch, mu, logvar = model(images)
            val_loss += loss_function(recon_batch, images, mu, logvar).item()
        val_losses.append(val_loss / len(test_loader.dataset))

        # At the end of each epoch, decode random latent vectors and save a grid of samples
        samples = torch.randn(64, latent_dim).to(device)
        gen_imgs = model.decoder(samples).cpu()
        save_image(gen_imgs, f"gen_images_epoch_{epoch}.png", nrow=8)

    print(f"Epoch [{epoch+1}/{epochs}], "
          f"Train Loss: {train_losses[-1]:.2f}, Val Loss: {val_losses[-1]:.2f}")
```
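The title also asks for the losses on the training and validation sets, and matplotlib is already imported. A minimal sketch for plotting the per-epoch averages recorded in train_losses and val_losses above (the file name loss_curves.png is arbitrary):

```python
# Plot average per-sample loss on the training and validation sets
plt.figure()
plt.plot(range(1, epochs + 1), train_losses, label='train')
plt.plot(range(1, epochs + 1), val_losses, label='validation')
plt.xlabel('epoch')
plt.ylabel('loss per sample')
plt.legend()
plt.savefig('loss_curves.png')
plt.show()
```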
In this example, device is set to the GPU when one is available and to the CPU otherwise. At the end of every epoch we save an 8x8 grid of 64 generated images (gen_images_epoch_0.png, gen_images_epoch_1.png, ...), which makes it easy to compare the generation quality at different stages of training.
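To compare several of the saved grids side by side after training, a small optional sketch (the file names follow the pattern used above; the choice of epochs to display is arbitrary):

```python
# Display the saved sample grids from a few different epochs side by side
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, ep in zip(axes, [0, 4, 9]):  # arbitrary choice of epochs to compare
    ax.imshow(plt.imread(f"gen_images_epoch_{ep}.png"))
    ax.set_title(f"epoch {ep + 1}")
    ax.axis('off')
plt.show()
```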