train_set = TrainDatasetFromFolder('/root/autodl-tmp/srpad_project/data/HR', NAME, crop_size=CROP_SIZE, upscale_factor=UPSCALE_FACTOR)
val_set = ValDatasetFromFolder('/root/autodl-tmp/srpad_project/data/HR', NAME, crop_size=CROP_SIZE, upscale_factor=UPSCALE_FACTOR)  # load the training and validation images
train_loader = DataLoader(dataset=train_set, num_workers=4, batch_size=16, shuffle=True)
val_loader = DataLoader(dataset=val_set, num_workers=4, batch_size=1, shuffle=False)
net = Net().cuda()  # initialize the network
criterion = torch.nn.MSELoss().cuda()  # loss function
optimizer = torch.optim.Adam([paras for paras in net.parameters() if paras.requires_grad], lr=0.001)  # optimizer
t = 5  # number of warm-up epochs
T = NUM_EPOCHS
n_t = 0.5
# Learning-rate schedule: the multiplier rises linearly first, then falls along a cosine curve, never below 0.1
lambda1 = lambda epoch: (
    (0.9 * epoch / t + 0.1) if epoch < t
    else 0.1 if n_t * (1 + math.cos(math.pi * (epoch - t) / (T - t))) < 0.1
    else n_t * (1 + math.cos(math.pi * (epoch - t) / (T - t)))
)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)
results = {'loss': [], 'psnr': [], 'ssim': [], 'bic_psnr': [], 'bic_ssim': [], 'val_loss': []}
for epoch in range(1, NUM_EPOCHS + 1):  # start of the epoch loop
    train_bar = tqdm(train_loader)
    running_results = {'batch_sizes': 0, 'loss': 0}
    net.train()  # switch the network to training mode, then enter the batch loop
    for data, target in train_bar:
        batch_size = data.size(0)
        running_results['batch_sizes'] += batch_size
        inputs = Variable(data).cuda()  # wrap as Variable and move to the GPU
        gt = Variable(target).cuda()
        output = net(inputs)  # network output
Posted: 2024-04-26 20:23:51
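This code trains a neural network on the training and validation images. TrainDatasetFromFolder and ValDatasetFromFolder load the images from the folder, taking the crop size and the super-resolution upscale factor as parameters. DataLoader then serves the images in batches: batch size 16 with shuffling for training, batch size 1 without shuffling for validation. Next, the network, the loss function (MSE) and the Adam optimizer are initialized, and LambdaLR sets the learning-rate schedule. The schedule acts on the optimizer's learning rate, not on the loss function: the rate ramps up linearly during the first t = 5 epochs and then decays along a cosine curve, with the multiplier never dropping below 0.1. Before the loop starts, a results dictionary is initialized, and in each epoch tqdm displays the training progress. For each batch, the data and the target are wrapped in Variable and moved to the GPU with .cuda(), and the input is passed through the network to obtain the output. Since PyTorch 0.4 the Variable wrapper is no longer needed, so tensors can be moved to the GPU directly.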
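The learning-rate rule in lambda1 can be written out as a plain function to see what multiplier it produces at each epoch. This is a minimal sketch, not the author's code: the function name lr_multiplier is made up for illustration, and total_epochs=100 is an assumed value because NUM_EPOCHS is not given in the question. LambdaLR multiplies the base learning rate (0.001 here) by the returned value.

```python
import math


def lr_multiplier(epoch, warmup_epochs=5, total_epochs=100, half=0.5, floor=0.1):
    """Same rule as lambda1 in the question, written as a regular function.

    total_epochs=100 is an assumption; the question uses NUM_EPOCHS,
    whose value is not shown.
    """
    if epoch < warmup_epochs:
        # Linear warm-up from 0.1 towards 1.0
        return 0.9 * epoch / warmup_epochs + 0.1
    # Cosine decay from 1.0 down towards 0.0 over the remaining epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    cosine = half * (1 + math.cos(math.pi * progress))
    # The multiplier is never allowed to fall below the floor
    return max(cosine, floor)


if __name__ == "__main__":
    base_lr = 0.001
    for epoch in (0, 2, 5, 50, 100):
        print(epoch, base_lr * lr_multiplier(epoch))
```

Writing the floor as max(cosine, floor) gives the same result as the nested conditional in the original lambda and is easier to read.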
Related questions
File "train_rcnn.py", line 195, in <module>
    model = PointRCNN(num_classes=train_loader.dataset.num_class, use_xyz=True, mode='TRAIN')
File "/root/autodl-tmp/project/tools/../lib/net/point_rcnn.py", line 15, in __init__
    self.rpn = RPN(use_xyz=use_xyz, mode=mode)
File "/root/autodl-tmp/project/tools/../lib/net/rpn.py", line 17, in __init__
    self.backbone_net = MODEL.get_model(input_channels=int(cfg.RPN.USE_INTENSITY), use_xyz=use_xyz)
File "/root/autodl-tmp/project/tools/../lib/net/pointnet2_msg.py", line 97, in get_model
    return Pointnet2MSG(input_channels=input_channels, use_xyz=use_xyz)
File "/root/autodl-tmp/project/tools/../lib/net/pointnet2_msg.py", line 125, in __init__
    SelfAttention(channel_out=channel_out)
TypeError: __init__() got an unexpected keyword argument 'channel_out'
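The error means that at line 125 of "/root/autodl-tmp/project/tools/../lib/net/pointnet2_msg.py", SelfAttention is constructed with channel_out=channel_out, but its __init__ does not accept a keyword argument with that name. This usually happens after a version update or a code change that renamed or removed the parameter. Open the definition of SelfAttention, check which parameters its __init__ actually declares, and make the call at line 125 match them.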
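One quick way to see which keyword arguments a class accepts, without reading through the source, is the standard-library inspect module. The sketch below uses a stand-in SelfAttention class whose single parameter, channels, is purely hypothetical; the real class in the project may declare something different. The helper names are also made up for illustration.

```python
import inspect


class SelfAttention:
    """Hypothetical stand-in; the real class's parameters are not known here."""

    def __init__(self, channels):
        self.channels = channels


def accepted_kwargs(cls):
    """Return the parameter names declared by cls.__init__, excluding self."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


def accepts(cls, keyword):
    """Return True if cls(...) can be called with the given keyword argument."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any keyword
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    param = params.get(keyword)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return param is not None and param.kind in keyword_kinds


if __name__ == "__main__":
    print(accepted_kwargs(SelfAttention))
    print(accepts(SelfAttention, "channel_out"))
```

Running the same two helpers against the project's actual SelfAttention would show the name that the call at line 125 should use.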
Traceback (most recent call last):
  File "/root/autodl-tmp/ultralytics-main/run.py", line 7, in <module>
    model.train(data='/root/autodl-tmp/ultralytics-main/traindata3/data.yaml')
  File "/root/autodl-tmp/ultralytics-main/ultralytics/yolo/engine/model.py", line 371, in train
    self.trainer.train()
  File "/root/autodl-tmp/ultralytics-main/ultralytics/yolo/engine/trainer.py", line 192, in train
    self._do_train(world_size)
  File "/root/autodl-tmp/ultralytics-main/ultralytics/yolo/engine/trainer.py", line 328, in _do_train
    preds = self.model(batch['img'])
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/ultralytics-main/ultralytics/nn/tasks.py", line 219, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "/root/autodl-tmp/ultralytics-main/ultralytics/nn/tasks.py", line 70, in _forward_once
    x = m(x)  # run
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/ultralytics-main/ultralytics/nn/modules/block.py", line 183, in forward
    return self.cv2(torch.cat(y, 1))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 23.65 GiB total capacity; 6.18 GiB already allocated; 98.56 MiB free; 6.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
terminate called without an active exception
Aborted (core dumped)
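This is a CUDA out-of-memory error: the request for 100.00 MiB failed because only 98.56 MiB was free on GPU 0. The usual fixes are to reduce the batch size or to use a GPU with more memory. Note that PyTorch had reserved only 6.21 GiB of the 23.65 GiB total, so most of the card appears to be occupied by something else; it is worth checking with nvidia-smi whether another process is holding GPU memory. If fragmentation is suspected, PyTorch's memory allocator can also be tuned, for example by setting max_split_size_mb through the PYTORCH_CUDA_ALLOC_CONF environment variable.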
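The figures in the error message can be checked with a little arithmetic, and the allocator option mentioned in the message can be set from Python. This is a rough sketch under stated assumptions: the helper name memory_outside_pytorch_gib is made up, and the value 128 for max_split_size_mb is only an illustrative choice, not a recommendation. The environment variable has to be set before the process makes its first CUDA allocation.

```python
import os


def memory_outside_pytorch_gib(total_gib, reserved_gib, free_mib):
    """Estimate GPU memory that is neither free nor reserved by this PyTorch process."""
    return total_gib - reserved_gib - free_mib / 1024.0


# Numbers copied from the error message above
outside = memory_outside_pytorch_gib(total_gib=23.65, reserved_gib=6.21, free_mib=98.56)
print(f"Memory held outside this process: about {outside:.2f} GiB")

# Assumption: 128 is an illustrative value. Set this before importing code
# that allocates CUDA memory, otherwise the allocator will not pick it up.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
```

With these numbers the estimate comes to roughly 17.3 GiB, which is why looking for another process on the same GPU is a sensible first step before tuning the allocator.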