File "/home/zhxk/.local/bin/yolo", line 8, in <module> sys.exit(entrypoint()) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/cfg/__init__.py", line 249, in entrypoint getattr(model, mode)(verbose=True, **overrides) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/model.py", line 207, in train self.trainer.train() File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/trainer.py", line 183, in train self._do_train(int(os.getenv("RANK", -1)), world_size) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/trainer.py", line 302, in _do_train self.loss, self.loss_items = self.criterion(preds, batch) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/v8/detect/train.py", line 76, in criterion return self.compute_loss(preds, batch) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/v8/detect/train.py", line 174, in __call__ _, target_bboxes, target_scores, fg_mask, _ = self.assigner( File "/home/zhxk/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/zhxk/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context return func(*args, **kwargs) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/utils/tal.py", line 97, in forward target_gt_idx, fg_mask, mask_pos = select_highest_overlaps(mask_pos, overlaps, self.n_max_boxes) File "/home/zhxk/.local/lib/python3.8/site-packages/ultralytics/yolo/utils/tal.py", line 44, in select_highest_overlaps if fg_mask.max() > 1: # one anchor is assigned to multiple gt_bboxes RuntimeError: CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Sentry is attempting to send 2 pending error messages Waiting up to 2 seconds Press Ctrl-C to quit THCudaCheck FAIL file=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=710 : device-side assert triggered
Based on the error message you provided, this is a CUDA-related failure: a CUDA kernel hit a problem and triggered a device-side assert. This is typically caused by invalid arguments or data types being passed to the kernel, or by issues such as running out of GPU memory.
To debug it, you can try the following steps:
1. Make sure your CUDA version is compatible with the deep learning framework you are using, and that the CUDA driver and its dependencies are installed correctly.
2. Check that your code and model configuration are correct, especially the CUDA-related parts: verify that the device and dtype are set properly. For this particular assert, which fires inside the label assigner, it is also worth confirming that every class index in your dataset labels is smaller than the number of classes (`nc`) declared in your data YAML, since an out-of-range label is a common trigger (see the label-check sketch after this list).
3. If your GPU is running out of memory, try reducing the batch size or the model size, or consider using a more capable GPU (the training sketch after this list shows how a smaller batch size can be passed).
4. Try setting the environment variable `CUDA_LAUNCH_BLOCKING` to 1 so that CUDA kernels are launched synchronously and the program stops at the call that actually fails, giving you a more accurate stack trace (see the sketch after this list).
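For step 2, here is a minimal sketch of a label sanity check. The `nc` value and the label directory are placeholders you would replace with your own dataset settings; it simply scans YOLO-format label files and flags any class index that falls outside `[0, nc)`, which is one known way to trip this device-side assert.

```python
from pathlib import Path

nc = 80                                             # assumed: number of classes from your data YAML
label_dir = Path("datasets/mydata/labels/train")    # hypothetical label directory

for label_file in label_dir.glob("*.txt"):
    for line_no, line in enumerate(label_file.read_text().splitlines(), 1):
        parts = line.split()
        if not parts:
            continue
        cls = int(float(parts[0]))                  # first column of each row is the class index
        if cls < 0 or cls >= nc:
            print(f"{label_file}:{line_no} has out-of-range class index {cls}")
```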
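For steps 3 and 4 together, the sketch below uses the ultralytics Python API rather than the `yolo` CLI; the model weights, data YAML, and `batch=8` value are assumptions for illustration. `CUDA_LAUNCH_BLOCKING` is set before the framework (and thus CUDA) is imported, because it only takes effect if it is in the environment before CUDA initializes.

```python
import os

# Force synchronous CUDA kernel launches so the failing call is reported where it
# actually happens (step 4); must be set before any CUDA work starts.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

from ultralytics import YOLO

# Hypothetical weights and data YAML; batch=8 is a deliberately small batch size (step 3).
model = YOLO("yolov8n.pt")
model.train(data="mydata.yaml", epochs=100, batch=8, device=0)
```

The same effect can be had on the command line by prefixing the variable to your existing command, e.g. `CUDA_LAUNCH_BLOCKING=1 yolo ... batch=8` (argument names may differ slightly between ultralytics versions).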
If none of these steps resolve the problem, consider posting the full error message and more details about your setup in the relevant community or forum to ask for help.