torch.distributed
torch.distributed is PyTorch's module for distributed training. It provides a set of functions and classes for training models in parallel across multiple processes and multiple machines. Among them is torch.distributed.all_gather(), which collects a tensor from every process in the group into a single list. [1][2][3]
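A minimal sketch of this API, assuming a single-process group on the gloo backend so it runs on one machine without GPUs; the MASTER_ADDR/MASTER_PORT values are placeholder choices for local testing:

```python
import os
import torch
import torch.distributed as dist

def main():
    # Placeholder rendezvous settings for a local, single-process group.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)

    rank = dist.get_rank()
    world_size = dist.get_world_size()

    # Each process contributes one tensor; all_gather fills the output
    # list with every process's tensor, ordered by rank.
    local = torch.tensor([float(rank)])
    gathered = [torch.zeros_like(local) for _ in range(world_size)]
    dist.all_gather(gathered, local)
    print(f"rank {rank} gathered: {gathered}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

In a real multi-GPU job you would instead launch one process per device (e.g. with `torchrun --nproc_per_node=N script.py`), which supplies the rank and world size through environment variables, and typically use the nccl backend on GPUs.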
#### References
- [1] [torch.distributed多卡/多GPU/分布式DPP(一) —— torch.distributed.launch & all_gather & init_process_...](https://blog.csdn.net/hxxjxw/article/details/116294350)
- [2][3] [torch.distributed使用总结 单机多卡并行训练demo](https://blog.csdn.net/weixin_48759194/article/details/127364625)