Training gets stuck in DistributedDataParallel mode, but DataParallel mode works
Created by: wuqi930907
Hi, I have built a Docker image according to the Dockerfile.
However, my training gets stuck when I run: "python -m torch.distributed.launch --nproc_per_node 2 train.py --batch-size 32 --data test.yaml --weights pretrained_model/yolov5l.pt". My log follows.


Then I changed the command to "python train.py --batch-size 32 --data test.yaml --weights pretrained_model/yolov5m.pt --device 0,1" (which uses DataParallel instead), and everything runs normally.
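Not a definitive fix, but since the hang happens only in the multi-process DistributedDataParallel path, a common first diagnostic step is to enable NCCL's own logging (NCCL_DEBUG and NCCL_P2P_DISABLE are standard NCCL environment variables) before relaunching. This is a sketch reusing the paths from the original command; whether P2P is the culprit on this host is an assumption:

```shell
# Make NCCL print what it is doing, so the hang point shows up in the log.
export NCCL_DEBUG=INFO
# Assumption: on some multi-GPU hosts, hangs come from peer-to-peer transfers;
# disabling P2P is a common workaround to test that hypothesis.
export NCCL_P2P_DISABLE=1
python -m torch.distributed.launch --nproc_per_node 2 train.py \
  --batch-size 32 --data test.yaml --weights pretrained_model/yolov5l.pt
```

If the run then progresses (or the NCCL log shows where it stalls), that narrows the problem to the GPU interconnect rather than the training code itself.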
