Unable to Train on Multiple GPUs (4070 Ti + 4070) with YOLOv10n #16259
Replies: 2 comments
-
@monggus525 I'd strongly recommend against DDP training with different GPU models.
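If mixed-model DDP keeps failing, one pragmatic fallback is to pin the run to the stronger card. A minimal sketch using the Ultralytics Python API, reusing the dataset file and hyperparameters from the command in the question below (the assumption that device 0 is the 4070 Ti is mine):

```python
from ultralytics import YOLO

# Load the pretrained YOLOv10n weights
model = YOLO("yolov10n.pt")

# Train on a single GPU (assumed here that device 0 is the 4070 Ti)
model.train(
    data="dataset_fixed.yaml",  # dataset config from the original command
    epochs=100,
    imgsz=1600,
    batch=4,
    device=0,  # a single GPU avoids mixed-model DDP entirely
)
```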
-
👋 Hello @monggus525, thank you for reaching out and for using Ultralytics 🚀! This is an automated response, but rest assured, an Ultralytics engineer will assist you soon. To address your issue with multi-GPU training, let's tackle it step by step.

Understanding the Error

The error message you're experiencing,

Solutions to Consider

Additional Support

Join our community to get real-time support:

Feel free to provide additional error logs or configuration details to help us assist further. We're here to help! 😊
-
I am attempting to train YOLOv10n detection using a 4070 Ti (GPU0) and a 4070 (GPU1) together. When I specify device=0,1 in the training command, I encounter the following error message:
My training command was:

```bash
yolo train model=yolov10n.pt device=0,1 epochs=100 data=dataset_fixed.yaml imgsz=1600 batch=4
```
I have CUDA 12.4 installed, PyTorch 2.4.1, and the Ultralytics YOLOv8 environment set up. Despite having the latest versions, I still face this issue. Based on GPT's suggestion, I also installed pyuv, but the problem persists. Additionally, if I don't specify device=0,1 and just run the training, it uses only device=0 (a single GPU) rather than both GPUs.
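Before retrying, it may be worth confirming that PyTorch actually sees both cards; a quick check using standard torch.cuda calls (nothing Ultralytics-specific):

```python
import torch

# Both the 4070 Ti and the 4070 should be listed here; if only one
# appears, the issue is driver/CUDA visibility rather than Ultralytics
print(f"CUDA available: {torch.cuda.is_available()}")
for i in range(torch.cuda.device_count()):
    print(f"device {i}: {torch.cuda.get_device_name(i)}")
```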
Is it expected to run into errors when using GPUs of different models (in this case a 4070 Ti + a 4070) for multi-GPU training? If different GPU models cannot be mixed, should I use identical GPUs (e.g., 4070 Ti + 4070 Ti, or 4070 + 4070)?
If different GPU models are supported, how can I resolve this issue and successfully train on both GPUs?
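For reference, if you do end up with matched cards (or still want to attempt both GPUs), a sketch of the equivalent two-GPU run via the Ultralytics Python API is below; device accepts a list of indices, and the batch is split across the GPUs under DDP. This assumes the same dataset_fixed.yaml and hyperparameters as the command above:

```python
from ultralytics import YOLO

model = YOLO("yolov10n.pt")

# Two-GPU DDP run; `device` accepts a list of CUDA device indices.
# With mismatched cards (4070 Ti + 4070) each step waits on the slower
# GPU, and DDP initialization errors like the one above are a known risk.
model.train(
    data="dataset_fixed.yaml",
    epochs=100,
    imgsz=1600,
    batch=4,  # total batch size, split across the two GPUs
    device=[0, 1],
)
```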