-
@knoppmyth thank you for sharing your inference performance results between the CPU and iGPU on the AMD Ryzen 7 4700U using ROCm. It's interesting to see the comparison, and your detailed report could be valuable for others in the community who are considering similar hardware setups.

From your results, it's evident that the iGPU provides a significant speedup over the CPU for inference tasks with YOLOv8, even when only 1 GB is dedicated to the iGPU. This aligns with the general expectation that GPUs, including integrated ones, can offer better parallel processing capabilities for deep learning inference compared to CPUs.

It's unfortunate to hear about the system stability issues you encountered. Hardware stability is crucial for consistent performance testing and deployment. If you decide to continue testing with a different system in the future, the community would surely benefit from any additional insights you can provide.

Your contribution to the YOLOv8 community is appreciated, and we encourage you to share any further findings or questions you might have. Remember, the documentation at https://docs.ultralytics.com is always there to assist you with any additional information you might need regarding the use of YOLOv8. Good luck with your future testing, and we hope to see more from you! 🚀🤖
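If it helps others reproduce this kind of comparison, here is a rough sketch of timing the same model on the CPU and on the ROCm iGPU with the Ultralytics Python API. The model path, image folder, and device index below are placeholders, not values taken from the report above:

```python
import time
from ultralytics import YOLO

# "best.pt" and "foo/" are placeholder paths for a custom model and an image folder.
# device=0 is the first ROCm/HIP device, which PyTorch exposes through the CUDA API.
# Note: the first GPU run includes one-time warm-up overhead.
for device in ("cpu", 0):
    model = YOLO("best.pt")
    start = time.perf_counter()
    model.predict(source="foo/", device=device, verbose=False)
    elapsed = time.perf_counter() - start
    print(f"device={device}: {elapsed:.2f} s total")
```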
-
@glenn-jocher You're welcome. I never got the minisforum system. But when I get a new system, I do intend to share my results.
-
Thought I'd share this to show the performance difference between using the CPU and the iGPU with ROCm. The OS is Arch Linux, and the required software was installed in a virtual environment. The hardware is an Asus PN50 with 16 GB of RAM.
PyTorch was installed in the venv with:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
Followed by ultralytics with:
pip install ultralytics
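If anyone wants to sanity-check that the ROCm wheel actually sees the iGPU before running YOLO, something like this should work (just a quick check, not part of my original steps):

```python
import torch

print(torch.__version__)          # the ROCm wheel reports a "+rocm..." version suffix
print(torch.version.hip)          # non-None only on ROCm builds
print(torch.cuda.is_available())  # ROCm devices are exposed through the CUDA API
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should name the iGPU
```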
Before inferencing:
export HSA_OVERRIDE_GFX_VERSION=9.0.0
Inferenced with:
yolo predict model=~/rocm/best.pt source=~/rocm/foo/
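For anyone who prefers the Python API over the CLI, something like this should be roughly equivalent (a sketch; the environment variable has to be set before the GPU is first touched, and the paths are the ones from my commands above):

```python
import os

# Must be set before torch/ultralytics initialize the GPU, same as the export above.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "9.0.0"

from ultralytics import YOLO

model = YOLO(os.path.expanduser("~/rocm/best.pt"))
model.predict(source=os.path.expanduser("~/rocm/foo/"))
```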
Here is the output with 1 GB dedicated to the iGPU (note: the iGPU run is first, followed by the CPU):
Here is the performance with 4 GB dedicated to the iGPU:
Unfortunately, I won't be able to do any follow-up performance testing, as the system wasn't stable (it kept locking up), so I returned it (yes, the RAM was tested). I may get a minisforum to continue testing with ROCm.
This was a custom object detector. Since I don't want what I'm detecting to be known, it has been replaced with "foo".