I'm currently working on a job where we need to process 400 hours of video every day on an RTX 4090. I was able to decode 1 hour of video in 30 seconds using PyNvVideoCodec and CV-CUDA.
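For anyone trying to reproduce the decode numbers, the pipeline shape is roughly the one below. This is a minimal sketch, assuming the PyNvVideoCodec demuxer/decoder helpers and a local input.mp4; exact keyword names can differ slightly between package versions.

```python
import PyNvVideoCodec as nvc

# Demux and decode entirely on the GPU; decoded surfaces stay in VRAM.
demuxer = nvc.CreateDemuxer(filename="input.mp4")        # example path
decoder = nvc.CreateDecoder(gpuid=0,
                            codec=demuxer.GetNvCodecId(),
                            cudacontext=0,                # 0 -> use the current context
                            cudastream=0,
                            usedevicememory=True)         # keep frames in device memory

frame_count = 0
for packet in demuxer:
    for frame in decoder.Decode(packet):
        # `frame` stays on the GPU; recent builds expose DLPack /
        # __cuda_array_interface__, so it can be handed to CV-CUDA
        # preprocessing without a CPU round trip.
        frame_count += 1

print(f"decoded {frame_count} frames")
```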
Using NVIDIA's provided object-detection sample with PeopleNet, I was able to run 1 hour of video in 3 minutes. Now I need to scale this in my cloud, and I have a few questions:
Is it worth spawning subprocesses to decode video and run inference using the same CUDA context, or should each use its own context?
What is the safest approach: running multiple separate Python programs to avoid concurrency issues, or a single Python process with threading/multiprocessing (which I suspect is a bad idea due to GPU handling)? A minimal sketch of the multi-process layout I have in mind is below.
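For reference, this is a minimal sketch of the one-process-per-pipeline layout the question describes: each worker initializes CUDA only inside its own process (and so gets its own context), and file paths are distributed over a queue. process_video() is a hypothetical placeholder for the per-file decode + inference pipeline.

```python
import multiprocessing as mp

def process_video(path: str, gpu_id: int) -> None:
    # Placeholder for the real pipeline: decode with PyNvVideoCodec,
    # preprocess with CV-CUDA, run the TensorRT engine.
    ...

def worker(gpu_id: int, job_queue: mp.Queue) -> None:
    # Import/initialize CUDA libraries here, inside the child process,
    # so each worker ends up with its own context on the same GPU.
    import PyNvVideoCodec as nvc  # noqa: F401
    while True:
        path = job_queue.get()
        if path is None:               # sentinel -> shut down
            break
        process_video(path, gpu_id)

if __name__ == "__main__":
    mp.set_start_method("spawn")       # required when children use CUDA
    jobs: mp.Queue = mp.Queue()
    workers = [mp.Process(target=worker, args=(0, jobs)) for _ in range(4)]
    for w in workers:
        w.start()
    for path in ["video_000.mp4", "video_001.mp4"]:   # example inputs
        jobs.put(path)
    for _ in workers:
        jobs.put(None)
    for w in workers:
        w.join()
```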
I'm trying to avoid CPU-GPU transfers as much as possible. My next step is tracking those people with ByteTrack, which I will try to port entirely to the GPU.
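As a rough illustration of what keeping the tracker on the GPU can look like, here is the IoU cost matrix from ByteTrack's association step computed with torch tensors that never leave device memory. This is not the ByteTrack API, just the same math done on the GPU.

```python
import torch

def iou_matrix(tracks: torch.Tensor, dets: torch.Tensor) -> torch.Tensor:
    """IoU between every track and every detection.

    tracks: (N, 4) and dets: (M, 4) boxes in (x1, y1, x2, y2) format,
    both already resident on the CUDA device.
    """
    x1 = torch.max(tracks[:, None, 0], dets[None, :, 0])
    y1 = torch.max(tracks[:, None, 1], dets[None, :, 1])
    x2 = torch.min(tracks[:, None, 2], dets[None, :, 2])
    y2 = torch.min(tracks[:, None, 3], dets[None, :, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    return inter / (area_t[:, None] + area_d[None, :] - inter)
```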
Tips for those who are trying to run the sample:
You need the TensorRT 8.6.x tar file from NVIDIA.
tao-converter can still be downloaded but is deprecated; you also need to chmod +x tao-converter, and it only works with TensorRT 8.x.x.
Also install the Python wheel (.whl) shipped inside the TensorRT tar file.
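Once the wheel is installed and tao-converter has produced an engine file, loading it from Python is just the standard TensorRT runtime calls. A minimal sketch, assuming the engine was saved as peoplenet.engine (the file name is an example):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(TRT_LOGGER)

# Deserialize the engine that tao-converter produced.
with open("peoplenet.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# One execution context per worker process; inference then binds device
# pointers, so preprocessed CV-CUDA buffers never have to leave the GPU.
context = engine.create_execution_context()
```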