Design and implement a good GPU utilization metric #93

Fizzbb · 2022-02-02T22:41:30Z

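The default NVIDIA (NVML) utilization value is not a perfect measure, because it only records whether any kernel was running, not how much of the GPU that kernel used. NVML defines it as: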
"Percent of time over the past sample period during which one or more kernels was executing on the GPU."
https://docs.nvidia.com/deploy/nvml-api/structnvmlUtilization__t.html#structnvmlUtilization__t
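To make the limitation concrete, here is a minimal sketch that applies the quoted definition to a made-up kernel timeline and compares it with an SM-weighted figure. The function names, the `(start, end, sms_used)` tuple layout, and the 80-SM device are all assumptions for illustration; none of this is NVML's actual implementation, and the SM-weighted number is just one possible alternative, not a proposed final metric.

```python
def busy_time_utilization(kernels, period):
    """Percent of `period` during which at least one kernel was running.

    Mirrors the quoted NVML definition. `kernels` is a list of
    (start, end, sms_used) tuples in seconds (hypothetical layout).
    """
    # Clip each interval to the sample period and sort by start time.
    intervals = sorted((max(0.0, s), min(period, e)) for s, e, _ in kernels)
    busy = 0.0
    cur_start = cur_end = None
    for s, e in intervals:
        if e <= s:
            continue  # interval lies outside the sample period
        if cur_end is None or s > cur_end:
            # Gap before this kernel: close the previous merged interval.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping kernels count once, so extend the merged interval.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        busy += cur_end - cur_start
    return 100.0 * busy / period


def sm_weighted_utilization(kernels, period, total_sms):
    """Percent of available SM-time actually occupied.

    Assumes each kernel holds a fixed number of SMs for its whole run and
    that concurrent kernels never exceed `total_sms` together.
    """
    sm_time = 0.0
    for s, e, sms_used in kernels:
        s, e = max(0.0, s), min(period, e)
        if e > s:
            sm_time += (e - s) * sms_used
    return 100.0 * sm_time / (period * total_sms)


# Hypothetical workload: one tiny kernel occupies a single SM for the
# entire 1-second sample period on an 80-SM GPU.
tiny_kernel = [(0.0, 1.0, 1)]
nvml_style = busy_time_utilization(tiny_kernel, period=1.0)
sm_style = sm_weighted_utilization(tiny_kernel, period=1.0, total_sms=80)
print(f"busy-time: {nvml_style:.2f}%  SM-weighted: {sm_style:.2f}%")
```

Under these assumptions the busy-time figure reports a fully utilized GPU while only one of 80 SMs is in use, which is the gap this issue is about.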

Fizzbb · 2022-02-18T16:54:30Z

Some research papers measure utilization indirectly, e.g., by how many jobs can be served, or by the largest batch size that can be served within the SLO, before and after a change. Metrics in the throughput category seem to make sense.
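A rough sketch of the batch-size-within-SLO idea, assuming we already have measured latencies per batch size. The function names, the latency tables, and the 50 ms SLO are hypothetical placeholders, not numbers from any paper or from this project.

```python
def max_batch_within_slo(latency_ms_by_batch, slo_ms):
    """Largest batch size whose measured latency meets the SLO, or None."""
    ok = [b for b, lat in latency_ms_by_batch.items() if lat <= slo_ms]
    return max(ok) if ok else None


def throughput_per_s(batch_size, latency_ms):
    """Requests served per second when running batches back to back."""
    return batch_size / (latency_ms / 1000.0)


# Hypothetical measurements: batch size -> latency in milliseconds,
# taken before and after some scheduling or sharing change.
before = {1: 12.0, 2: 20.0, 4: 40.0, 8: 90.0}
after = {1: 10.0, 2: 16.0, 4: 25.0, 8: 50.0, 16: 110.0}
slo_ms = 50.0  # assumed latency SLO

for name, table in (("before", before), ("after", after)):
    best = max_batch_within_slo(table, slo_ms)
    if best is None:
        print(f"{name}: no batch size meets the SLO")
    else:
        rate = throughput_per_s(best, table[best])
        print(f"{name}: batch={best}, throughput={rate:.1f} req/s")
```

The ratio of the two throughput values gives a single before/after number that reflects useful work done, rather than time spent with a kernel resident.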
