Research: Design of llama-bench #10386

Open
1 of 5 tasks
jumbo-q opened this issue Nov 18, 2024 · 2 comments

jumbo-q commented Nov 18, 2024

Research Stage

  • Background Research (Let's try to avoid reinventing the wheel)
  • Hypothesis Formed (How do you think this will work and what is its effect?)
  • Strategy / Implementation Forming
  • Analysis of results
  • Debrief / Documentation (So people in the future can learn from us)

Previous existing literature and research

Dear maintainers,
How is llama-bench designed to measure performance?
What does "batch" mean in the benchmark, and what exactly is tested?
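For reference, my current understanding (which may well be wrong) is that "batch" here means the number of tokens submitted per decode call when processing the prompt. The toy Python sketch below only illustrates that interpretation; the function and the numbers are mine, not from llama-bench:

```python
# Toy illustration of my current (possibly wrong) understanding of "batch":
# a prompt of n_prompt tokens is fed to the model in chunks of at most
# n_batch tokens per decode call. All numbers here are made up.
def batch_sizes(n_prompt: int, n_batch: int) -> list[int]:
    """Return the number of tokens submitted in each decode call."""
    sizes = []
    remaining = n_prompt
    while remaining > 0:
        step = min(n_batch, remaining)
        sizes.append(step)
        remaining -= step
    return sizes

print(batch_sizes(512, 128))  # -> [128, 128, 128, 128]
```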

Hypothesis

No response

Implementation

No response

Analysis

No response

Relevant log output

No response

@JohannesGaessler (Collaborator) commented:


jumbo-q commented Nov 18, 2024

Thanks, I have seen it before.
I have two questions:

  1. Is the content of the batch input self-defined, similar to some other inference frameworks, or is there a specific dataset for it? Or is something else used?
  2. The output only provides the average and variance of the time per token. How is this time calculated? Is it the mean and variance over multiple runs? Also, which part of the execution is being timed, i.e. from which point to which point is the timing measured? (A minimal sketch of how I imagine that computation follows below.)
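To make question 2 concrete, here is a minimal Python sketch of what I assume "mean and variance over multiple runs" would look like. `run_once` is a dummy stand-in for one timed repetition; the names and numbers are mine, not from llama-bench:

```python
import statistics
import time

def run_once(n_tokens: int) -> float:
    """Placeholder for one timed repetition; returns elapsed seconds.
    In the real benchmark this would be the decode of n_tokens tokens."""
    t0 = time.perf_counter()
    time.sleep(0.01)  # stand-in for the actual model work
    return time.perf_counter() - t0

def summarize(n_tokens: int, repetitions: int) -> tuple[float, float]:
    """Mean and standard deviation of tokens/second over all repetitions."""
    rates = [n_tokens / run_once(n_tokens) for _ in range(repetitions)]
    return statistics.mean(rates), statistics.stdev(rates)

mean_tps, std_tps = summarize(n_tokens=128, repetitions=5)
print(f"{mean_tps:.2f} ± {std_tps:.2f} tokens/s")
```

Is this roughly what llama-bench does, or is the timing taken at a different point?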
