
Research: bench of the llamacpp #10405

Open
1 of 5 tasks
jumbo-q opened this issue Nov 19, 2024 · 0 comments

Comments


jumbo-q commented Nov 19, 2024

Research Stage

  • Background Research (Let's try to avoid reinventing the wheel)
  • Hypothesis Formed (How do you think this will work, and what will its effect be?)
  • Strategy / Implementation Forming
  • Analysis of results
  • Debrief / Documentation (So people in the future can learn from us)

Previous existing literature and research

There are two questions:

1. Is the content of the batch input self-defined (as in some other inference frameworks), is it drawn from a specific dataset, or is it produced by some other operation?
2. The output only reports the average and variance of the time per token. How is this time calculated? Is it the mean and variance over multiple runs? Also, what part of the execution is being timed, i.e. from which point to which point is the timing measured? (See the sketch below for the interpretation being asked about.)
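To make the second question concrete, here is a minimal, self-contained sketch of the interpretation being asked about: time N repeated passes and report the mean ± sample standard deviation of tokens per second across them. This is not llama-bench's actual code; `run_once_tokens_per_sec` is a hypothetical helper whose sleep stands in for the real decode work, though the `-r` repetitions flag referenced in the comments is a real llama-bench option.

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical stand-in for one timed pass (e.g. decoding n_tokens
// one at a time); the sleep simulates work so the sketch runs standalone.
static double run_once_tokens_per_sec(int n_tokens) {
    auto t0 = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(n_tokens)); // ~1 ms per "token"
    auto t1 = std::chrono::steady_clock::now();
    double secs = std::chrono::duration<double>(t1 - t0).count();
    return n_tokens / secs;
}

int main() {
    const int n_reps   = 5;   // analogous to llama-bench's -r (repetitions) flag
    const int n_tokens = 128;

    std::vector<double> tps;
    for (int i = 0; i < n_reps; ++i) {
        tps.push_back(run_once_tokens_per_sec(n_tokens));
    }

    // Mean and sample standard deviation across the repetitions,
    // i.e. the aggregation an "avg ± stddev t/s" column implies.
    double mean = 0.0;
    for (double v : tps) mean += v;
    mean /= tps.size();

    double var = 0.0;
    for (double v : tps) var += (v - mean) * (v - mean);
    var /= (tps.size() - 1);

    printf("%.2f ± %.2f t/s over %d runs\n", mean, std::sqrt(var), n_reps);
    return 0;
}
```

If this is how the numbers are produced, the remaining question is exactly where `t0`/`t1` sit in the real code: around the token-generation loop only, or also covering prompt processing and sampling.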

Hypothesis

No response

Implementation

No response

Analysis

No response

Relevant log output

No response
