
Docs Updates #61

Open · wants to merge 9 commits into main

Conversation

rgreenberg1 (Contributor)

Summary:

This pull request introduces the GuideLLM CLI guide, README enhancements, newly uploaded images, and supported backends documentation highlighting all the backends that can be used with GuideLLM.

Test Cases:
The GuideLLM CLI has been tested with various LLM models and backends.
Unit tests ensure core functionality works as expected.

Documentation:
Created documentation detailing GuideLLM CLI usage and output metrics.
Created documentation detailing the OpenAI-compatible API/HTTP pathway for TGI, llama.cpp, and DeepSparse in supported_backends.md.

Additional Information:
The pull request includes changes to the docs/guides directory for the CLI documentation.
Binary files containing performance summary visualizations were added to the docs/assets directory.

Please review and provide feedback.

@parfeniukink (Contributor) left a comment:


Great job!

Comment on lines +38 to +40
- The `--target` flag specifies the server hosting the model. In this case, it is a local vLLM server.
- The `--model` flag specifies the model to evaluate. The model name should match the name of the model deployed on the server.
- By default, GuideLLM will run a `sweep` of performance evaluations across different request rates, each lasting 120 seconds. The results will be saved to a local directory.

  1. I would rename "flag" to "parameter", since our CLI supports both parameters and flags. If you specify a flag, there is no value next to it; if you specify a parameter, a value is required.

  2. In some cases we may get an error if the tokenizer is not specified. I would add another item here. Text is below:

  • The --tokenizer parameter specifies the tokenizer used to count the number of tokens in the dataset. If you face any issues, try using --tokenizer neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8.
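
  For context, here is a minimal sketch of an invocation combining the parameters discussed above. The server URL and model name are illustrative placeholders, and the exact flag set may differ between GuideLLM versions:

  ```bash
  # Evaluate a model served by a local OpenAI-compatible server such as vLLM.
  # The URL and model name below are placeholders; --tokenizer is included to
  # avoid token-counting errors when the tokenizer cannot be inferred.
  guidellm \
    --target "http://localhost:8000/v1" \
    --model "neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8" \
    --tokenizer "neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8"
  ```

  By default this runs a sweep of performance evaluations across different request rates, as described in the quoted lines above.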
