Implement chat_vllm() #148

Open · wants to merge 1 commit into main

Conversation

hadley (Member) commented Nov 7, 2024

Fixes #140

@cboettig can you please take a look and let me know if this meets your needs?

cboettig commented

@hadley thanks so much, and apologies for the delay. I can confirm this is working now. I ran into issues with Gorilla (maybe you did too), but I was able to test by running other tool-use-enabled models in vllm locally.

e.g. using Groq/Llama-3-Groq-8B-Tool-Use:

# library(elmer)  # once installed; here we load the package from this branch
devtools::load_all()

# vllm is installable as a python module; could have used reticulate.
# wait = FALSE keeps the long-running server from blocking the R session.
system(
  "vllm serve --dtype=half --enable-auto-tool-choice --tool-call-parser llama3_json Groq/Llama-3-Groq-8B-Tool-Use",
  wait = FALSE
)

chat <- chat_vllm(
  base_url = "http://localhost:8000/v1",
  model = "Groq/Llama-3-Groq-8B-Tool-Use",
  api_key = ""
)

tool_rnorm <- tool(
  rnorm,
  "Draw numbers from a random normal distribution",
  n = type_integer("The number of observations. Must be a positive integer."),
  mean = type_number("The mean value of the distribution."),
  sd = type_number("The standard deviation of the distribution. Must be a non-negative number.")
)

# Then register it.
chat$register_tool(tool_rnorm)

# Then ask a question that needs it.
chat$chat("
  Give me five numbers from a random normal distribution of mean 5 and sd 1.
")

Groq/Llama-3-Groq-8B-Tool-Use is available on ollama too, so it should be easy to compare (though sadly vllm has no Apple-silicon support yet; it's fine if you have an NVIDIA card handy).
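
For comparison, a minimal sketch of the equivalent ollama setup, assuming elmer's chat_ollama() and the ollama library tag llama3-groq-tool-use (the tag is a guess; check what "ollama list" reports on your machine):

# Hypothetical comparison run against a local ollama server; the model
# tag below is an assumption, not confirmed in this thread.
chat_o <- chat_ollama(model = "llama3-groq-tool-use")
chat_o$register_tool(tool_rnorm)
chat_o$chat("
  Give me five numbers from a random normal distribution of mean 5 and sd 1.
")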

Thanks so much for adding this! Looking forward to testing it out in the classroom!

cboettig commented

Maybe this needs to be extended to structured data too? When calling extract_data() with vllm I'm seeing errors, and it looks like strict is still in the request, but I'm not sure if it's related. Let me know if that's worth a separate issue or is an easy extension to this.
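
For concreteness, a minimal sketch of the kind of call that errors; the prompt text and field names are purely illustrative, and the argument names follow elmer's documented structured-data API (they may differ slightly on this branch):

# Hypothetical reproduction of the failing structured-data call; the
# prompt and type spec are illustrative only.
chat <- chat_vllm(
  base_url = "http://localhost:8000/v1",
  model = "Groq/Llama-3-Groq-8B-Tool-Use",
  api_key = ""
)
chat$extract_data(
  "John is 32 years old.",
  type = type_object(
    name = type_string("The person's name."),
    age = type_integer("The person's age in years.")
  )
)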

hadley (Member, Author) commented Nov 12, 2024

Oh yeah, I bet I forgot about that.
