Running with gemma 2 in vllm gives a chat template error #1384
Replies: 5 comments
-
Hey, were you able to solve it?
-
Relevant issue: #1386
-
I am facing the same issue, any idea?
-
Could you try adding
-
Use vLLM v0.6.2, something like:
docker run \
  --runtime nvidia \
  --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e VLLM_ATTENTION_BACKEND=FLASHINFER \
  -p 8000:8000 \
  --env "HUGGING_FACE_HUB_TOKEN=your_hf_token" \
  --env "VLLM_ALLOW_LONG_MAX_MODEL_LEN=1" \
  --ipc=host \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  vllm/vllm-openai:v0.6.2 \
  --model "hugging-quants/gemma-2-9b-it-AWQ-INT4" \
  --max-model-len 8192 \
  --enable-chunked-prefill True \
  --max-num-batched-tokens 256 \
  --max-num-seqs 8 \
  --gpu-memory-utilization 0.9
FYI: Gemma doesn't support a system prompt, only user prompts.
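Since the Gemma chat template rejects the system role, make sure clients only send user/assistant turns. A minimal smoke test against the container above (assuming it is reachable on localhost:8000) could look like:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "hugging-quants/gemma-2-9b-it-AWQ-INT4",
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 64
      }'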
-
Hello,
I'm running chat-ui and trying out some models. With Phi-3 and Llama I had no problem, but when I run Gemma 2 in vLLM I can't make any successful API request.
In .env.local:
{
  "name": "google/gemma-2-2b-it",
  "id": "google/gemma-2-2b-it",
  "chatPromptTemplate": "{{#each messages}}{{#ifUser}}<start_of_turn>user\n{{#if @FIRST}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}}<end_of_turn>\n<start_of_turn>model\n{{/ifUser}}{{#ifAssistant}}{{content}}<end_of_turn>\n{{/ifAssistant}}{{/each}}",
  "parameters": {
    "temperature": 0.1,
    "top_p": 0.95,
    "repetition_penalty": 1.2,
    "top_k": 50,
    "truncate": 1000,
    "max_new_tokens": 2048,
    "stop": ["<end_of_turn>"]
  },
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "http://127.0.0.1:8000/v1"
    }
  ]
}
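For what it's worth, assuming the chatPromptTemplate above renders as intended, a single user turn with a preprompt should produce plain Gemma-format text with no system turn, roughly:
<start_of_turn>user
{preprompt}
{user message}<end_of_turn>
<start_of_turn>model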
But I always get the same response from the vLLM server:
ERROR 08-05 12:39:06 serving_chat.py:118] Error in applying chat template from request: System role not supported
INFO: 127.0.0.1:42142 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request
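For reference, the chat template bundled with google/gemma-2-2b-it raises exactly this error whenever the request contains a system-role message, so any request shaped like the illustrative one below gets the same 400:
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "google/gemma-2-2b-it",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hi"}
        ]
      }'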
Does someone know if I have to change the chat template, and how to change it? Is this a vLLM problem or a chat-ui problem?
Thank you!