
Fix prompt format mismatch with huggingface #807

Merged 1 commit into nod-ai:main on Aug 8, 2024

Conversation

@bhbruce (Contributor) commented Aug 7, 2024:

1. System prompt: remove `<s>`; the BOS token [1] is generated by the tokenizer by default.
2. End of system prompt: the string used to end with `\n\n` followed by the closing `"""` on its own line, so the original code implied three `\n`. It now ends with `\n\n"""`, i.e. exactly two `\n`.
3. Fix `append_user_prompt` & `append_bot_prompt` to match the behavior of
   `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`; a sketch of such helpers follows the commit message below.

Correct format for Llama 2:
```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST]
```

Signed-off-by: Bruce Lai <[email protected]>
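
For illustration, here is a minimal sketch of what helpers producing this format could look like. The names `append_user_prompt` and `append_bot_prompt` come from the PR; the bodies below are an assumption written for this write-up, not the repository's actual code:

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def make_system_prompt(system_prompt: str) -> str:
    # No leading <s>: the BOS token [1] is added by the tokenizer by default.
    # E_SYS ends the string with exactly two \n, not three.
    return f"{B_INST} {B_SYS}{system_prompt}{E_SYS}"

def append_user_prompt(history: str, user_msg: str) -> str:
    # The user turn closes the currently open [INST] block.
    return f"{history}{user_msg} {E_INST}"

def append_bot_prompt(history: str, bot_msg: str) -> str:
    # The model answer is sealed with </s>, and a new <s>[INST] block is
    # opened for the next user turn.
    return f"{history} {bot_msg} </s><s>{B_INST} "
```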
@bhbruce (Contributor, Author) commented Aug 7, 2024:

Example code to get the transformers prompt:

```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf",
                torch_dtype=torch.float32,
                device_map="auto",
                token="xxxxxx")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
    },
    {"role": "user", "content": "Who's the president of the USA?"},
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(f"Prompt:\n{prompt}")
# Tokenize, then decode, to confirm the prompt round-trips through token IDs.
tk_ids = pipe.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
print(f"Tokenized prompt:\n{pipe.tokenizer.decode(tk_ids)}")

print("\n\n==== Multi-run ====")

messages = [
    {
        "role": "system",
        "content": "Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
    },
    {"role": "user", "content": "Who's the president of the USA?"},
    {"role": "assistant", "content": "The president of the United States is currently Joe Biden."},
    {"role": "user", "content": "How are you doing?"},
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(f"Prompt:\n{prompt}")
tk_ids = pipe.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
print(f"Tokenized prompt:\n{pipe.tokenizer.decode(tk_ids)}")
```

Output:

```
Prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST]
Tokenized prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST]


==== Multi-run ====
Prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST] The president of the United States is currently Joe Biden. </s><s>[INST] How are you doing? [/INST]
Tokenized prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST] The president of the United States is currently Joe Biden. </s><s>[INST] How are you doing? [/INST]
```
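
As a quick sanity check on the format above, one could compare the template output against a hand-built string. This is a sketch under the assumption that `pipe` and `messages` come from the snippet above; it is not part of the PR:

```python
# Hypothetical check: rebuild the single-turn prompt by hand and compare it
# to the tokenizer's chat template output shown above.
expected = (
    "<s>[INST] <<SYS>>\n"
    f"{messages[0]['content']}\n"
    "<</SYS>>\n\n"  # exactly two \n after <</SYS>>, not three
    f"{messages[1]['content']} [/INST]"
)
prompt = pipe.tokenizer.apply_chat_template(
    messages[:2], tokenize=False, add_generation_prompt=True
)
assert prompt == expected
```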

@monorimet (Contributor) left a comment:

Looks good to me. @gpetters94 fyi

@monorimet merged commit 19c5a9a into nod-ai:main on Aug 8, 2024 (2 of 3 checks passed).
@bhbruce deleted the bhbruce/fix-llm-prompt branch on August 9, 2024.