
Fix prompt format mismatch with huggingface #807

Merged 1 commit into nod-ai:main on Aug 8, 2024

Conversation

@bhbruce (Contributor) commented Aug 7, 2024:

1. System prompt: remove `<s>`; the BOS token [1] is generated by the tokenizer by default.
2. End of system prompt: the string used to end with `\n\n` followed by the closing `"""` on its own line, so the original code implied three `\n`. It now ends with `\n\n"""`, i.e. exactly two `\n`.
3. Fix `append_user_prompt` & `append_bot_prompt` to match the behavior of
   `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`; a sketch of such helpers follows the commit message below.

Correct format for Llama 2:
```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST]
```

Signed-off-by: Bruce Lai <[email protected]>
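
For illustration, here is a minimal sketch of what helpers producing this format could look like. The names `append_user_prompt` and `append_bot_prompt` come from the PR; the bodies below are an assumption written for this write-up, not the repository's actual code:

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def make_system_prompt(system_prompt: str) -> str:
    # No leading <s>: the BOS token [1] is added by the tokenizer by default.
    # E_SYS ends the string with exactly two \n, not three.
    return f"{B_INST} {B_SYS}{system_prompt}{E_SYS}"

def append_user_prompt(history: str, user_msg: str) -> str:
    # The user turn closes the currently open [INST] block.
    return f"{history}{user_msg} {E_INST}"

def append_bot_prompt(history: str, bot_msg: str) -> str:
    # The model answer is sealed with </s>, and a new <s>[INST] block is
    # opened for the next user turn.
    return f"{history} {bot_msg} </s><s>{B_INST} "
```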
@bhbruce (Contributor, Author) commented Aug 7, 2024:

Example code to get the transformers prompt:

```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf",
                torch_dtype=torch.float32,
                device_map="auto",
                token="xxxxxx")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
    },
    {"role": "user", "content": "Who's the president of the USA?"},
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(f"Prompt:\n{prompt}")
# Tokenize, then decode, to confirm the prompt round-trips through token IDs.
tk_ids = pipe.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
print(f"Tokenized prompt:\n{pipe.tokenizer.decode(tk_ids)}")

print("\n\n==== Multi-run ====")

messages = [
    {
        "role": "system",
        "content": "Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
    },
    {"role": "user", "content": "Who's the president of the USA?"},
    {"role": "assistant", "content": "The president of the United States is currently Joe Biden."},
    {"role": "user", "content": "How are you doing?"},
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(f"Prompt:\n{prompt}")
tk_ids = pipe.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
print(f"Tokenized prompt:\n{pipe.tokenizer.decode(tk_ids)}")
```

Output:

```
Prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST]
Tokenized prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST]


==== Multi-run ====
Prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST] The president of the United States is currently Joe Biden. </s><s>[INST] How are you doing? [/INST]
Tokenized prompt:
<s>[INST] <<SYS>>
Be concise. You are a helpful, respectful and honest assistant. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Who's the president of the USA? [/INST] The president of the United States is currently Joe Biden. </s><s>[INST] How are you doing? [/INST]
```
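
As a quick sanity check on the format above, one could compare the template output against a hand-built string. This is a sketch under the assumption that `pipe` and `messages` come from the snippet above; it is not part of the PR:

```python
# Hypothetical check: rebuild the single-turn prompt by hand and compare it
# to the tokenizer's chat template output shown above.
expected = (
    "<s>[INST] <<SYS>>\n"
    f"{messages[0]['content']}\n"
    "<</SYS>>\n\n"  # exactly two \n after <</SYS>>, not three
    f"{messages[1]['content']} [/INST]"
)
prompt = pipe.tokenizer.apply_chat_template(
    messages[:2], tokenize=False, add_generation_prompt=True
)
assert prompt == expected
```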

@monorimet (Contributor) left a comment:

Looks good to me. @gpetters94 fyi

@monorimet merged commit 19c5a9a into nod-ai:main on Aug 8, 2024 (2 of 3 checks passed).
@bhbruce deleted the bhbruce/fix-llm-prompt branch on August 9, 2024.