text
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
The pipeline will take care of all the details of tokenization and calling apply_chat_template for you -
once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
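For reference, a minimal sketch of such a pipeline call (the checkpoint and the max_new_tokens value here are illustrative assumptions) might look like this:
python
from transformers import pipeline

# Build a text-generation pipeline around a chat model (checkpoint chosen for illustration)
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
# The pipeline applies the chat template and tokenizes the messages internally
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1])  # the assistant's reply, like the one shown above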
What are "generation prompts"? | |
You may have noticed that the apply_chat_template method has an add_generation_prompt argument. This argument tells | |
the template to add tokens that indicate the start of a bot response. For example, consider the following chat: | |
python | |
messages = [ | |
{"role": "user", "content": "Hi there!"}, | |
{"role": "assistant", "content": "Nice to meet you!"}, | |
{"role": "user", "content": "Can I ask a question?"} | |
] | |
Here's what this will look like without a generation prompt, using the ChatML template we saw in the Zephyr example: | |
python | |
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False) | |
"""<|im_start|>user | |
Hi there!<|im_end|> | |
<|im_start|>assistant | |
Nice to meet you!<|im_end|> | |
<|im_start|>user | |
Can I ask a question?<|im_end|> | |
""" | |
And here's what it looks like with a generation prompt: | |
python | |
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) | |
"""<|im_start|>user | |
Hi there!<|im_end|> | |
<|im_start|>assistant | |
Nice to meet you!<|im_end|> | |
<|im_start|>user | |
Can I ask a question?<|im_end|> | |
<|im_start|>assistant | |
""" | |
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model | |
generates text, it will write a bot response instead of doing something unexpected, like continuing the user's
message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a | |
special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're | |
supposed to be doing. | |
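Putting this together, here is a minimal sketch of a full generation step with the generation prompt applied, reusing the messages from above (the checkpoint and generation settings are illustrative assumptions):
python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceH4/zephyr-7b-beta"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# add_generation_prompt=True appends the tokens that open an assistant turn,
# so the model continues with a bot reply rather than with more user text
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0]))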
Not all models require generation prompts. Some models, like BlenderBot and LLaMA, don't have any | |
special tokens before bot responses. In these cases, the add_generation_prompt argument will have no effect. The exact | |
effect that add_generation_prompt has will depend on the template being used. | |
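If you're not sure how a given model's template behaves, you can inspect the template itself or compare the rendered text with and without the generation prompt. A quick sketch, reusing the tokenizer and messages from above:
python
# The raw Jinja template, if the tokenizer defines one
print(tokenizer.chat_template)

# If the two renderings are identical, add_generation_prompt has no effect for this template
without_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
with_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(with_prompt == without_prompt)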
Can I use chat templates in training? | |
Yes! We recommend that you apply the chat template as a preprocessing step for your dataset. After this, you | |
can simply continue like any other language model training task. When training, you should usually set | |
add_generation_prompt=False, because the added tokens to prompt an assistant response will not be helpful during | |
training. Let's see an example: | |
python
from transformers import AutoTokenizer | |
from datasets import Dataset | |
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta") | |
chat1 = [ | |
{"role": "user", "content": "Which is bigger, the moon or the sun?"}, | |
{"role": "assistant", "content": "The sun."} | |
] | |
chat2 = [ | |
{"role": "user", "content": "Which is bigger, a virus or a bacterium?"}, | |
{"role": "assistant", "content": "A bacterium."} | |
] | |
dataset = Dataset.from_dict({"chat": [chat1, chat2]}) | |
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)}) | |
print(dataset['formatted_chat'][0]) | |
And we get:
text
<|user|> | |
Which is bigger, the moon or the sun? | |
<|assistant|> | |
The sun. |
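From here, you can continue like any standard language modelling task, training on the formatted_chat column. As a minimal sketch of the remaining tokenization step (the arguments here are assumptions and may need adjusting for your setup):
python
# Tokenize the rendered chats so the dataset is ready for a normal training loop
dataset = dataset.map(
    lambda x: tokenizer(x["formatted_chat"], truncation=True),
    remove_columns=["chat", "formatted_chat"],
)
print(dataset[0].keys())  # e.g. input_ids, attention_mask
# Note: depending on the template, you may also want add_special_tokens=False here,
# to avoid duplicating special tokens that the template already inserts.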