## Why do some models have multiple templates?
Some models use different templates for different use cases. For example, they might use one template for normal chat and another for tool-use or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary. This can cause some confusion, and where possible, we recommend using a single template for all use cases. You can use Jinja statements like `if tools is defined` and `{% macro %}` definitions to wrap multiple code paths in a single template, as sketched below.
When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name of a template. The `apply_chat_template` method has special handling for certain template names: specifically, it will look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template named `tool_use` exists and the user has passed a `tools` argument, it will use that instead. To access templates with other names, pass the name of the template you want to the `chat_template` argument of `apply_chat_template()`.
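For example, with a hypothetical checkpoint that ships multiple named templates, template selection looks like this (the checkpoint name is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/my-chat-model")  # placeholder checkpoint

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# With no chat_template argument, the "default" template is used
# (or "tool_use", if a `tools` argument is passed).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Select a template from the tokenizer.chat_template dict by name instead:
prompt = tokenizer.apply_chat_template(
    messages,
    chat_template="tool_use",
    tools=[],  # templates selected this way may still expect a `tools` variable
    tokenize=False,
    add_generation_prompt=True,
)
```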
We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend trying to put it all in a single template where possible!
## What are "default" templates?
Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards compatibility, we have retained this class-specific handling as default templates, also set at the class level. If a model does not have a chat template set, but there is a default template for its model class, the `TextGenerationPipeline` class and methods like `apply_chat_template` will use the class template instead. You can find out what the default template for your tokenizer is by checking the `tokenizer.default_chat_template` attribute.
This is something we do purely for backward compatibility reasons, to avoid breaking any existing workflows. Even when the class template is appropriate for your model, we strongly recommend overriding the default template by setting the `chat_template` attribute explicitly, to make it clear to users that your model has been correctly configured for chat.
Now that actual chat templates have been adopted more widely, default templates have been deprecated and will be removed in a future release. We strongly recommend setting the `chat_template` attribute for any tokenizers that still depend on them!
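As a minimal sketch, migrating off the deprecated default looks like this (the checkpoint name and save directory are placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/my-chat-model")  # placeholder checkpoint

# Pin the deprecated class-level default as an explicit template
tokenizer.chat_template = tokenizer.default_chat_template

# The template is persisted in tokenizer_config.json
tokenizer.save_pretrained("my-chat-model")  # or tokenizer.push_to_hub(...)
```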
## What template should I use?
When setting the template for a model that's already been trained for chat, you should ensure that the template exactly matches the message formatting that the model saw during training, or else you will probably experience performance degradation. This is true even if you're training the model further - you will probably get the best performance if you keep the chat tokens constant. This is very analogous to tokenization - you generally get the best performance for inference or fine-tuning when you precisely match the tokenization used during training.
If you're training a model from scratch, or fine-tuning a base language model for chat, on the other hand, you have a lot of freedom to choose an appropriate template! LLMs are smart enough to learn to handle lots of different input formats. One popular choice is the ChatML format, and this is a good, flexible choice for many use cases. It looks like this:
```
{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
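Rendered on a short conversation, this template produces output like:

```
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
```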
If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes handy support for generation prompts, but note that it doesn't add BOS or EOS tokens! If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
```python
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
```
This template wraps each message in `<|im_start|>` and `<|im_end|>` tokens, and simply writes the role as a string, which allows for flexibility in the roles you train with.
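If your model does expect a BOS token, one option is to prepend it inside the template itself - a sketch, relying on the fact that `apply_chat_template` exposes the tokenizer's special tokens (such as `bos_token`) as template variables:

```python
# Prepend the BOS token in the template, since apply_chat_template
# tokenizes with add_special_tokens=False
tokenizer.chat_template = "{{ bos_token }}" + tokenizer.chat_template
```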