## Why do some models have multiple templates?
Some models use different templates for different use cases. For example, they might use one template for normal chat and another for tool-use or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary. This can cause some confusion, and where possible, we recommend using a single template for all use cases. You can use Jinja statements like `if tools is defined` and `{% macro %}` definitions to wrap multiple code paths in a single template, as sketched below.
When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name of a template. The `apply_chat_template` method has special handling for certain template names: specifically, it will look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template named `tool_use` exists and the user has passed a `tools` argument, it will use that instead. To access templates with other names, pass the name of the template you want to the `chat_template` argument of `apply_chat_template()`.
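For example, with a hypothetical checkpoint that ships multiple named templates, template selection looks like this (the checkpoint name is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/my-chat-model")  # placeholder checkpoint

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# With no chat_template argument, the "default" template is used
# (or "tool_use", if a `tools` argument is passed).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Select a template from the tokenizer.chat_template dict by name instead:
prompt = tokenizer.apply_chat_template(
    messages,
    chat_template="tool_use",
    tools=[],  # templates selected this way may still expect a `tools` variable
    tokenize=False,
    add_generation_prompt=True,
)
```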
We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend trying to put it all in a single template where possible!
## What are "default" templates?
Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards compatibility, we have retained this class-specific handling as default templates, also set at the class level. If a model does not have a chat template set, but there is a default template for its model class, the `TextGenerationPipeline` class and methods like `apply_chat_template` will use the class template instead. You can find out what the default template for your tokenizer is by checking the `tokenizer.default_chat_template` attribute.
This is something we do purely for backward compatibility reasons, to avoid breaking any existing workflows. Even when the class template is appropriate for your model, we strongly recommend overriding the default template by setting the `chat_template` attribute explicitly, to make it clear to users that your model has been correctly configured for chat.
Now that actual chat templates have been adopted more widely, default templates have been deprecated and will be removed in a future release. We strongly recommend setting the `chat_template` attribute for any tokenizers that still depend on them!
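As a minimal sketch, migrating off the deprecated default looks like this (the checkpoint name and save directory are placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/my-chat-model")  # placeholder checkpoint

# Pin the deprecated class-level default as an explicit template
tokenizer.chat_template = tokenizer.default_chat_template

# The template is persisted in tokenizer_config.json
tokenizer.save_pretrained("my-chat-model")  # or tokenizer.push_to_hub(...)
```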
## What template should I use?
When setting the template for a model that's already been trained for chat, you should ensure that the template exactly matches the message formatting that the model saw during training, or else you will probably experience performance degradation. This is true even if you're training the model further - you will probably get the best performance if you keep the chat tokens constant. This is very analogous to tokenization - you generally get the best performance for inference or fine-tuning when you precisely match the tokenization used during training.
If you're training a model from scratch, or fine-tuning a base language model for chat, on the other hand, you have a lot of freedom to choose an appropriate template! LLMs are smart enough to learn to handle lots of different input formats. One popular choice is the ChatML format, and this is a good, flexible choice for many use cases. It looks like this:
```
{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
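Rendered on a short conversation, this template produces output like:

```
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
```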
If you like this one, here it is in one-liner form, ready to copy into your code. The one-liner also includes handy support for generation prompts, but note that it doesn't add BOS or EOS tokens! If your model expects those, they won't be added automatically by `apply_chat_template` - in other words, the text will be tokenized with `add_special_tokens=False`. This is to avoid potential conflicts between the template and the `add_special_tokens` logic. If your model expects special tokens, make sure to add them to the template!
```python
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
```
This template wraps each message in `<|im_start|>` and `<|im_end|>` tokens, and simply writes the role as a string, which allows for flexibility in the roles you train with.
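If your model does expect a BOS token, one option is to prepend it inside the template itself - a sketch, relying on the fact that `apply_chat_template` exposes the tokenizer's special tokens (such as `bos_token`) as template variables:

```python
# Prepend the BOS token in the template, since apply_chat_template
# tokenizes with add_special_tokens=False
tokenizer.chat_template = "{{ bos_token }}" + tokenizer.chat_template
```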