Advanced: Retrieval-augmented generation
"Retrieval-augmented generation" or "RAG" LLMs can search a corpus of documents for information before responding | |
to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our | |
recommendation for RAG models is that their template | |
should accept a documents argument. This should be a list of documents, where each "document" | |
is a single dict with title and contents keys, both of which are strings. Because this format is much simpler | |
than the JSON schemas used for tools, no helper functions are necessary. | |
Here's an example of a RAG template in action:
```python
document1 = {
    "title": "The Moon: Our Age-Old Foe",
    "contents": "Man has always dreamed of destroying the moon. In this essay, I shall"
}

document2 = {
    "title": "The Sun: Our Age-Old Friend",
    "contents": "Although often underappreciated, the sun provides several notable benefits"
}

# Assumes `tokenizer` was loaded from a model whose chat template accepts
# a `documents` argument, and that `messages` is a chat history (a list of
# dicts with "role" and "content" keys)
model_input = tokenizer.apply_chat_template(
    messages,
    documents=[document1, document2]
)
```
Advanced: How do chat templates work?
The chat template for a model is stored on the `tokenizer.chat_template` attribute. If no chat template is set, the
default template for that model class is used instead. Let's take a look at the template for BlenderBot:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
tokenizer.default_chat_template
"{% for message in messages %}{% if message['role'] == 'user' %}{{ ' ' }}{% endif %}{{ message['content'] }}{% if not loop.last %}{{ ' ' }}{% endif %}{% endfor %}{{ eos_token }}"