Advanced: Retrieval-augmented generation
"Retrieval-augmented generation" or "RAG" LLMs can search a corpus of documents for information before responding | |
to a query. This allows models to vastly expand their knowledge base beyond their limited context size. Our | |
recommendation for RAG models is that their template | |
should accept a documents argument. This should be a list of documents, where each "document" | |
is a single dict with title and contents keys, both of which are strings. Because this format is much simpler | |
than the JSON schemas used for tools, no helper functions are necessary. | |
Here's an example of a RAG template in action:
```python
document1 = {
    "title": "The Moon: Our Age-Old Foe",
    "contents": "Man has always dreamed of destroying the moon. In this essay, I shall"
}

document2 = {
    "title": "The Sun: Our Age-Old Friend",
    "contents": "Although often underappreciated, the sun provides several notable benefits"
}

# Assumes `tokenizer` was loaded from a model whose chat template accepts
# a `documents` argument, and that `messages` is a chat history (a list of
# dicts with "role" and "content" keys)
model_input = tokenizer.apply_chat_template(
    messages,
    documents=[document1, document2]
)
```
Advanced: How do chat templates work?
The chat template for a model is stored on the `tokenizer.chat_template` attribute. If no chat template is set, the
default template for that model class is used instead. Let's take a look at the template for BlenderBot:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
tokenizer.default_chat_template
"{% for message in messages %}{% if message['role'] == 'user' %}{{ ' ' }}{% endif %}{{ message['content'] }}{% if not loop.last %}{{ ' ' }}{% endif %}{% endfor %}{{ eos_token }}"