text
{'role': 'assistant', 'content': "Matey, I'm afraid I must inform ye that humans cannot eat helicopters. Helicopters are not food, they are flying machines. Food is meant to be eaten, like a hearty plate o' grog, a savory bowl o' stew, or a delicious loaf o' bread. But helicopters, they be for transportin' and movin' around, not for eatin'. So, I'd say none, me hearties. None at all."}
The pipeline will take care of all the details of tokenization and calling apply_chat_template for you -
once the model has a chat template, all you need to do is initialize the pipeline and pass it the list of messages!
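For reference, a minimal sketch of such a pipeline call (the checkpoint and the max_new_tokens value here are illustrative assumptions) might look like this:
python
from transformers import pipeline

# Build a text-generation pipeline around a chat model (checkpoint chosen for illustration)
pipe = pipeline("text-generation", "HuggingFaceH4/zephyr-7b-beta")
messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
# The pipeline applies the chat template and tokenizes the messages internally
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1])  # the assistant's reply, like the one shown above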
What are "generation prompts"? | |
You may have noticed that the apply_chat_template method has an add_generation_prompt argument. This argument tells | |
the template to add tokens that indicate the start of a bot response. For example, consider the following chat: | |
python | |
messages = [ | |
{"role": "user", "content": "Hi there!"}, | |
{"role": "assistant", "content": "Nice to meet you!"}, | |
{"role": "user", "content": "Can I ask a question?"} | |
] | |
Here's what this will look like without a generation prompt, using the ChatML template we saw in the Zephyr example: | |
python | |
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False) | |
"""<|im_start|>user | |
Hi there!<|im_end|> | |
<|im_start|>assistant | |
Nice to meet you!<|im_end|> | |
<|im_start|>user | |
Can I ask a question?<|im_end|> | |
""" | |
And here's what it looks like with a generation prompt: | |
python | |
tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) | |
"""<|im_start|>user | |
Hi there!<|im_end|> | |
<|im_start|>assistant | |
Nice to meet you!<|im_end|> | |
<|im_start|>user | |
Can I ask a question?<|im_end|> | |
<|im_start|>assistant | |
""" | |
Note that this time, we've added the tokens that indicate the start of a bot response. This ensures that when the model | |
generates text, it will write a bot response instead of doing something unexpected, like continuing the user's
message. Remember, chat models are still just language models - they're trained to continue text, and chat is just a | |
special kind of text to them! You need to guide them with appropriate control tokens, so they know what they're | |
supposed to be doing. | |
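Putting this together, here is a minimal sketch of a full generation step with the generation prompt applied, reusing the messages from above (the checkpoint and generation settings are illustrative assumptions):
python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceH4/zephyr-7b-beta"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# add_generation_prompt=True appends the tokens that open an assistant turn,
# so the model continues with a bot reply rather than with more user text
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0]))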
Not all models require generation prompts. Some models, like BlenderBot and LLaMA, don't have any | |
special tokens before bot responses. In these cases, the add_generation_prompt argument will have no effect. The exact | |
effect that add_generation_prompt has will depend on the template being used. | |
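If you're not sure how a given model's template behaves, you can inspect the template itself or compare the rendered text with and without the generation prompt. A quick sketch, reusing the tokenizer and messages from above:
python
# The raw Jinja template, if the tokenizer defines one
print(tokenizer.chat_template)

# If the two renderings are identical, add_generation_prompt has no effect for this template
without_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
with_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(with_prompt == without_prompt)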
Can I use chat templates in training? | |
Yes! We recommend that you apply the chat template as a preprocessing step for your dataset. After this, you | |
can simply continue like any other language model training task. When training, you should usually set | |
add_generation_prompt=False, because the added tokens to prompt an assistant response will not be helpful during | |
training. Let's see an example: | |
python
from transformers import AutoTokenizer | |
from datasets import Dataset | |
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta") | |
chat1 = [ | |
{"role": "user", "content": "Which is bigger, the moon or the sun?"}, | |
{"role": "assistant", "content": "The sun."} | |
] | |
chat2 = [ | |
{"role": "user", "content": "Which is bigger, a virus or a bacterium?"}, | |
{"role": "assistant", "content": "A bacterium."} | |
] | |
dataset = Dataset.from_dict({"chat": [chat1, chat2]}) | |
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)}) | |
print(dataset['formatted_chat'][0]) | |
And we get:
text
<|user|> | |
Which is bigger, the moon or the sun? | |
<|assistant|> | |
The sun. |
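From here, you can continue like any standard language modelling task, training on the formatted_chat column. As a minimal sketch of the remaining tokenization step (the arguments here are assumptions and may need adjusting for your setup):
python
# Tokenize the rendered chats so the dataset is ready for a normal training loop
dataset = dataset.map(
    lambda x: tokenizer(x["formatted_chat"], truncation=True),
    remove_columns=["chat", "formatted_chat"],
)
print(dataset[0].keys())  # e.g. input_ids, attention_mask
# Note: depending on the template, you may also want add_special_tokens=False here,
# to avoid duplicating special tokens that the template already inserts.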