Commit c8d80c9 by chuanli-lambda · 1 parent: 631a04f

Update README.md

  1. README.md +88 -8
README.md CHANGED
@@ -1,19 +1,99 @@
  ---
  language:
- - en
  tags:
- - pytorch
- - causal-lm
- - pythia
  license: apache-2.0
  datasets:
- - Dahoas/synthetic-instruct-gptj-pairwise
  ---

- This model is created by finetuning `EleutherAI/pythia-2.8b-deduped` on the `Dahoas/synthetic-instruct-gptj-pairwise` for 4 epochs.

  You can try a [demo](https://cloud.lambdalabs.com/demos/ml/qa-28b-8000) of the model hosted on [Lambda Cloud](https://lambdalabs.com/service/gpu-cloud).

- It took 8xA100 80GB five hours to train the model. We set `batch_size_per_gpu` to `2` (so global batch size is 8), and learning rate to `0.00001` (with linear decay to zero at the last trainig step).

- The Weights and Biases record of the training can be found [here](https://wandb.ai/chuanli11/ft-synthetic-instruct-gptj-pairwise-pythia2.8b?workspace=user-chuanli11).
 
  ---
  language:
+ - en
  tags:
+ - pytorch
+ - causal-lm
+ - pythia
  license: apache-2.0
  datasets:
+ - Dahoas/synthetic-instruct-gptj-pairwise
  ---

+ This model was created by finetuning [`EleutherAI/pythia-2.8b-deduped`](https://huggingface.co/EleutherAI/pythia-2.8b-deduped) on the [`Dahoas/synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset.

  You can try a [demo](https://cloud.lambdalabs.com/demos/ml/qa-28b-8000) of the model hosted on [Lambda Cloud](https://lambdalabs.com/service/gpu-cloud).

+ ### Model Details

+ - Finetuned by: [Lambda](https://lambdalabs.com/)
+ - Model type: Transformer-based Language Model
+ - Language: English
+ - Pre-trained model: [EleutherAI/pythia-2.8b-deduped](https://huggingface.co/EleutherAI/pythia-2.8b-deduped)
+ - Dataset: [Dahoas/synthetic-instruct-gptj-pairwise](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise)
+ - Library: [transformers](https://huggingface.co/docs/transformers/index)
+ - License: Apache 2.0
+
+ ### Prerequisites
+
+ Running inference with the model takes ~7GB of GPU memory.
+
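Reviewer note: the ~7GB figure is roughly what a back-of-the-envelope estimate suggests. The Quick Start loads the model in `float16` (2 bytes per parameter), so the weights alone need about 5.6 GB, and the rest would go to activations, the attention cache, and framework overhead. The sketch below only illustrates that arithmetic; `NOMINAL_PARAMS` is an assumption read off the "2.8b" in the model name, not a measured parameter count, and the helper name is hypothetical.

```python
# Rough weight-memory estimate for the Prerequisites figure.
# NOMINAL_PARAMS is an assumption taken from the "2.8b" in the model name,
# not a measured parameter count.
NOMINAL_PARAMS = 2.8e9
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("float16", "float32"):
        gb = weight_memory_gb(NOMINAL_PARAMS, dtype)
        print(f"{dtype}: ~{gb:.1f} GB for weights")
```

Under these assumptions, loading in `float32` instead would roughly double the weight footprint, which is one reason the Quick Start passes `torch_dtype=torch.float16`.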
+ ### Quick Start
+
+ ```
+ import torch
+
+ from transformers import AutoTokenizer, pipeline, StoppingCriteria, StoppingCriteriaList
+
+ # Use the first GPU when available, otherwise fall back to CPU.
+ device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
+
+ model_name = "lambdalabs/pythia-2.8b-deduped-synthetic-instruct"
+ max_new_tokens = 2048
+ stop_token = "<|stop|>"
+
+
+ class KeywordsStoppingCriteria(StoppingCriteria):
+     """Stop generation once the newest token is one of the given token ids."""
+
+     def __init__(self, keywords_ids: list):
+         self.keywords = keywords_ids
+
+     def __call__(
+         self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs
+     ) -> bool:
+         # Only the last token of the first sequence in the batch is checked.
+         if input_ids[0][-1] in self.keywords:
+             return True
+         return False
+
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     model_name,
+ )
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.add_tokens([stop_token])
+
+ # Look up the token id of the stop token and build the stopping criterion.
+ stop_ids = [tokenizer.encode(w)[0] for w in [stop_token]]
+ stop_criteria = KeywordsStoppingCriteria(stop_ids)
+
+ generator = pipeline(
+     "text-generation",
+     model=model_name,
+     device=device,
+     max_new_tokens=max_new_tokens,
+     torch_dtype=torch.float16,
+     stopping_criteria=StoppingCriteriaList([stop_criteria]),
+ )
+
+ # Prompts follow the "Question: ...\nAnswer:" template.
+ example = "How can I make an omelette."
+ text = "Question: {}\nAnswer:".format(example)
+
+ result = generator(
+     text,
+     num_return_sequences=1,
+ )
+
+ output = result[0]["generated_text"]
+
+ print(output)
+ ```
+
+ Output:
+
+ ```
+ Question: How can I make an omelette.
+ Answer:To make an omelette, start by cracking two eggs into a bowl and whisking them together. Add a splash of milk and a pinch of salt and pepper. Heat a non-stick pan over medium-high heat and add a tablespoon of butter. Once the butter has melted, pour in the egg mixture. As the eggs set, use a spatula to lift the edges and let the uncooked egg run underneath. When the eggs are cooked through and no visible liquid egg remains, top with your desired fillings and fold the omelette in half before sliding it onto a plate.<|stop|>
+ ```
+
+ ### Training
+
+ The model was trained on the [`Dahoas/synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset. We split the original dataset into train (the first 32000 examples) and validation (the remaining 1144 examples) subsets.
+
+ We finetuned the model for 4 epochs, which took 5 hours on 8xA100 80GB GPUs. We set `batch_size_per_gpu` to `2` (so the global batch size is 16) and the learning rate to `0.00001` (with linear decay to zero at the last training step). You can find a Weights and Biases record [here](https://wandb.ai/chuanli11/ft-synthetic-instruct-gptj-pairwise-pythia2.8b?workspace=user-chuanli11).
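Reviewer note: the numbers in the Training section can be checked with a little arithmetic, which also shows why the corrected global batch size is 16 rather than 8. The sketch below is an illustration only, not the authors' training script. It assumes one optimizer step per global batch (no gradient accumulation) and no learning-rate warmup, neither of which the README states; the function name `linear_decay_lr` is hypothetical.

```python
# Values taken from the Training section of the README.
NUM_GPUS = 8
BATCH_SIZE_PER_GPU = 2
TRAIN_EXAMPLES = 32_000
EPOCHS = 4
BASE_LR = 1e-5

# Each GPU processes its own mini-batch, so the global batch is the product.
global_batch_size = NUM_GPUS * BATCH_SIZE_PER_GPU

# Assumption: one optimizer step per global batch (no gradient accumulation).
steps_per_epoch = TRAIN_EXAMPLES // global_batch_size
total_steps = steps_per_epoch * EPOCHS


def linear_decay_lr(step: int) -> float:
    """Learning rate at `step`, decaying linearly from BASE_LR to zero.

    Assumption: no warmup phase; the README only mentions linear decay.
    """
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / total_steps


if __name__ == "__main__":
    print(global_batch_size)  # 16
    print(steps_per_epoch)    # 2000
    print(total_steps)        # 8000
    for step in (0, total_steps // 2, total_steps):
        print(step, linear_decay_lr(step))
```

Under these assumptions the run consists of 8000 optimizer steps, with the learning rate halved at step 4000 and reaching zero at the final step.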