updated generation
README.md
CHANGED
@@ -58,16 +58,6 @@ outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```

-### Fill-in-the-middle
-Fill-in-the-middle uses special tokens to identify the prefix/middle/suffix part of the input and output:
-
-```Java
-input_text = "<fim_prefix>public class HelloWorld {\n public static void main(String[] args) {<fim_suffix>}\n}<fim_middle>"
-inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
-outputs = model.generate(inputs)
-print(tokenizer.decode(outputs[0]))
-```
-
### Attribution & Other Requirements

The pretraining dataset of the model was filtered for permissive licenses only. Nevertheless, the model can generate source code verbatim from the dataset. The code's license might require attribution and/or other specific requirements that must be respected. We provide a [search index](https://huggingface.co/spaces/bigcode/starcoder-search) that lets you search through the pretraining data to identify where generated code came from and apply the proper attribution to your code.
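
Review note: the fill-in-the-middle section removed in this hunk describes a prompt assembled from three special tokens. Below is a minimal sketch of that layout, using only the token strings that appear in the removed lines. The helper name `build_fim_prompt` is hypothetical and is not part of the README or of any library.

```python
# Token strings copied from the removed README example.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: wrap the code before and after the gap in the special tokens."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Same prefix and suffix as the removed example.
prompt = build_fim_prompt(
    "public class HelloWorld {\n public static void main(String[] args) {",
    "}\n}",
)
print(prompt)
```

The resulting string matches the `input_text` that the removed example passed to `tokenizer.encode`. Going by the section's description, the model is then expected to generate the missing middle after `<fim_middle>`.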