Update app.py
app.py CHANGED
@@ -12,12 +12,13 @@ pipe = pipeline("text-generation", model=model_name, device="cuda")
 
 generate_kwargs = {'max_new_tokens': 20}
 
-system_prompt = '''You are given
-- Fight under-specification
-- Complete text
+system_prompt = '''You are given a partial input text for a chat interface. Propose auto-completion to the text. You have several roles:
+- Fight under-specification.
+- Complete text to save the user time.
 
+Don't suggest anything if there are no good suggestions.
 Make sure the suggestions are valid completions of the text! No need for them to complete the text completely.
-Suggest only up to 5 words ahead.
+Suggest only up to 5 words ahead. The scheme of your answer should be "answer1;answer2;answer3" (return between 0 to 4 answers).
 '''
 
 @spaces.GPU
@@ -26,7 +27,7 @@ def generate(text):
         {'role': 'system', 'content': system_prompt},
         {'role': 'user', 'content': text}
     ]
-    return pipe(messages, **generate_kwargs)
+    return pipe(messages, **generate_kwargs)['generated_text'][-1]['content']
 
 
 if __name__ == "__main__":
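For reference, when a chat-style list of messages is passed to a transformers text-generation pipeline, the call typically returns a list with one result per input, and 'generated_text' holds the whole conversation, ending with the newly generated assistant message. The sketch below mocks that structure so it runs without the model; the sample strings are invented placeholders, and note that with a list-shaped result an extra [0] index is needed before 'generated_text', which the committed return line omits; whether that is required depends on the transformers version installed in the Space.

# Minimal sketch (no GPU needed): mock of the structure that
# pipe(messages, **generate_kwargs) typically returns for a chat input
# in recent transformers versions. The strings are invented placeholders.
sample_output = [
    {
        "generated_text": [
            {"role": "system", "content": "...system prompt..."},
            {"role": "user", "content": "How do I"},
            {"role": "assistant", "content": "install it;update it;uninstall it"},
        ]
    }
]

# The last message of 'generated_text' is the assistant reply; with the
# list-shaped output above it is reached via an extra [0]:
reply = sample_output[0]["generated_text"][-1]["content"]
print(reply)  # install it;update it;uninstall it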
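The updated system prompt also pins down an output scheme, "answer1;answer2;answer3", with between 0 and 4 answers. app.py does not show how that string is consumed, but a caller would presumably split the reply on ';'. The helper below is a hypothetical sketch for illustration only, not part of the commit.

def parse_suggestions(reply: str, max_suggestions: int = 5) -> list[str]:
    """Hypothetical helper (not in app.py): split a semicolon-separated
    reply such as "answer1;answer2;answer3" into individual suggestions,
    dropping empty entries and capping the count."""
    suggestions = [part.strip() for part in reply.split(";") if part.strip()]
    return suggestions[:max_suggestions]

print(parse_suggestions("install it; update it ;; uninstall it"))
# ['install it', 'update it', 'uninstall it']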