I notice that the inference.py script does not provide native support for running this model with tensor parallelism. Are there any plans to update the script to allow for this?
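For context, by tensor parallelism I mean sharding each weight matrix across GPUs so that every device computes only a slice of each layer's output. A minimal NumPy sketch of the idea, column-parallel splitting of a single linear layer (illustrative names only, not taken from inference.py):

```python
import numpy as np

# Conceptual sketch of tensor (column) parallelism for one linear layer:
# the weight matrix is split column-wise across "devices"; each shard
# computes a partial output, and the partial outputs are concatenated.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # a batch of activations
W = rng.standard_normal((8, 16))  # the full weight matrix

# Split W into two column shards, as two GPUs would each hold one.
W0, W1 = np.split(W, 2, axis=1)

# Each shard computes its slice of the output independently.
y0 = x @ W0
y1 = x @ W1

# Concatenating the partial outputs reproduces the full result.
y_parallel = np.concatenate([y0, y1], axis=1)
assert np.allclose(y_parallel, x @ W)
```

In a real multi-GPU setup the concatenation (or, for row-parallel layers, a sum) is a collective communication step; having that wired into the script is what I am asking about.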