Update app.py
app.py CHANGED
@@ -41,7 +41,7 @@ model = AutoModelForCausalLM.from_pretrained(
     device_map="auto", # device_map="auto" can be helpful for very large models to distribute layers if you have multiple GPUs or for offloading.
     # For single GPU, explicit .to('cuda') is fine.
     # trust_remote_code=True # Removed: Generally not needed for Vicuna/Llama models
-).to('cuda'
+).to('cuda') #, torch.bfloat16) # Explicitly using bfloat16 as in original code
 
 # ** MODIFICATION 3: Removed `trust_remote_code=True` for tokenizer **
 print(f"Loading tokenizer: {model_name}")
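
For context, here is a minimal sketch of the loading block as it stands after this fix. The commit simply closes the parenthesis that was missing from `.to('cuda'`, which was a SyntaxError. The checkpoint name below is hypothetical (app.py defines its own `model_name` earlier), and one caveat worth noting: combining `device_map="auto"` with an explicit `.to('cuda')` is redundant at best, since accelerate already manages device placement for a dispatched model and moving it manually can raise an error. Passing the dtype to `from_pretrained` and dropping the `.to()` call is the safer pattern.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint for illustration; app.py defines its own model_name.
model_name = "lmsys/vicuna-7b-v1.5"

print(f"Loading model: {model_name}")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # pass the dtype here instead of via .to()
    device_map="auto",           # lets accelerate place (and offload) layers
)
# With device_map="auto", accelerate owns device placement, so the trailing
# .to('cuda') from the diff is unnecessary and, if any layers were offloaded,
# can raise an error about moving a model dispatched with accelerate hooks.
# When loading onto a single GPU without device_map, .to('cuda') is fine:
#   model = AutoModelForCausalLM.from_pretrained(
#       model_name, torch_dtype=torch.bfloat16).to('cuda')

print(f"Loading tokenizer: {model_name}")
tokenizer = AutoTokenizer.from_pretrained(model_name)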