To avoid wasting memory like this, explicitly set the `torch_dtype` parameter to the desired data type, or set `torch_dtype="auto"` to load the weights in the most memory-efficient pattern (the data type is derived automatically from the model weights).
```python
import torch
from transformers import AutoModelForCausalLM

# Explicitly load the weights in half precision
gemma = AutoModelForCausalLM.from_pretrained("google/gemma-7b", torch_dtype=torch.float16)
```

```python
from transformers import AutoModelForCausalLM

# Derive the data type automatically from the model weights
gemma = AutoModelForCausalLM.from_pretrained("google/gemma-7b", torch_dtype="auto")
```
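To verify which precision was actually loaded, you can inspect the model's `dtype` attribute or call [`~PreTrainedModel.get_memory_footprint`]; a half-precision model should occupy roughly half the memory of its float32 counterpart. This check is illustrative and not part of the original example.

```python
# Sanity check: confirm the weights are in half precision
print(gemma.dtype)                   # torch.float16
print(gemma.get_memory_footprint())  # size of the loaded weights in bytes
```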
You can also set the data type to use for models instantiated from scratch.
```python
import torch
from transformers import AutoConfig, AutoModel

# Store the desired data type in the config, then initialize a model
# from scratch (random weights) with that config
my_config = AutoConfig.from_pretrained("google/gemma-2b", torch_dtype=torch.float16)
model = AutoModel.from_config(my_config)
```
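As a quick sanity check (not part of the original example), you can confirm that the data type recorded in the config carried over to the freshly initialized weights:

```python
# Illustrative check: the config stores the requested dtype, and the
# randomly initialized parameters should follow it
print(my_config.torch_dtype)  # torch.float16
print(model.dtype)            # torch.float16 if the config's dtype was applied
```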