Overview

4-bit quantization of fukugawa/gemma-2-9b-finetuned with BitsAndBytes (0.44.1)

Below is sample code that quantizes the model to 4 bits with BitsAndBytes and pushes the result to your own repository on the Hugging Face Hub.

# Python 3.10
pip install bitsandbytes==0.44.1
pip install accelerate==1.2.1
pip install transformers==4.50.0
pip install "huggingface_hub[cli]"
# Log in by entering your access token
huggingface-cli login
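
If you prefer to authenticate from Python rather than the CLI, huggingface_hub provides a login() helper. A minimal sketch, assuming the access token was exported as the environment variable HF_TOKEN (an illustrative choice, not part of the original):

import os
from huggingface_hub import login

# Read the token from the environment (hypothetical variable name)
# and authenticate; equivalent to running `huggingface-cli login`.
login(token=os.environ["HF_TOKEN"])
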
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "fukugawa/gemma-2-9b-finetuned"
repo_id = "fukugawa/gemma-2-9b-finetuned-bnb-4bit"

# Quantize the weights to 4 bits with the NF4 data type; computation runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Passing quantization_config quantizes the weights as they are loaded;
# device_map="auto" places them on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

# Push the tokenizer and the quantized weights to your own repository.
tokenizer.push_to_hub(repo_id)
model.push_to_hub(repo_id)
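
To verify the upload, you can load the quantized checkpoint back from the Hub. The quantization settings are saved in the repository's config, so from_pretrained restores the 4-bit weights without passing a BitsAndBytesConfig again. A minimal sketch; the prompt and generation parameters are placeholders, not from the original:

from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "fukugawa/gemma-2-9b-finetuned-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Rough size check: the 4-bit weights should occupy far less memory than bf16.
print(f"{model.get_memory_footprint() / 1e9:.2f} GB")

inputs = tokenizer("こんにちは。", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))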