Gemma-2B Fine-tuned on Atlaset

This model is a fine-tuned version of google/gemma-2b on the Atlaset dataset. The LoRA fine-tune adapts the base model to the domain-specific knowledge covered by the Atlaset corpus.

Model Details

  • Base Model: google/gemma-2b
  • Fine-tuning Method: Low-Rank Adaptation (LoRA)
  • Training Hardware: 2x T4 GPUs on Kaggle
  • Context Length (fine-tuning): 256 tokens
  • Parameters: 2B (base model) + LoRA adapter parameters (a loading sketch follows this list)
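
Since the fine-tuning method is LoRA, the published weights may be either a fully merged checkpoint or an adapter-only repository. A minimal sketch for the adapter-only case, attaching the adapter to the base model with the peft library, is shown below; whether Yamemaru/gemma-2b-finetuned-atlaset actually contains adapter-only weights is an assumption. If the repository already holds merged weights, the plain loading call in the Usage Example below is all that is needed.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel  # requires the peft package; adapter-only layout is an assumption

# Load the Gemma-2B base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
model = PeftModel.from_pretrained(base_model, "Yamemaru/gemma-2b-finetuned-atlaset")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")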

Training Details

  • LoRA Configuration (see the training-setup sketch after this list):
    • Rank: 16
    • Alpha: 32
    • Dropout: 0.05
    • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Steps: 5000
  • Batch Size: 4 per device
  • Learning Rate: 3e-4
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Precision: bfloat16
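
The sketch below shows how the hyperparameters listed above map onto a peft + transformers training setup. Only the hyperparameter values come from this card; the variable names, output directory, and the Trainer wiring around it (dataset, data collator) are illustrative assumptions.

from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Wrap the base model so that only the LoRA adapter weights are trainable
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()

# Optimizer and schedule settings matching the values listed above
training_args = TrainingArguments(
    output_dir="gemma-2b-atlaset-lora",  # illustrative output path
    max_steps=5000,
    per_device_train_batch_size=4,
    learning_rate=3e-4,
    weight_decay=0.01,
    optim="adamw_torch",
    bf16=True,
)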

Performance

This model shows improved performance on tasks related to the domains covered in the Atlaset dataset, with particular strength in:

  • Knowledge-intensive tasks
  • Context-aware reasoning
  • Structured response generation

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Yamemaru/gemma-2b-finetuned-atlaset")
model = AutoModelForCausalLM.from_pretrained("Yamemaru/gemma-2b-finetuned-atlaset")

# Tokenize input
input_text = "Write a summary about machine learning"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate text (max_new_tokens bounds only the newly generated tokens;
# passing **inputs also forwards the attention mask)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    do_sample=True
)

# Decode and print the response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
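
For a quicker start, the same generation can be run through the transformers pipeline API. The settings below mirror the example above and, like it, assume the repository hosts full (merged) model weights.

from transformers import pipeline

generator = pipeline("text-generation", model="Yamemaru/gemma-2b-finetuned-atlaset")

result = generator(
    "Write a summary about machine learning",
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
)
print(result[0]["generated_text"])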