VezilkaLLM is a 4-billion-parameter base language model for Macedonian, built on top of the Gemma 3 4B pretrained foundation model. Trained on a high-quality 1.47B-word Macedonian corpus from LVSTCK, it offers strong fluency and performance on Macedonian NLP tasks while remaining lightweight enough to fine-tune or deploy on modest hardware. Training used the Hugging Face Transformers Trainer API on a single NVIDIA H100 GPU with an 8192-token context window, optimizing for long-context coherence.
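The sketch below illustrates this setup. It is a minimal, hypothetical reconstruction, not the released training script: the base checkpoint, the Trainer API, and the 8192-token context come from the description above, while the corpus file name, batch size, learning rate, and other hyperparameters are placeholder assumptions.

```python
# Minimal continued-pretraining sketch. Assumed: the corpus file name and all
# hyperparameters below; from the card: Gemma 3 4B base, Trainer, 8192 ctx.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "google/gemma-3-4b-pt"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Hypothetical corpus: any plain-text dataset with a "text" column works here.
corpus = load_dataset("text", data_files={"train": "mk_corpus.txt"})["train"]

def tokenize(batch):
    # Truncate to the 8192-token context window mentioned above.
    return tokenizer(batch["text"], truncation=True, max_length=8192)

train_ds = corpus.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="vezilka-checkpoints",
        per_device_train_batch_size=1,   # assumption: one H100 at 8192 tokens
        gradient_accumulation_steps=16,  # assumption
        learning_rate=2e-5,              # assumption
        bf16=True,
        num_train_epochs=1,              # assumption
    ),
    train_dataset=train_ds,
    # Causal-LM collator: labels are the input ids (shifted inside the model).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```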
As the first model in a new series of Macedonian LLMs, VezilkaLLM lays the foundation for future releases focused on instruction tuning, chat capabilities, reasoning, and domain-specific tasks. As shown in Table 1, despite being roughly half the size of typical multilingual 7B–8B models, VezilkaLLM remains competitive with them on Macedonian benchmarks and improves on its Gemma 3 base on most tasks, demonstrating the value of language-specific training for low-resource languages. It is intended as a base model for further fine-tuning and ships without safety mechanisms or chat-specific tuning.
| Model | ARC Challenge | ARC Easy | BoolQ | HellaSwag | OpenBookQA | PIQA | Winogrande | NQ Open |
|---|---|---|---|---|---|---|---|---|
| gemma-3-4b-pt | 0.28 | 0.48 | 0.75 | 0.39 | 0.25 | 0.62 | 0.59 | 0.00 |
| VezilkaLLM | 0.30 | 0.50 | 0.72 | 0.41 | 0.25 | 0.65 | 0.59 | 0.03 |
| domestic-yak-8B | 0.31 | 0.52 | 0.77 | 0.43 | 0.29 | 0.67 | 0.63 | 0.04 |
| MKLLM-7B | 0.32 | 0.54 | 0.71 | 0.43 | 0.28 | 0.62 | 0.62 | 0.03 |
Table 1: Model Performance Comparison Across Evaluation Benchmarks
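The task names in Table 1 match those used by EleutherAI's lm-evaluation-harness. Assuming that harness (the `lm-eval` package) is used, a comparable evaluation run could look roughly like the following; the batch-size setting and the harness's default few-shot configuration are assumptions.

```python
# Hypothetical reproduction of the Table 1 benchmarks with lm-eval
# (pip install lm-eval); few-shot settings are left at harness defaults.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=finki-ukim/VezilkaLLM",
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag",
        "openbookqa", "piqa", "winogrande", "nq_open",
    ],
    batch_size="auto",
)

# Print the per-task metric dictionaries (accuracy, stderr, etc.).
for task, metrics in results["results"].items():
    print(task, metrics)
```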
We present the evaluation results visually in Figure 1.
Figure 1: Model Performance Comparison Across Evaluation Benchmarks
You can run inference with VezilkaLLM using the Hugging Face Transformers library as shown below:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "finki-ukim/VezilkaLLM"

# Load the tokenizer and weights, then wrap them in a generation pipeline.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Prompt: 'The word "friendship" means ' -- the model completes the sentence.
prompt = "Зборот „пријателство“ значи "
outputs = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```
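Because VezilkaLLM is a base model without instruction or chat tuning, completion-style prompts such as the one above, where the model continues a partial text, work better than conversational instructions.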