Primus-Optima-QwenKV-1.54B
Primus-Optima-QwenKV-1.54B is an experimental chain-of-thought reasoning and code generation model, built by combining the strengths of two sources:
- DeepSeek R1 (distilled 1.5B) for strong math and coding reasoning.
- Qwen2.5-0.5B, fine-tuned with Process Reward Models (PRM) to boost structured step-by-step outputs in math and logic.
This hybrid design results in a bilingual, high-precision model with enhanced reasoning depth, multi-step clarity, and lightweight adaptability for math and code applications.
Key Features
Chain-of-Thought Reasoning for Math + Code
Designed to produce human-like intermediate steps for both math and programming problems, useful for education, tutoring, and technical assistants.

Hybrid Architecture (Reasoning + Reward-Guided Fine-Tuning)
Combines DeepSeek R1's distilled capabilities with Qwen2.5-0.5B's reward-optimized reasoning for structured, goal-driven outputs.

Multilingual Capabilities (English + Chinese)
Fluent and accurate in both English and Simplified Chinese, making it suitable for diverse learning and development environments.

Coder Experimental Mode
Able to solve algorithmic tasks, complete functions, and offer code walkthroughs using the same step-by-step format as for math.

Lightweight Yet Capable (1.54B)
With just 1.54B parameters, it is efficient for local deployments while offering surprisingly strong performance on STEM and programming tasks.
Quickstart with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Primus-Optima-QwenKV-1.54B"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to compute factorial using recursion."
messages = [
    {"role": "system", "content": "You are an expert tutor in math and programming, explaining step-by-step."},
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt and tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from each output sequence
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
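The quickstart above relies on the model's default (greedy) decoding. For chain-of-thought generation, sampling parameters are often tuned; a minimal sketch of settings that could be passed to `model.generate` is shown below. The specific values are illustrative assumptions, not recommendations from the model authors.

```python
# Illustrative sampling settings for step-by-step generation.
# These values are assumptions, not tuned recommendations.
generation_kwargs = {
    "max_new_tokens": 512,      # room for multi-step reasoning
    "do_sample": True,          # sample instead of greedy decoding
    "temperature": 0.6,         # moderate randomness
    "top_p": 0.95,              # nucleus sampling cutoff
    "repetition_penalty": 1.1,  # discourage loops in long chains
}
```

These would be applied as `model.generate(**model_inputs, **generation_kwargs)` in place of the bare `max_new_tokens=512` call above.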
Intended Use
- Math & Programming Tutors: Assist students with logic-driven step-by-step explanations.
- Bilingual STEM Apps: Ideal for dual-language math or coding environments.
- Competitive Reasoning Tools: Suited for reasoning-intensive tasks like Olympiad prep, technical quizzes, and programming challenges.
- On-Device LLMs: Lightweight enough for web or embedded applications needing real-time reasoning.
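As a concrete example of the tutor-style output these use cases target, the quickstart prompt asks for a recursive factorial. A correct reference answer, with the kind of step-by-step commentary the model is meant to produce, looks like this (written here as an illustration, not a captured model output):

```python
def factorial(n: int) -> int:
    """Compute n! recursively.

    Step 1: base case, 0! = 1.
    Step 2: recursive case, n! = n * (n - 1)!.
    """
    if n < 0:
        raise ValueError("factorial is undefined for negative integers")
    if n == 0:
        return 1                     # base case
    return n * factorial(n - 1)      # recursive step

print(factorial(5))  # → 120
```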
Limitations
Experimental Nature:
This is a hybrid research model; performance may vary across general or creative domains.

Size Constraints:
As a 1.54B-parameter model, it may struggle with extremely complex reasoning tasks.

Bias & Generalization:
Inherits biases from both DeepSeek R1 and Qwen2.5. Use caution in high-stakes or sensitive applications.

Prompt Engineering Required:
Structured prompts with clear questions yield the best results, especially for multi-step problems.
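One way to structure a multi-step prompt is to enumerate the steps you want the model to follow. The wording below is an illustrative sketch, not a template prescribed by the model authors:

```python
# A structured, multi-step prompt tends to work better than a bare question.
# The wording is illustrative, not a prescribed template.
task = "A train travels 120 km in 1.5 hours. What is its average speed in km/h?"

structured_prompt = (
    "Solve the following problem step by step.\n"
    "1. Restate what is given.\n"
    "2. Write the formula you will use.\n"
    "3. Substitute the numbers and compute.\n"
    "4. State the final answer on its own line.\n\n"
    f"Problem: {task}"
)

messages = [
    {"role": "system", "content": "You are an expert tutor in math and programming, explaining step-by-step."},
    {"role": "user", "content": structured_prompt},
]
```

The `messages` list can then be fed through `tokenizer.apply_chat_template` exactly as in the quickstart.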
Model tree for prithivMLmods/Primus-Optima-QwenKV-1.54B
- Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B