I’m fine-tuning Qwen2.5-0.5B to be extremely good at math, using high-quality datasets and some smart training strategies. The logs are looking really promising so far!
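For context, the data side of a run like this might look something like the sketch below: it renders math Q/A pairs in a ChatML-style layout, which is the chat format Qwen-family models use. The system prompt here is a made-up placeholder, not the one from the actual training run.

```python
def format_example(question: str, answer: str) -> str:
    """Render one math Q/A pair in a ChatML-style layout.

    Qwen-family chat models use ChatML-like special tokens; the
    system prompt below is a placeholder for illustration only.
    """
    return (
        "<|im_start|>system\nYou are a careful math assistant.<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n{answer}<|im_end|>"
    )

sample = format_example("What is 7 * 8?", "7 * 8 = 56.")
print(sample.count("<|im_start|>"))  # → 3 (system, user, assistant turns)
```

From there it's standard supervised fine-tuning on the formatted strings.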
Expected release: Tomorrow morning? I’ll post as soon as it’s ready — stay tuned.
If you want faster updates or just wanna chat about it, come join my Discord: https://discord.gg/EXsug2Ux29 (Heads up: we might ask a couple quick questions when you join — just making sure we keep the server safe.)
This project is also helping shape the future of IntellIte. The insights and techniques we’re developing here — better dataset curation, fine-tuning tricks, and evaluation methods — will directly contribute to making IntellIte even sharper, faster, and more reliable, especially for math and reasoning tasks.
Big progress ahead. Can’t wait to share it with you all!
Come check out my new dataset "SimpleMath"! It's designed to help small models get simple math almost always right, instead of complex math almost always wrong: ProCreations/SimpleMath. It's also useful for introducing math to LLMs gradually. Instead of jumping into the complex stuff headfirst, a model can learn better by starting easy and working up to the hard problems.
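As a toy illustration of that easy-to-hard idea, here's a minimal curriculum-ordering sketch. The operator-count difficulty proxy is invented for this example; SimpleMath's actual schema and any difficulty labels it may have are not assumed here.

```python
def difficulty(problem: str) -> int:
    """Crude difficulty proxy: count arithmetic operators.

    A real curriculum would use the dataset's own structure or
    difficulty labels; this stand-in just makes the idea concrete.
    """
    return sum(problem.count(op) for op in "+-*/")

def curriculum_order(problems):
    """Return problems sorted easiest-first for staged training."""
    return sorted(problems, key=difficulty)

problems = ["3 + 4 * 2 - 1", "5 + 2", "(8 / 2) + (6 * 3) - 7"]
print(curriculum_order(problems)[0])  # prints "5 + 2" (the easiest item)
```

Training batches then walk through this ordering from front to back, so the model sees single-step arithmetic before multi-step expressions.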
When OpenAI released its Computer-Using Agent (CUA) API, I happened to be playing Wordle 🧩 and thought, why not see how the model handles it? Spoiler: Wordle turned out to be a surprisingly effective benchmark. So Romain Cosentino Ph.D. and I dug in and analyzed the results of several hundred runs.
🔑 Takeaways
1️⃣ Even the best computer-using models struggle with simple, context-dependent tasks.
2️⃣ Visual perception and reasoning remain major hurdles for multimodal agents.
3️⃣ Real-world use cases reveal significant gaps between hype and reality.
Perception accuracy drops to near zero by the last turn 📉
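For anyone who wants to try this kind of benchmark themselves, the scoring rule at its core is simple. Below is a minimal sketch of standard Wordle feedback, duplicates included; this is an illustration of the game's rules, not our actual evaluation harness.

```python
def wordle_feedback(guess: str, answer: str) -> str:
    """Return per-letter feedback: G = right spot, Y = wrong spot, X = absent.

    Handles duplicate letters the way the real game does: each answer
    letter can satisfy at most one non-green guess letter.
    """
    feedback = ["X"] * 5
    remaining = {}
    # First pass: mark greens and tally unmatched answer letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        else:
            remaining[a] = remaining.get(a, 0) + 1
    # Second pass: mark yellows, consuming the tally.
    for i, g in enumerate(guess):
        if feedback[i] != "G" and remaining.get(g, 0) > 0:
            feedback[i] = "Y"
            remaining[g] -= 1
    return "".join(feedback)

print(wordle_feedback("crane", "carve"))  # → "GYYXG"
```

An agent run can then be scored by comparing the feedback the model *believes* it saw against this ground truth on every turn, which is exactly where the perception errors show up.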
A compact chat model built for speed, efficiency, and simplicity.
IntellIte‑Chat v1.0 is the debut model in the IntellIte series—a lightweight conversational transformer crafted to be fast, memory-efficient, and easy to work with. It’s designed for devs and enthusiasts who want sharp results without huge resource demands.
🧠 Parameters & Architecture
• Model Size: ~100M parameters
• Architecture: Modified GPT-NeoX
• Focus: Chat performance with low latency and efficient memory use
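For a rough sense of where ~100M parameters comes from in a GPT-NeoX-style decoder, here's a back-of-the-envelope sketch. The vocab/hidden/layer numbers below are hypothetical, not the real IntellIte-Chat config.

```python
def neox_param_count(vocab: int, hidden: int, layers: int,
                     tied_embeddings: bool = True) -> int:
    """Rough parameter count for a GPT-NeoX-style decoder.

    Ignores biases and layer norms, which are tiny next to the
    weight matrices: per layer, attention QKV + output projection
    contribute ~4*h^2 and the 4x MLP contributes ~8*h^2.
    """
    per_layer = 12 * hidden * hidden
    total = layers * per_layer + vocab * hidden  # blocks + embedding
    if not tied_embeddings:
        total += vocab * hidden  # separate LM head
    return total

# Hypothetical shape in the ~100M range (not the actual config):
print(neox_param_count(vocab=50257, hidden=640, layers=12) / 1e6)  # → 91.14688
```

Most of the budget sits in the transformer blocks, with the embedding table taking the rest, which is why small models lean so heavily on tokenizer and vocab choices.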
⸻
🧃 Support the Build
Every dollar you donate is extra VRAM I get to work with. 😅 This project is fully independent and entirely self-funded. If you want to help bring it to life:
👉 https://buymeacoffee.com/procreations
⸻
💛 Early Supporters
All early supporters will be credited here when the model launches. Even the smallest support means the world and pushes this project forward.
Special thanks to: Maybe you?
⸻
🛠️ Development Status
• Architecture Design: Completed ✅
• Dataset Planning: Completed ✅
• Training Code: Near Completion 🛠️
• Training Launch: Starting Soon ⏳
• Evaluation Setup: Coming Soon 🔜
• Final Release: Coming Soon 🔜
⸻
Built to chat. Built on a budget. Built to prove what small models can do.
IntellIte-Chat, a new, up-and-coming small and efficient English chat model, is coming soon! Built on the GPT-2 tokenizer, it will crush any chat task (for a model its size).