
Daniel Han-Chen

danielhanchen

AI & ML interests

None yet

Recent Activity

updated a model about 1 hour ago
unsloth/DeepSeek-V3-0324-GGUF-UD
upvoted a collection about 1 hour ago
Unsloth Dynamic 2.0 Quants
updated a model about 2 hours ago
unsloth/DeepSeek-R1-GGUF-UD

Organizations

Qwen, Unsloth AI, Unsloth Backup Account, Unofficial Mistral Community, Social Post Explorers, gg-tt, Hugging Face Discord Community, Hugging Face Party @ PyTorch Conference, gg-hf-g

Posts 10

🦥 Introducing Unsloth Dynamic v2.0 GGUFs!
Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.

Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF
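
If you just want to try one of these repos, here is a minimal sketch (not from the post) that uses the official huggingface_hub client to download a single quant variant. The repo choice and the "*Q4_K_M*" filename pattern are illustrative; check each repo's file listing for the variants that actually exist:

```python
from huggingface_hub import snapshot_download

# Download only one quant variant from the repo; the filename pattern is an
# illustrative placeholder and should be matched to the repo's file listing.
local_dir = snapshot_download(
    repo_id="unsloth/gemma-3-27b-it-GGUF",
    allow_patterns=["*Q4_K_M*"],
)
print("GGUF files downloaded to:", local_dir)
```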

We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize every layer, so each one gets its own bit-width. This means our dynamic method can be applied to all LLM architectures, not just MoEs.
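
As a rough illustration of the idea only (this is not Unsloth's actual algorithm, and the sensitivity policy below is made up), per-layer bit-widths can be chosen by measuring how much quantization error each layer's weights would suffer and spending more bits on the most sensitive layers:

```python
import numpy as np

def fake_quantize(w, bits):
    # Symmetric round-to-nearest quantization, used here only to measure how
    # much error a given bit-width introduces for a layer's weights.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax if np.abs(w).max() > 0 else 1.0
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def choose_bit_widths(layers, candidate_bits=(8, 6, 4, 3, 2)):
    # Rank layers by how badly a 4-bit round trip distorts their weights,
    # then give the most sensitive layers the highest bit-widths
    # (an illustrative policy, not the one used for Dynamic v2.0).
    sensitivity = {name: float(np.mean((w - fake_quantize(w, 4)) ** 2))
                   for name, w in layers.items()}
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    bits = {}
    for i, name in enumerate(ranked):
        frac = i / max(len(ranked) - 1, 1)   # 0.0 = most sensitive layer
        bits[name] = candidate_bits[int(round(frac * (len(candidate_bits) - 1)))]
    return bits

# Toy "model": two layers with different weight statistics.
rng = np.random.default_rng(0)
layers = {"attn.q_proj": rng.standard_normal((64, 64)),
          "mlp.down_proj": 0.01 * rng.standard_normal((64, 64))}
print(choose_bit_widths(layers))
```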

Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0

All our future GGUF uploads will use Dynamic v2.0 and our hand-curated 300K–1.5M token calibration dataset to improve conversational chat performance.
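
As a sketch of what assembling a conversational calibration file within such a token budget could look like (the dataset, tokenizer, and budget below are illustrative placeholders, not Unsloth's actual calibration data):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

TOKEN_BUDGET = 300_000  # low end of the 300K–1.5M range mentioned above
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any tokenizer works for rough counting
chats = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft", streaming=True)

total_tokens, chunks = 0, []
for row in chats:
    # Flatten each conversation into plain text for the calibration file.
    text = "\n".join(turn["content"] for turn in row["messages"])
    total_tokens += len(tokenizer.encode(text))
    chunks.append(text)
    if total_tokens >= TOKEN_BUDGET:
        break

with open("calibration.txt", "w") as f:
    f.write("\n\n".join(chunks))
print(f"Wrote ~{total_tokens} tokens of chat data to calibration.txt")
```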

For accurate benchmarking, we built an evaluation framework that reproduces the officially reported 5-shot MMLU scores of Llama 4 and Gemma 3, enabling apples-to-apples comparisons between full-precision models and Dynamic v2.0, QAT, and standard iMatrix quants.
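
For context on what that benchmark involves (a generic recipe, not the framework described here), a 5-shot MMLU prompt is typically built by prepending five answered questions from the same subject and then scoring which of A/B/C/D the model rates most likely after the final "Answer:":

```python
def format_example(question, choices, answer=None):
    # MMLU questions have four options; `answer` is the index of the correct one.
    letters = "ABCD"
    lines = [question]
    lines += [f"{letters[i]}. {c}" for i, c in enumerate(choices)]
    lines.append(f"Answer: {letters[answer]}" if answer is not None else "Answer:")
    return "\n".join(lines)

def build_5shot_prompt(subject, shots, question, choices):
    # `shots` is a list of (question, choices, answer_index) tuples.
    header = (f"The following are multiple choice questions (with answers) "
              f"about {subject}.\n\n")
    demos = "\n\n".join(format_example(q, c, a) for q, c, a in shots[:5])
    return header + demos + "\n\n" + format_example(question, choices)
```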

Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.
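
The KL divergence side of that comparison can be sketched as: run the same text through a full-precision reference model and a quantized one, then average the per-token KL between their next-token distributions. The snippet below assumes both load as Transformers-style causal LMs, which is an assumption rather than something the post specifies:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_token_kl(ref_model, quant_model, input_ids):
    # Next-token distributions from both models on the same input.
    ref_logp = F.log_softmax(ref_model(input_ids).logits, dim=-1)
    q_logp = F.log_softmax(quant_model(input_ids).logits, dim=-1)
    # KL(reference || quantized), summed over the vocab, averaged over positions.
    kl = F.kl_div(q_logp, ref_logp, log_target=True, reduction="none").sum(-1)
    return kl.mean().item()
```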

Articles 1


Faster fine-tuning using TRL & Unsloth