Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 26 days ago • 66
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 78
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 189
Qwen 2.5 Coder Collection Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 7 days ago • 28
Qwen QwQ-32B Collection Collection Qwen's reasoning models including QwQ (32B) & QVQ (72B) in formats: GGUF, dynamic 4-bit and 16-bit original versions. • 13 items • Updated 7 days ago • 5
Phi-4 (All Versions) Collection Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes. • 20 items • Updated 7 days ago • 68
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful open-source reasoning model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 30 items • Updated 7 days ago • 222
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 28 items • Updated 7 days ago • 80
Personal Favorites Collection Recommended models I use often or like for any reason. I recommend reading their cards for more details. • 10 items • Updated Dec 24, 2024 • 84
🚂 SD-XL Training Suite Collection All the steps to train your own SD-XL custom model • 9 items • Updated Feb 14 • 22
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 118
Don't tell me no... Collection Models designed to provide fewer refusals. • 3 items • Updated Apr 19, 2024 • 4
Flavors of Flora Collection A collection of the different flavors of Flora. • 3 items • Updated Apr 19, 2024 • 2