Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality similar to half precision while using roughly 3x less memory • 15 items • Updated 19 days ago • 185
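As a rough back-of-the-envelope check on the "3x less memory" figure, the sketch below compares bf16 storage (16 bits per weight) against 4-bit weights that carry per-group scales (assumed here: one fp16 scale per group of 32 weights, a common quantization layout); the parameter count is a placeholder, not a specific Gemma 3 checkpoint size.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage footprint in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Placeholder parameter count for illustration (not a real Gemma 3 size).
n_params = 27e9

bf16_bits = 16.0
# 4-bit weights plus one fp16 scale per 32-weight group adds ~0.5 bit/weight.
q4_bits = 4.0 + 16.0 / 32

bf16_gb = model_size_gb(n_params, bf16_bits)
q4_gb = model_size_gb(n_params, q4_bits)

print(f"bf16: {bf16_gb:.1f} GB, int4+scales: {q4_gb:.1f} GB, "
      f"ratio ~{bf16_gb / q4_gb:.1f}x")  # ratio comes out near 3.6x
```

The scale overhead is why the advertised savings are closer to 3x than the naive 4x a pure 16-bit-to-4-bit comparison would suggest.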
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Paper • 2504.07096 • Published 28 days ago • 73
bartowski/deepcogito_cogito-v1-preview-qwen-32B-GGUF Text Generation • Updated 29 days ago • 11.4k • 13
mlx-community/deepcogito-cogito-v1-preview-qwen-32B-4bit Text Generation • Updated 28 days ago • 246 • 5
Unsloth 4-bit Dynamic Quants Collection Unsloth's dynamic 4-bit quants selectively skip quantizing certain parameters, greatly improving accuracy while using <10% more VRAM than BnB 4-bit • 28 items • Updated 6 days ago • 80