70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published 23 days ago • 28
Llama Collection All our SOTA Llama models that crush competition :) • 6 items • Updated Nov 5, 2024 • 1
xmadai/Llama-3.1-Nemotron-70B-Instruct-xMADai-INT4 Text Generation • Updated Oct 30, 2024 • 9 • 4