70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Paper • 2504.11651 • Published • 26
did you get it to work since?