please regenerate ggufs

#3
by jacek2024 - opened

Just as a note, see https://www.reddit.com/r/LocalLLaMA/comments/1jzn9wj/comment/mn7iv7f

By using these arguments, I was able to make the IQ4_XS quant work well on the latest build of llama.cpp:

```
--flash-attn -ctk q4_0 -ctv q4_0 --ctx-size 16384 --override-kv tokenizer.ggml.eos_token_id=int:151336 --override-kv glm4.rope.dimension_count=int:64 --jinja
```
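For anyone wanting to try this, the flags above can be dropped into a full `llama-server` invocation along these lines. This is only a sketch: the model filename and port are placeholders for illustration, not from the original post.

```shell
# Hypothetical llama-server launch with the workaround flags above.
# Replace model.IQ4_XS.gguf with your actual quant file.
./llama-server \
  -m model.IQ4_XS.gguf \
  --flash-attn \
  -ctk q4_0 -ctv q4_0 \
  --ctx-size 16384 \
  --override-kv tokenizer.ggml.eos_token_id=int:151336 \
  --override-kv glm4.rope.dimension_count=int:64 \
  --jinja \
  --port 8080
```

The two `--override-kv` flags patch metadata at load time (the EOS token id and the GLM4 RoPE dimension count), so the GGUF file itself does not need to be regenerated; `-ctk`/`-ctv q4_0` quantize the KV cache to keep the 16k context affordable in memory.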
