vLLM deployment error
#4 opened by Saicy
ValueError: GGUF model with architecture glm4 is not supported yet.
I want to deploy it with vLLM, but I hit the error above. vLLM version: 0.7.3.
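For reference, this is roughly how I'm loading it (the local path and quant file name are placeholders, not my exact setup):

```python
from vllm import LLM

# Minimal repro sketch; the GGUF path/quant below are illustrative.
llm = LLM(
    model="./GLM-4-32B-0414-Q4_K_M.gguf",  # local GGUF file
    tokenizer="THUDM/GLM-4-32B-0414",      # a GGUF needs an explicit HF tokenizer
)
# Fails during model init with:
# ValueError: GGUF model with architecture glm4 is not supported yet.
```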
Thanks for reporting this. The GGUF fixes a bug in llama.cpp, and that fix probably broke vLLM compatibility.
Until the PR is merged, you can use the traditional GGUF by bartowski: https://huggingface.co/bartowski/THUDM_GLM-4-32B-0414-GGUF
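If it helps, a minimal sketch of trying that repo (the quant file name here is an assumption; pick an actual file from the repo's file list):

```python
from huggingface_hub import hf_hub_download
from vllm import LLM

# Assumed quant file name; check the repo for the exact files available.
gguf_path = hf_hub_download(
    repo_id="bartowski/THUDM_GLM-4-32B-0414-GGUF",
    filename="THUDM_GLM-4-32B-0414-Q4_K_M.gguf",
)

# Point vLLM at the downloaded GGUF, with the base model's tokenizer.
llm = LLM(model=gguf_path, tokenizer="THUDM/GLM-4-32B-0414")
```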
Got it! I also tried that version; it hits the same error.
I guess GLM GGUF support is not ready yet in vLLM.
You can track this for updates: https://github.com/vllm-project/vllm/issues/17069