vLLM deployment error

#4
by Saicy - opened

ValueError: GGUF model with architecture glm4 is not supported yet.

I want to deploy it with vLLM, but I get the error above. vllm=0.7.c

Thanks for flagging this. The GGUF fixes a bug in llama.cpp, and the fix probably broke vLLM compatibility.

Until the PR is merged you can use the traditional GGUF by bartowski https://huggingface.co/bartowski/THUDM_GLM-4-32B-0414-GGUF

Got it! I also tried that version and hit the same error.

I guess GLM GGUF support is not ready yet in vLLM.
You can track this for updates: https://github.com/vllm-project/vllm/issues/17069
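For context, a minimal sketch of the kind of check that produces this error. This is not vLLM's actual code: the supported set below and the function name are hypothetical, but GGUF loaders generally read the model's `general.architecture` metadata key and reject architectures they have no mapping for, which is what the traceback above reflects.

```python
# Hypothetical subset for illustration only -- not vLLM's real list.
SUPPORTED_GGUF_ARCHES = {"llama", "qwen2"}

def check_gguf_arch(arch: str) -> None:
    """Raise the same style of error seen above for an unmapped GGUF architecture."""
    if arch not in SUPPORTED_GGUF_ARCHES:
        raise ValueError(
            f"GGUF model with architecture {arch} is not supported yet."
        )

check_gguf_arch("llama")  # passes silently
try:
    check_gguf_arch("glm4")
except ValueError as e:
    print(e)  # GGUF model with architecture glm4 is not supported yet.
```

Until the linked issue is resolved, a `glm4`-architecture GGUF will fail this kind of check regardless of which quantization you download.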
