vLLM deployment error

#4
by Saicy - opened

ValueError: GGUF model with architecture glm4 is not supported yet.

I want to deploy it with vLLM, but I get the error above. vllm=0.7.c

Thanks for flagging this. The GGUF fixes a bug in llama.cpp, and the fix probably broke vLLM compatibility.

Until the PR is merged you can use the traditional GGUF by bartowski https://huggingface.co/bartowski/THUDM_GLM-4-32B-0414-GGUF

Got it! I also tried that version and hit the same error.

I guess GLM GGUF support is not ready yet in vLLM.
You can track this for updates: https://github.com/vllm-project/vllm/issues/17069
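For context, a minimal sketch of the kind of check that produces this error. This is not vLLM's actual code: the supported set below and the function name are hypothetical, but GGUF loaders generally read the model's `general.architecture` metadata key and reject architectures they have no mapping for, which is what the traceback above reflects.

```python
# Hypothetical subset for illustration only -- not vLLM's real list.
SUPPORTED_GGUF_ARCHES = {"llama", "qwen2"}

def check_gguf_arch(arch: str) -> None:
    """Raise the same style of error seen above for an unmapped GGUF architecture."""
    if arch not in SUPPORTED_GGUF_ARCHES:
        raise ValueError(
            f"GGUF model with architecture {arch} is not supported yet."
        )

check_gguf_arch("llama")  # passes silently
try:
    check_gguf_arch("glm4")
except ValueError as e:
    print(e)  # GGUF model with architecture glm4 is not supported yet.
```

Until the linked issue is resolved, a `glm4`-architecture GGUF will fail this kind of check regardless of which quantization you download.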
