google/gemma-3-12b-it-qat-q4_0-gguf

CUDA_VISIBLE_DEVICES="0" python3 -m sglang.launch_server \
  --model /home/AI-ModelScope/gemma-3-27b-it-qat-q4_0-gguf \
  --tp 1 \
  --load-format gguf \
  --trust-remote-code \
  --quantization gguf \
  --dtype bfloat16 \
  --max-total-tokens 20000 \
  --context-length 32768 \
  --kv-cache-dtype auto \
  --enable-p2p-check \
  --host 0.0.0.0 \
  --port 8801 \
  --mem-fraction-static 0.8 \
  --api-key yoursecret \
  --max-running-requests 1000 \
  --enable-metrics

Error:
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/sgl-workspace/sglang/python/sglang/launch_server.py", line 14, in <module>
    launch_server(server_args)
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/http_server.py", line 679, in launch_server
    tokenizer_manager, scheduler_info = _launch_subprocesses(server_args=server_args)
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/engine.py", line 541, in _launch_subprocesses
    tokenizer_manager = TokenizerManager(server_args, port_args)
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 159, in __init__
    self.model_config = ModelConfig(
  File "/sgl-workspace/sglang/python/sglang/srt/configs/model_config.py", line 67, in __init__
    self.hf_text_config = get_hf_text_config(self.hf_config)
  File "/sgl-workspace/sglang/python/sglang/srt/configs/model_config.py", line 359, in get_hf_text_config
    class_name = config.architectures[0]
TypeError: 'NoneType' object is not subscriptable
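
The last frame fails on `config.architectures[0]`, which suggests the loaded config had no `architectures` value. Below is a minimal sketch for checking that, assuming the model directory is expected to contain a Hugging Face style config.json with an `architectures` key. The helper name `read_architectures` is hypothetical and not part of sglang.

```python
import json
import sys
from pathlib import Path
from typing import Optional


def read_architectures(model_dir: str) -> Optional[list]:
    """Return the `architectures` list from config.json, or None if it is absent.

    Hypothetical helper for diagnosis only; it does not use any sglang API.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        # No config.json sits next to the weights.
        return None
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A config without this key yields None here.
    return config.get("architectures")


if __name__ == "__main__":
    model_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    archs = read_architectures(model_dir)
    if archs:
        print("architectures:", archs)
    else:
        print("no architectures entry found in", model_dir)
```

Run it with the model directory as the only argument, for example the directory passed to `--model` above.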

I copied config.json, preprocessor_config.json, tokenizer.json, tokenizer.model, and tokenizer_config.json,

and then ran:
CUDA_VISIBLE_DEVICES="0" python3 -m sglang.launch_server \
  --model /home/AI-ModelScope/gemma-3-27b-it-qat-q4_0-gguf/gemma-3-27b-it-q4_0.gguf \
  --tp 1 \
  --load-format gguf \
  --trust-remote-code \
  --quantization gguf \
  --dtype bfloat16 \
  --max-total-tokens 20000 \
  --context-length 32768 \
  --kv-cache-dtype auto \
  --enable-p2p-check \
  --host 0.0.0.0 \
  --port 8801 \
  --mem-fraction-static 0.8 \
  --api-key yoursecret \
  --max-running-requests 1000 \
  --enable-metrics

Error:

What is the problem, and how can I fix it?

sglang deploy error