The checkpoint you are trying to load has model type `gemma3` but Transformers does not recognize this architecture
Trying to deploy google/gemma-3-4b-it on a vLLM server on Azure Kubernetes and getting the error below, using the latest vLLM image.
ValueError: The checkpoint you are trying to load has model type `gemma3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`.
Gemma 3 is not yet supported by the stable transformers release and requires a development build. Please install it from the dedicated release tag using the provided command - pip install git+https://github.com/huggingface/[email protected] - and try again. Let us know if the issue still persists. Thank you.
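To confirm which transformers version the server is actually importing, you can check inside the running container (the pod name below is just a placeholder):

kubectl exec -it <vllm-pod-name> -- python3 -c "import transformers; print(transformers.__version__)"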
@Renu11 how do I run this command? I am deploying via Kubernetes YAML; even when I tried adding it as part of an init container, I still get the same error. The model is available on a PVC.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-gemma-3-27b-it
  namespace: ncgr-gemma-3-27b-it-ns
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
      model: gemma-3-27b-it
  template:
    metadata:
      labels:
        app: vllm-server
        model: gemma-3-27b-it
    spec:
      initContainers:
        - name: install-transformers
          image: python:3.9-slim
          command:
            - "/bin/bash"
            - "-c"
            - |
              apt-get update && apt-get install -y git && pip install git+https://github.com/huggingface/transformers
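      # NOTE: packages installed here go into the init container's own
      # filesystem, which is discarded when it exits; the vllm-server
      # container below starts from its own image and never sees them.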
      containers:
        - name: vllm-server
          image: vllm/vllm-openai:latest
          ports:
            - containerPort: 8000
          volumeMounts:
            - name: model-storage
              mountPath: /mnt/models
          command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
          args:
            - "--model"
            - "/mnt/models/gemma-3-27b-it"
            - "--api-key"
            - "$(VLLM_API_KEY)"
            - "--tensor-parallel-size"
            - "1"
            - "--dtype"
            - "float16"
            - "--port"
            - "8000"
            - "--max-model-len"
            - "128000"
            - "--max-num-batched-tokens"
            - "128000"
            - "--max-num-seqs"
            - "16"
            - "--gpu-memory-utilization"
            - "0.98"
            - "--served-model-name"
            - "Gemma-3-27B-IT"
            - "--disable-log-requests"
            - "--enable-chunked-prefill"
            - "--enable-prefix-caching"
          resources:
            requests:
              cpu: "16"
              memory: "100Gi"
              nvidia.com/gpu: "1"
            limits:
              cpu: "24"
              memory: "160Gi"
              nvidia.com/gpu: "1"
          env:
            - name: NCCL_DEBUG
              value: "INFO"
            - name: VLLM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-key-secret
                  key: api-key
      volumes:
        - name: model-storage
          persistentVolumeClaim:
            claimName: gemma-3-27b-it-pvc
Hi @bajajrahul03, the latest stable transformers release is now out and is compatible with Gemma 3 models. You can install it using the command - pip install -U transformers.
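For the Kubernetes deployment above, note that a pip install in an init container does not carry over to the vllm-server container, since each container has its own filesystem. One way to make the upgrade take effect is to run it in the main container itself, just before launching the server. A minimal sketch of the container section, assuming the same vllm/vllm-openai:latest image and /mnt/models/gemma-3-27b-it model path as in your manifest (the remaining flags are trimmed for brevity):

      containers:
        - name: vllm-server
          image: vllm/vllm-openai:latest
          command: ["/bin/bash", "-c"]
          args:
            - |
              # Upgrade transformers in the container that actually runs vLLM,
              # then start the server (re-add your other flags as needed).
              pip install -U transformers &&
              exec python3 -m vllm.entrypoints.openai.api_server \
                --model /mnt/models/gemma-3-27b-it \
                --port 8000 \
                --served-model-name Gemma-3-27B-IT

The inline install is the quickest way to test the fix; for production, baking the upgrade into a custom image built on top of vllm/vllm-openai:latest avoids re-downloading the package on every pod restart.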
ok, will try and let you know