Great article. I have been trying to deploy deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
on Inferentia with a context window larger than 4096 tokens (say, MAX_TOTAL_TOKENS=8192),
but there seems to be no pre-compiled model for that configuration. It would be great if you could add instructions for compiling these models.
Keerthan Vasist
kvasist
Commented on the article "How to deploy and fine-tune DeepSeek models on AWS" 2 months ago