Qwen/QwQ-32B-AWQ

Text Generation
Safetensors
English
qwen2
chat
conversational
4-bit precision
awq
Community
7

How to make thinking take less time

#7 opened about 1 month ago by
Navanit-AI

Is anyone else having issues with the <think> tag not showing?

1
#6 opened about 1 month ago by
ChloeHuang1

GPTQ quants

#5 opened about 2 months ago by
dazipe

Has anyone deployed this AWQ version on a 3090? The speed is only 6 tokens/s; is that normal?

2
#4 opened about 2 months ago by
Jsoooooo

AWQ Quant Settings?

#3 opened 2 months ago by
radna

Performance loss of AWQ compared to the original model

#2 opened 2 months ago by
Saaiet

disable thinking for some requests

4
#1 opened 2 months ago by
devops724