How to make thinking take less time
#7 opened about 1 month ago
by
Navanit-AI

Has anyone else run into the <think> tag not showing?
1
#6 opened about 1 month ago
by
ChloeHuang1
GPTQ quants
#5 opened about 2 months ago
by
dazipe
Has anyone deployed this AWQ version on a 3090? I only get 6 tokens/s. Is that normal?
2
#4 opened about 2 months ago
by
Jsoooooo
AWQ Quant Settings?
#3 opened 2 months ago
by
radna

Performance loss of AWQ compared to the original model
#2 opened 2 months ago
by
Saaiet
Disable thinking for some requests
4
#1 opened 2 months ago
by
devops724