

ghostplant


AI & ML interests

None yet

Recent Activity

updated a dataset 2 days ago

ghostplant/data-collections

liked a dataset 16 days ago

future-technologies/Universal-Transformers-Dataset

new activity 21 days ago

unsloth/DeepSeek-R1-GGUF: Sharing an MMLU test result: I used the 2.51-bit quant and compared it with the DeepSeek API and Baidu's DeepSeek; the 2.51-bit quant seems very capable, at least on MMLU


Organizations

None yet

ghostplant's activity

New activity in unsloth/DeepSeek-R1-GGUF 21 days ago

Sharing an MMLU test result: I used the 2.51-bit quant and compared it with the DeepSeek API and Baidu's DeepSeek; the 2.51-bit quant seems very capable, at least on MMLU

#42 opened 2 months ago by

New activity in deepseek-ai/DeepSeek-R1 about 1 month ago

Does R1 support long context (> 4K)?

#172 opened 2 months ago by

New activity in nvidia/DeepSeek-R1-FP4 about 1 month ago

Can this model run on a Hopper GPU?

#8 opened 2 months ago by

Can this model run on an A800?

#10 opened about 2 months ago by

Why not use FP2 or IQ2 as kTransformers does?

#11 opened about 2 months ago by

New activity in deepseek-ai/DeepSeek-R1 2 months ago

Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)

#171 opened 2 months ago by samagra-tensorfuse

New activity in deepseek-ai/DeepSeek-R1 3 months ago

90+ tokens per second for MI300x8 using batch_size = 1

#166 opened 3 months ago by

New activity in unsloth/DeepSeek-R1-GGUF 3 months ago

Which is better, Q2_K_XL or Q4?

#34 opened 3 months ago by

New activity in deepseek-ai/DeepSeek-R1 3 months ago

So how much VRAM is needed to deploy a 671B model, and is there a baseline hardware configuration?

#118 opened 3 months ago by

New activity in deepseek-ai/DeepSeek-R1-Distill-Llama-70B 3 months ago

How much VRAM do you need?

#12 opened 3 months ago by

New activity in unsloth/DeepSeek-R1-GGUF 3 months ago

Is there a model removing non-shared MoE experts?

#17 opened 3 months ago by

New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 3 months ago

Please convert these models to GGUF format...

#12 opened 4 months ago by