Djuunaa

djuna

AI & ML interests

None yet

Recent Activity

reacted to wolfram's post with 🤗 about 13 hours ago

Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science). A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.

2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.

3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.

4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.

5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
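The "~98%" and "~5x" figures in the post can be checked against the numbers it quotes. The sketch below is only an arithmetic check, not part of the original evaluation: the dictionary keys and the `relative_accuracy` and `speedup` helpers are hypothetical names, and the throughput values are the post's approximations (the Qwen3-32B figure is the stated upper bound of 10 tok/s, so the computed speed-up is a lower bound).

```python
# Accuracy (MMLU-Pro Computer Science, %) and throughput (tok/s) as quoted in the post.
# Throughput values are approximate; "<10" is recorded as its upper bound of 10.
results = {
    "Qwen3-235B-A22B": {"accuracy": 83.66, "tok_per_s": 55},
    "30B-A3B Unsloth": {"accuracy": 82.20, "tok_per_s": 45},
    "Qwen3-32B": {"accuracy": 82.20, "tok_per_s": 10},
    "30B MLX": {"accuracy": 79.51, "tok_per_s": 64},
}


def relative_accuracy(model: str, baseline: str) -> float:
    """Accuracy of `model` as a percentage of the accuracy of `baseline`."""
    return 100.0 * results[model]["accuracy"] / results[baseline]["accuracy"]


def speedup(model: str, baseline: str) -> float:
    """Throughput of `model` divided by the throughput of `baseline`."""
    return results[model]["tok_per_s"] / results[baseline]["tok_per_s"]


if __name__ == "__main__":
    top = "Qwen3-235B-A22B"
    for name in ("30B-A3B Unsloth", "30B MLX"):
        print(f"{name}: {relative_accuracy(name, top):.2f}% of {top} accuracy")
    print(f"30B-A3B Unsloth vs Qwen3-32B: at least {speedup('30B-A3B Unsloth', 'Qwen3-32B'):.1f}x faster")
```

Under these assumptions the Unsloth quant lands a little above 98% of the top score, and the speed-up over Qwen3-32B is at least 4.5x, which is consistent with the rounded claims in the post.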

new activity about 16 hours ago

BarraHome/llama3.2-1b-mla:Question About MLA Usage?

reacted to sequelbox's post with 👀 about 17 hours ago

NEW RELEASE: Esper 3 for Qwen 3!

- A full-stack software assistant: a reasoning finetune focused on coding, architecture, and DevOps using the Titanium and Tachibana datasets!
- Improved general and creative reasoning skills, powered by the Raiden dataset.

4B model: https://huggingface.co/ValiantLabs/Qwen3-4B-Esper3

8B model: https://huggingface.co/ValiantLabs/Qwen3-8B-Esper3

We'll also be bringing Esper 3 to larger Qwen 3 models as soon as we can - if you want these, consider helping us out: https://huggingface.co/spaces/sequelbox/SupportOpenSource

More models and datasets to come soon!

with my love and enthusiasm, allegra

Organizations

djuna's activity

liked a model 9 days ago

cognitivecomputations/Dolphin-Mistral-24B-Venice-Edition

Updated 9 days ago • 702 • 20

liked a model 21 days ago

aipgpt/Txt-Polisher-Douyin-Style

Text Generation • Updated 9 days ago • 33 • 3

liked a model 24 days ago

THUDM/GLM-4-9B-0414

Text Generation • Updated 24 days ago • 19k • 64

liked 5 models about 1 month ago

Delta-Vector/Hamanasu-Magnum-QwQ-32B

Updated Mar 18 • 81 • 9

Hamzah-Asadullah/NarrowMaid-8B

Text2Text Generation • Updated 17 days ago • 27 • 6

DataoceanAI/dolphin-small

Automatic Speech Recognition • Updated Mar 27 • 122 • 26

rinna/qwen2.5-bakeneko-32b-instruct-v2

Text Generation • Updated Mar 23 • 759 • 8

infly/INFLogic-Qwen2.5-32B-RL-Preview

Text Generation • Updated 14 days ago • 15 • 3

liked 5 models about 2 months ago

starvector/starvector-8b-im2svg

Text Generation • Updated Mar 19 • 80.4k • 462

OnomaAIResearch/Illustrious-XL-v1.0

Updated 20 days ago • 17

OnomaAIResearch/Illustrious-XL-v1.1

Updated 20 days ago • 71

sm54/FuseO1-QwQ-SkyT1-Flash-32B

Text Generation • Updated Mar 10 • 20 • 3

gbueno86/QwQ-R1-Distill-Merge-32B

Text Generation • Updated Mar 9 • 13 • 3

liked 7 models 2 months ago

onekq-ai/QwQ-32B-bnb-4bit

Text Generation • Updated Mar 5 • 221 • 2

Qwen/QwQ-32B

Text Generation • Updated Mar 11 • 565k • 2.75k

Qwen/QwQ-32B-GGUF

Text Generation • Updated Mar 13 • 62.7k • 194

moonshotai/Moonlight-16B-A3B-Instruct

Text Generation • Updated Mar 3 • 1.66k • 156

nothingiisreal/open-gpt-3.5-detector

Text Classification • Updated Jun 9, 2024 • 16 • 4

Undi95/MistralThinker-v1.1

Updated Mar 5 • 319 • 44

Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small

Updated Oct 13, 2024 • 10 • 30