Yoshinari Fujinuma

akkikiki

https://akkikiki.github.io/

AI & ML interests

None yet

Recent Activity

reacted to wolfram's post with 🤗 about 5 hours ago

Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science). A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
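The "~98% of frontier-class accuracy" figure in the post can be reproduced from the quoted scores alone. The sketch below is not the post author's code: the dictionary labels and the helper names `relative_to_best` and `above_cutoff` are made up for illustration, and only the percentages and the 50% cut-off come from the post.

```python
# MMLU-Pro (Computer Science) scores in percent, as quoted in the post.
# The dictionary keys are illustrative labels, not official model IDs.
scores = {
    "Qwen3-235B-A22B (Fireworks API)": 83.66,
    "Qwen3-30B-A3B (Unsloth quant)": 82.20,
    "Qwen3-32B": 82.20,
    "Qwen3-30B-A3B (MLX)": 79.51,
    "Qwen3-0.6B": 37.56,
}

# The post says models under 50% were left off the graph.
CUTOFF = 50.0


def relative_to_best(scores):
    """Express each score as a percentage of the highest score."""
    best = max(scores.values())
    return {name: 100.0 * value / best for name, value in scores.items()}


def above_cutoff(scores, cutoff=CUTOFF):
    """Return the names of models whose score meets the cut-off."""
    return [name for name, value in scores.items() if value >= cutoff]


relative = relative_to_best(scores)
for name, pct in relative.items():
    print(f"{name}: {pct:.1f}% of the top score")

print("On the graph:", above_cutoff(scores))
```

Under these numbers the 30B-A3B quant lands at roughly 98% of the top score, which matches the post's conclusion, and the 0.6B model is the only one excluded by the cut-off.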

updated a model about 22 hours ago

Cantina/Qwen3-30B-A3B-20250506-thoughts70b-2000steps

published a model about 22 hours ago

Cantina/Qwen3-30B-A3B-20250506-thoughts70b-2000steps

akkikiki's activity

liked a model about 1 month ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated 8 days ago • 191k • 1.58k

liked a model 3 months ago

nu-dialogue/j-moshi-ext

Updated Feb 15 • 633 • 34

liked a model over 1 year ago

llm-jp/llm-jp-13b-v1.0

Text Generation • Updated Oct 20, 2023 • 663 • 39

liked a dataset almost 2 years ago

tiiuae/falcon-refinedweb

Viewer • Updated Jun 20, 2023 • 968M • 26.2k • 849

liked 3 models about 2 years ago

mosaicml/mpt-7b

Text Generation • Updated Mar 5, 2024 • 31.8k • 1.17k

google/pix2struct-base

Image-to-Text • Updated Dec 24, 2023 • 6.66k • 72

google/flan-ul2

Text2Text Generation • Updated Nov 7, 2023 • 3.44k • 556