We need to invent a new way to benchmark models. The current practice is fishy.

Salim Belhaddad
salym
AI & ML interests
Knowledge Graphs
Recent Activity
replied to
philschmid's
post
about 1 month ago
Gemini 2.5 Pro, thinking by default! We're excited to launch our best Gemini model for reasoning, multimodal, and coding yet! #1 on LMSYS, Humanity’s Last Exam, AIME, GPQA, and more!
TL;DR:
- 💻 Best Gemini coding model yet, particularly for web development (excels on LiveCodeBench).
- 🧠 Default "Thinking" with up to 64k token output
- 🌌 1 Million multimodal input context for text, image, video, audio, and pdf
- 🛠️ Function calling, structured output, Google Search & code execution.
- 🏆 #1 on LMArena & SOTA on AIME, GPQA, Humanity's Last Exam
- 💡 Knowledge cutoff of January 2025
- 🤗 Available for free as Experimental in AI Studio, the Gemini API & the Gemini app
- 🚀 Rate limits (free tier): 2 RPM, 50 requests/day
Try it ⬇️
https://aistudio.google.com/?model=gemini-2.5-pro-exp-03-25
updated
a model
about 2 months ago
salym/ppo-LunarLander-v2
Organizations
salym's activity

reacted to
mlabonne's
post with 🚀
about 2 months ago
✂️ Gemma 3 Abliterated
I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.
I experimented with different recipes and improved the abliteration technique I wrote about last year.
It's still experimental but the refusal rate is super low in my tests. Enjoy!
mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated

reacted to
mlabonne's
post with 🔥
about 2 months ago
✂️ AutoAbliteration
I made a Colab notebook to automatically abliterate models.
It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.
💻 Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing