Hugging Face
Emin Temiz
PRO
etemiz
44 followers · 12 following
https://pickabrain.ai
AI & ML interests
Alignment
Recent Activity
replied to clem's post about 23 hours ago
What are you using to evaluate models or AI systems? So far we're building lighteval & leaderboards on the hub but still feels early & a lot more to build. What would be useful to you?
replied to their post 6 days ago
Qwen 3 numbers are in! They did a good job this time: compared to 2.5 and QwQ, the numbers are a lot better. I used two GGUFs for this, one from LMStudio and one from Unsloth. Number of parameters: 235B A22B. The first one is Q4; the second one is Q8. The LLMs that did the comparison are the same, Llama 3.1 70B and Gemma 3 27B. So I took 2*2 = 4 measurements for each column and averaged them. My leaderboard is pretty unrelated to the others, it seems. It is valuable in that sense: another non-mainstream angle for model evaluation. More info: https://huggingface.co/blog/etemiz/aha-leaderboard
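The averaging described in the post (two GGUF builds, each scored by two judge LLMs, giving 2*2 = 4 measurements per column) can be sketched roughly as below. This is only an illustration under assumptions: the function name `column_average`, the build labels, and every score value are hypothetical placeholders, not the author's code or real leaderboard numbers.

```python
from itertools import product
from statistics import mean

# Labels taken from the post; how each measurement is produced is not shown here.
gguf_builds = ["Q4 GGUF", "Q8 GGUF"]
judge_llms = ["Llama 3.1 70B", "Gemma 3 27B"]


def column_average(scores):
    """Average one leaderboard column over every (build, judge) pair."""
    pairs = list(product(gguf_builds, judge_llms))  # 2 * 2 = 4 pairs
    return mean(scores[pair] for pair in pairs)


# Hypothetical placeholder scores for a single column (NOT real results).
placeholder_scores = {
    ("Q4 GGUF", "Llama 3.1 70B"): 60.0,
    ("Q4 GGUF", "Gemma 3 27B"): 50.0,
    ("Q8 GGUF", "Llama 3.1 70B"): 70.0,
    ("Q8 GGUF", "Gemma 3 27B"): 40.0,
}

print(column_average(placeholder_scores))
```

Averaging across both quantizations and both judges would presumably smooth out the quirks of any single build or judge, at the cost of hiding disagreement between them.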
Organizations
None yet
etemiz's activity
published an article 22 days ago
Article: Benchmarking Human Alignment of Grok 3
By etemiz • 22 days ago • 1

published an article about 1 month ago
Article: AHA Leaderboard
By etemiz • Mar 30 • 2

published an article about 2 months ago
Article: Building a Beneficial AI
By etemiz • Mar 16 • 5

published an article 2 months ago
Article: Ways to Align AI with Human Values
By etemiz • Feb 26

published an article 3 months ago
Article: The AHA Indicator
By etemiz • Feb 1 • 3

published an article 3 months ago
Article: DeepSeek R1 Human Alignment Tests
By etemiz • Jan 25 • 1

published an article 6 months ago
Article: Symbiotic Intelligence
By etemiz • Nov 19, 2024 • 3