Thinh Le

thinhlpg

AI & ML interests

anime stuff

Recent Activity

liked a Space about 4 hours ago
k-mktr/gpu-poor-llm-arena
upvoted a collection about 4 hours ago
Leaderboards and benchmarks ✨
liked a dataset about 4 hours ago
basicv8vc/SimpleQA

Organizations

The Waifu Research Department, AIvent Technology, Jan, Nón Lá, Menlo Research, Try Wibu Stuffs

thinhlpg's activity

reacted to merterbak's post with πŸš€πŸ”₯ about 9 hours ago
OpenAI has released BrowseComp, an open-source benchmark designed to evaluate the web-browsing capabilities of AI agents. The dataset comprises 1,266 questions that challenge AI models to navigate the web and uncover complex, obscure information. Crafted by human trainers, the questions are intentionally difficult: unsolvable by another person in under ten minutes, and beyond the reach of existing models such as ChatGPT (with and without browsing) and an early version of OpenAI's Deep Research tool.

Blog Post: https://openai.com/index/browsecomp/
Paper: https://cdn.openai.com/pdf/5e10f4ab-d6f7-442e-9508-59515c65e35d/browsecomp.pdf
Code in the simple-evals repo: https://github.com/openai/simple-evals
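BrowseComp's official scoring in the simple-evals repo grades free-form answers rather than exact strings. As a toy illustration of the benchmark's question/answer record structure only, here is a naive normalized exact-match scorer; the records below are hypothetical, not real benchmark items:

```python
# Toy scorer over hypothetical BrowseComp-style records (question/answer pairs).
# The real benchmark uses model-based grading of free-form answers; this naive
# normalized exact match is just an illustration of the data shape.

def normalize(text):
    """Lowercase, trim, and collapse whitespace before comparing."""
    return " ".join(text.lower().strip().split())

def exact_match_accuracy(records, predictions):
    """records: list of {'question', 'answer'} dicts; predictions: list of strings."""
    correct = sum(
        normalize(pred) == normalize(rec["answer"])
        for rec, pred in zip(records, predictions)
    )
    return correct / len(records)

# Hypothetical example records, not drawn from the benchmark:
records = [
    {"question": "Which city hosted the event described?", "answer": "Paris"},
    {"question": "Who authored the obscure report?", "answer": "Jane Doe"},
]
print(exact_match_accuracy(records, ["  PARIS ", "John Doe"]))  # → 0.5
```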
reacted to burtenshaw's post with πŸ§ πŸ‘ 3 days ago
Qwen 3 fine-tuning >> MoE. Updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model.

The goal is to make a low-latency, non-thinking model as a daily driver for coding, so 3 billion active parameters should be perfect.

βœ”οΈ training running
βœ”οΈ evals running
⏭️ improve dataset

The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI ). So I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow.
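The sizing intuition can be sketched with a back-of-envelope estimate (all numbers below are rough assumptions, not measured figures): even 4-bit quantized, a 30B-parameter base model occupies roughly 15 GB of weights alone, and trainable adapters plus optimizer state push a QLoRA-style run well toward a 40 GB A100's limit before activations are counted.

```python
# Back-of-envelope GPU memory estimate for QLoRA-style fine-tuning of a
# 30B-parameter MoE such as Qwen3-30B-A3B. All figures are rough assumptions.

def finetune_memory_gb(total_params_b, quant_bits, lora_params_m=100.0):
    """Estimate memory (GB) for frozen quantized weights plus LoRA adapters."""
    # Frozen base weights, quantized to `quant_bits` bits per parameter.
    weights_gb = total_params_b * 1e9 * quant_bits / 8 / 1e9
    # Trainable LoRA adapters in bf16 (2 B/param) plus Adam states (~8 B/param).
    adapters_gb = lora_params_m * 1e6 * (2 + 8) / 1e9
    return weights_gb + adapters_gb

print(finetune_memory_gb(30, 4))  # ~16 GB before activations and KV cache
```

Activations, gradients, and framework overhead come on top of this, which is why a 40 GB A100 gets tight and an 80 GB H100 is more comfortable.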

burtenshaw/Qwen3-Code-Lite#1