
Bhadresh Savani

bhadresh-savani

AI & ML interests

NLP, Deep Learning, ML

Organizations

Flax Community · ONNXConfig for all · HugGAN Community · Keras Dreambooth Event · Lambda Go Labs

bhadresh-savani's activity

upvoted 3 articles 7 days ago
- Welcoming Llama Guard 4 on Hugging Face Hub (31 upvotes)
- Tiny Agents: a MCP-powered agent in 50 lines of code (221 upvotes)
- How to Build an MCP Server with Gradio (84 upvotes)

upvoted 2 articles 17 days ago
- LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! (53 upvotes)
- The NLP Course is becoming the LLM Course! (90 upvotes)

upvoted 2 articles 2 months ago
- Hugging Face and JFrog partner to make AI Security more transparent (21 upvotes)
- Trace & Evaluate your Agent with Arize Phoenix (38 upvotes)

upvoted an article 3 months ago
- How to deploy and fine-tune DeepSeek models on AWS (52 upvotes)

reacted to lin-tan's post with 🔥 6 months ago
Can language models replace developers? #RepoCod says “Not Yet”, because GPT-4o and other LLMs have <30% accuracy/pass@1 on real-world code generation tasks.
- Leaderboard: https://lt-asset.github.io/REPOCOD/
- Dataset: lt-asset/REPOCOD
@jiang719 @shanchao @Yiran-Hu1007
Compared to #SWEBench, RepoCod tasks
- are general code generation tasks, while SWE-Bench tasks resolve pull requests from GitHub issues, and
- come with 2.6X more tests per task (313.5 compared to SWE-Bench’s 120.8).

Compared to #HumanEval, #MBPP, #CoderEval, and #ClassEval, RepoCod has 980 instances from 11 Python projects, with
- Whole function generation
- Repository-level context
- Validation with test cases, and
- Real-world complex tasks: the longest average canonical solution length (331.6 tokens) and the highest average cyclomatic complexity (9.00)

Introducing #RepoCod-Lite 🐟 for faster evaluations: 200 of the toughest tasks from RepoCod with:
- 67 repository-level, 67 file-level, and 66 self-contained tasks
- Detailed problem descriptions (967 tokens) and long canonical solutions (918 tokens)
- GPT-4o and other LLMs have < 10% accuracy/pass@1 on RepoCod-Lite tasks.
- Dataset: lt-asset/REPOCOD_Lite

#LLM4code #LLM #CodeGeneration #Security
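
For readers who want to try the benchmark themselves, below is a minimal sketch (not part of the original post) that pulls the two datasets named above with the Hugging Face `datasets` library. Only the dataset ids lt-asset/REPOCOD and lt-asset/REPOCOD_Lite come from the post; split names and field names are not stated there, so the sketch simply prints the dataset structure rather than assuming them.

```python
# Minimal sketch: load the RepoCod datasets referenced in the post and inspect
# their structure before writing an evaluation loop. Only the dataset ids come
# from the post; splits and fields are whatever the datasets actually ship.
from datasets import load_dataset

repocod = load_dataset("lt-asset/REPOCOD")            # full benchmark (980 instances per the post)
repocod_lite = load_dataset("lt-asset/REPOCOD_Lite")  # 200 hardest tasks per the post

# Printing a DatasetDict shows its splits, features, and row counts.
print(repocod)
print(repocod_lite)
```

Printing the DatasetDicts is enough to confirm the splits, available fields, and the 980 / 200 instance counts cited in the post before wiring up a pass@1 evaluation.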