xhluca's Collections

AgentRewardBench

updated 24 days ago

  • AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

    Paper • 2504.08942 • Published 27 days ago • 27

  • McGill-NLP/agent-reward-bench

    Viewer • Updated 18 days ago • 1.41k • 3.55k • 2

  • 💻 Agent Reward Bench Demo

    Running • 4 • Visualize agent interactions with WebArena tasks


  • 🥇 Agent Reward Bench Leaderboard

    Running • Leaderboard for AgentRewardBench
