Commit History
Add agentharm and swe-bench tasks
1289818
xeon27
committed on
Add results for GAIA and GDM tasks
2718fde
xeon27
committed on
Update about page
8596ab1
Add model name links and change single-turn to base
9c55d6d
xeon27
committed on
Remove filtering
e344502
Change nomenclature to single-turn
eb538cb
xeon27
committed on
Add new tasks
6eaffc5
xeon27
committed on
Add task link in description
ba14348
xeon27
committed on
[WIP] Add task link in description
6410971
xeon27
committed on
[WIP] Add task link in description
159e996
xeon27
committed on
[WIP] Add task link in description
fcd47ae
xeon27
committed on
Make task names clickable and link to inspect-evals repo
15e5347
xeon27
committed on
Make values clickable
bbde2b0
xeon27
committed on
Add title and required text
ba2f546
xeon27
committed on
Add GAIA and GDM-InterCode-CTF tasks
0dddab1
xeon27
committed on
Add base eval tasks
006ba57
xeon27
committed on