Yuanxh committed on
Commit
643030c
·
1 Parent(s): bde3703

update_README

Files changed (1)
  1. constants.py +1 -1
constants.py CHANGED
@@ -34,7 +34,7 @@ LEADERBOARD_INTRODUCTION = """# 🏆 S-Eval Leaderboard
  ## 🔔 Updates
  📣 [2025/03/30]: 🎉 Our paper has been accepted by ISSTA 2025. To meet evaluation needs under different budgets, we partition the benchmark into four scales: [Small](https://github.com/IS2Lab/S-Eval/tree/main/s_eval/small) (1,000 Base and 10,000 Attack in each language), [Medium](https://github.com/IS2Lab/S-Eval/tree/main/s_eval/medium) (3,000 Base and 30,000 Attack in each language), [Large](https://github.com/IS2Lab/S-Eval/tree/main/s_eval/large) (5,000 Base and 50,000 Attack in each language) and [Full](https://github.com/IS2Lab/S-Eval/tree/main/s_eval/full) (10,000 Base and 100,000 Attack in each language), comprehensively considering the balance and harmfulness of data.
 
- 📣 [2024/10/25]: We release all 20,000 base risk prompts and 200,000 corresponding attack prompts ([Version-0.1.2](https://github.com/IS2Lab/S-Eval)). We also update [🏆 LeaderBoard v0.1.2](https://huggingface.co/spaces/IS2Lab/S-Eval_v0.1.2) with new evaluation results including GPT-4 and other models.
+ 📣 [2024/10/25]: We release all 20,000 base risk prompts and 200,000 corresponding attack prompts ([Version-0.1.2](https://github.com/IS2Lab/S-Eval)). We also update [🏆 LeaderBoard](https://huggingface.co/spaces/IS2Lab/S-Eval) with new evaluation results including GPT-4 and other models.
  🎉 S-Eval has achieved about **7,000** total views and about **2,000** total downloads across multiple platforms. 🎉
 
  📣 [2024/06/17]: We further release 10,000 base risk prompts and 100,000 corresponding attack prompts ([Version-0.1.1](https://github.com/IS2Lab/S-Eval)). If you require automatic safety evaluations, please feel free to submit a request via [Issues](https://huggingface.co/spaces/IS2Lab/S-Eval/discussions) or contact us by [Email](mailto:[email protected]).