ChineseSafe-Benchmark / changelog.md

CHANGELOG

2024-7-16

version: v1.0.0

changed:
- [1]feat: upload the first version

2024-10-26

version: v1.0.1

changed:
- [1]feat: add citation

2024-11-18

version: v1.0.2

changed:
- [1]feat: add three models: Qwen2.5-72B, Qwen2.5-32B, Qwen2-72B
- [2]feat: add subclass: Discrimination

2024-11-24

version: v1.0.3

changed:
- [1]feat: add three Qwen instruct models
- [2]feat: remove Qwen base models
- [3]feat: update some model names

2024-12-28

version: v1.0.4

changed:
- [1]feat: update 9 models from the December to-do list:
    - QwQ-32B-Preview
    - Llama-3.1-70B-Instruct
    - Llama-3.3-70B-Instruct
    - Mistral-Nemo-Instruct-2407
    - Ministral-8B-Instruct-2410
    - Phi-3-small-8k-instruct
    - Phi-3-small-128k-instruct
    - Phi-3-medium-4k-instruct
    - Phi-3-medium-128k-instruct

2025-4-13

version: v1.0.5

changed:
- [1]feat: update 4 models from the February to-do list:
    - phi-4
    - DeepSeek-R1-Distill-Llama-70B
    - Mistral-Small-24B-Instruct-2501
    - Moonlight-16B-A3B-Instruct
- [2]feat: release a test set of 20,000 samples