apsys committed on
Commit b55d067 · Parent: 29a8d4f

Config update

Files changed (2)
  1. README.md +11 -79
  2. app.py +1 -1
README.md CHANGED
@@ -1,82 +1,14 @@
  # GuardBench Leaderboard
 
  A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.
-
- ## Features
-
- - Display model performance across multiple safety categories
- - Accept JSONL submissions with evaluation results
- - Store submissions in a HuggingFace dataset
- - Secure submission process with token authentication
- - Automatic data refresh from HuggingFace
-
- ## Setup
-
- 1. Clone this repository
- 2. Install dependencies:
-    ```
-    pip install -r requirements.txt
-    ```
- 3. Create a `.env` file based on the `.env.template`:
-    ```
-    cp .env.template .env
-    ```
- 4. Edit the `.env` file with your HuggingFace credentials and settings
- 5. Run the application:
-    ```
-    python app.py
-    ```
-
- ## Submission Format
-
- Submissions should be in JSONL format, with each line containing a JSON object with the following structure:
-
- ```json
- {
-   "model_name": "model-name",
-   "per_category_metrics": {
-     "Category Name": {
-       "default_prompts": {
-         "f1_binary": 0.95,
-         "recall_binary": 0.93,
-         "precision_binary": 1.0,
-         "error_ratio": 0.0,
-         "avg_runtime_ms": 3000
-       },
-       "jailbreaked_prompts": { ... },
-       "default_answers": { ... },
-       "jailbreaked_answers": { ... }
-     },
-     ...
-   },
-   "avg_metrics": {
-     "default_prompts": {
-       "f1_binary": 0.97,
-       "recall_binary": 0.95,
-       "precision_binary": 1.0,
-       "error_ratio": 0.0,
-       "avg_runtime_ms": 3000
-     },
-     "jailbreaked_prompts": { ... },
-     "default_answers": { ... },
-     "jailbreaked_answers": { ... }
-   }
- }
- ```
-
- ## Environment Variables
-
- - `HF_TOKEN`: Your HuggingFace write token
- - `OWNER`: Your HuggingFace username or organization
- - `RESULTS_DATASET_ID`: The ID of the dataset to store results (e.g., "username/guardbench-results")
- - `SUBMITTER_TOKEN`: A secret token required for submissions
- - `ADMIN_USERNAME`: Username for admin access to the leaderboard
- - `ADMIN_PASSWORD`: Password for admin access to the leaderboard
-
- ## Deployment
-
- This application can be deployed as a HuggingFace Space for public access. Follow the HuggingFace Spaces documentation for deployment instructions.
-
- ## License
-
- MIT
 
+ ---
+ title: "Guard Bench"
+ emoji: "🧷"
+ colorFrom: "gray"
+ colorTo: "indigo"
+ sdk: "gradio"
+ sdk_version: "4.44.1"
+ app_file: app.py
+ pinned: false
+ ---
+
  # GuardBench Leaderboard
 
  A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.
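The JSONL schema documented in the removed Submission Format section is straightforward to produce and check programmatically. A minimal sketch, reusing the field names from that section; `make_submission_line` is a hypothetical helper (not part of this repository), the metric values are placeholders, and the per-category block is left empty for brevity:

```python
import json

def make_submission_line(model_name: str, avg_metrics: dict) -> str:
    """Serialize one submission record as a single JSONL line."""
    record = {
        "model_name": model_name,
        "per_category_metrics": {},  # per-category blocks omitted in this sketch
        "avg_metrics": avg_metrics,
    }
    # json.dumps without indent emits no newlines, so each record
    # stays on one line, as the JSONL format requires.
    return json.dumps(record)

line = make_submission_line(
    "example-model",
    {
        "default_prompts": {
            "f1_binary": 0.97,
            "recall_binary": 0.95,
            "precision_binary": 1.0,
            "error_ratio": 0.0,
            "avg_runtime_ms": 3000,
        }
    },
)
parsed = json.loads(line)  # each JSONL line must parse on its own
```

A consumer can validate a whole submission file by applying `json.loads` to each line and checking the expected keys before accepting it.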
app.py CHANGED
@@ -796,6 +796,6 @@ scheduler.start()
  # Launch the app
  if __name__ == "__main__":
 
- demo.launch(server_name="0.0.0.0", server_port=7860, share=True)
+ demo.launch(server_name="0.0.0.0", server_port=7860)
 
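The app.py change drops `share=True` from `demo.launch`: when the app runs as a HuggingFace Space it is already served publicly on port 7860, so a Gradio share tunnel adds nothing. A sketch of how the launch arguments could be made conditional instead, keeping share links for local runs only; the `SPACE_ID` environment check is an assumption about how a Space can be detected, not part of this commit:

```python
import os

# Assumption: HuggingFace Spaces set SPACE_ID in the environment, so its
# absence suggests a local run where a share tunnel may still be useful.
running_in_space = os.environ.get("SPACE_ID") is not None

launch_kwargs = {"server_name": "0.0.0.0", "server_port": 7860}
if not running_in_space:
    launch_kwargs["share"] = True  # only open a tunnel when running locally

# demo.launch(**launch_kwargs)  # mirrors the call in app.py
```

Unconditionally removing the flag, as this commit does, is the simpler choice when the app is only ever deployed as a Space.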