dcarpintero commited on
Commit
eb220d9
·
verified ·
1 Parent(s): 840c317

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +22 -9
README.md CHANGED
@@ -3,7 +3,9 @@ library_name: transformers
3
  license: apache-2.0
4
  base_model: answerdotai/ModernBERT-base
5
  tags:
6
- - generated_from_trainer
 
 
7
  metrics:
8
  - f1
9
  - accuracy
@@ -16,15 +18,15 @@ model-index:
16
 
17
  LLM applications face critical security challenges in form of prompt injections and jailbreaks. This can result in models leaking sensitive data or deviating from their intended behavior. Existing safeguard models are not fully open and have limited context windows (e.g., only 512 tokens in LlamaGuard).
18
 
19
- PangolinGuard is a ModernBERT (Base), lightweight model that discriminates malicious prompts (i.e. prompt injection attacks).
20
 
21
  🤗 [Tech-Blog](https://huggingface.co/blog/dcarpintero/pangolin-fine-tuning-modern-bert) | [GitHub Repo](https://github.com/dcarpintero/pangolin-guard)
22
 
23
- ## Intended uses
24
 
25
- - Adding a self-hosted, inexpensive safety checks (against prompt injection attacks) to AI agents and conversational interfaces.
26
 
27
- ## Evaluation data
28
 
29
  Evaluated on unseen data from a subset of specialized benchmarks targeting prompt safety and malicious input detection, while testing over-defense behavior:
30
 
@@ -35,9 +37,20 @@ Evaluated on unseen data from a subset of specialized benchmarks targeting promp
35
 
36
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a13b68b14ab77f9e3eb061/ygIo-Yo3NN7mDhZlLFvZb.png)
37
 
38
- ## Training procedure
39
 
40
- ### Training hyperparameters
 
 
 
 
 
 
 
 
 
 
 
 
41
 
42
  The following hyperparameters were used during training:
43
  - learning_rate: 5e-05
@@ -48,7 +61,7 @@ The following hyperparameters were used during training:
48
  - lr_scheduler_type: linear
49
  - num_epochs: 2
50
 
51
- ### Training results
52
 
53
  | Training Loss | Epoch | Step | Validation Loss | F1 | Accuracy |
54
  |:-------------:|:------:|:----:|:---------------:|:------:|:--------:|
@@ -78,4 +91,4 @@ The following hyperparameters were used during training:
78
  - Transformers 4.50.0
79
  - Pytorch 2.6.0+cu124
80
  - Datasets 3.4.1
81
- - Tokenizers 0.21.1
 
3
  license: apache-2.0
4
  base_model: answerdotai/ModernBERT-base
5
  tags:
6
+ - ai-safety
7
+ - safeguards
8
+ - guardrails
9
  metrics:
10
  - f1
11
  - accuracy
 
18
 
19
  LLM applications face critical security challenges in form of prompt injections and jailbreaks. This can result in models leaking sensitive data or deviating from their intended behavior. Existing safeguard models are not fully open and have limited context windows (e.g., only 512 tokens in LlamaGuard).
20
 
21
+ **Pangolin Guard** is a ModernBERT (Base), lightweight model that discriminates malicious prompts (i.e. prompt injection attacks).
22
 
23
  🤗 [Tech-Blog](https://huggingface.co/blog/dcarpintero/pangolin-fine-tuning-modern-bert) | [GitHub Repo](https://github.com/dcarpintero/pangolin-guard)
24
 
25
+ ## Intended Use Cases
26
 
27
+ - Adding a self-hosted, inexpensive defense mechanism against prompt injection attacks to AI agents and conversational interfaces.
28
 
29
+ ## Evaluation Data
30
 
31
  Evaluated on unseen data from a subset of specialized benchmarks targeting prompt safety and malicious input detection, while testing over-defense behavior:
32
 
 
37
 
38
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a13b68b14ab77f9e3eb061/ygIo-Yo3NN7mDhZlLFvZb.png)
39
 
 
40
 
41
+ ## Inference
42
+
43
+ ```python
44
+ from transformers import pipeline
45
+
46
+ classifier = pipeline("text-classification", "dcarpintero/pangolin-guard-base")
47
+ text = "your input text"
48
+ output = classifier(text)
49
+ ```
50
+
51
+ ## Training Procedure
52
+
53
+ ### Training Hyperparameters
54
 
55
  The following hyperparameters were used during training:
56
  - learning_rate: 5e-05
 
61
  - lr_scheduler_type: linear
62
  - num_epochs: 2
63
 
64
+ ### Training Results
65
 
66
  | Training Loss | Epoch | Step | Validation Loss | F1 | Accuracy |
67
  |:-------------:|:------:|:----:|:---------------:|:------:|:--------:|
 
91
  - Transformers 4.50.0
92
  - Pytorch 2.6.0+cu124
93
  - Datasets 3.4.1
94
+ - Tokenizers 0.21.1