tomaarsen (HF Staff) committed
Commit ae76f33 · verified · 1 Parent(s): 6d3d4b5

Add new CrossEncoder model

Files changed (6):
  1. README.md +738 -0
  2. config.json +56 -0
  3. model.safetensors +3 -0
  4. special_tokens_map.json +37 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +945 -0
README.md ADDED
@@ -0,0 +1,738 @@
---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- generated_from_trainer
- dataset_size:399282
- loss:LambdaLoss
base_model: answerdotai/ModernBERT-base
datasets:
- sentence-transformers/msmarco
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on answerdotai/ModernBERT-base
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoMSMARCO R100
      type: NanoMSMARCO_R100
    metrics:
    - type: map
      value: 0.6768
      name: Map
    - type: mrr@10
      value: 0.669
      name: Mrr@10
    - type: ndcg@10
      value: 0.7251
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNFCorpus R100
      type: NanoNFCorpus_R100
    metrics:
    - type: map
      value: 0.3576
      name: Map
    - type: mrr@10
      value: 0.5819
      name: Mrr@10
    - type: ndcg@10
      value: 0.4143
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.7134
      name: Map
    - type: mrr@10
      value: 0.7402
      name: Mrr@10
    - type: ndcg@10
      value: 0.7594
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.5826
      name: Map
    - type: mrr@10
      value: 0.6637
      name: Mrr@10
    - type: ndcg@10
      value: 0.6329
      name: Ndcg@10
---

# CrossEncoder based on answerdotai/ModernBERT-base

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
- **Maximum Sequence Length:** 8192 tokens
- **Number of Output Labels:** 1 label
- **Training Dataset:**
    - [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-msmarco-ModernBERT-base-lambdaloss")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
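
The `rank` output is a list of `{'corpus_id': ..., 'score': ...}` dicts sorted by decreasing score, so mapping it back to the original passages is a one-liner. A minimal sketch of that post-processing step, using made-up placeholder scores rather than real model output:

```python
# Reorder passages from (corpus_id, score) reranker output.
# The scores below are illustrative placeholders, not real model output.
passages = [
    "There are on average between 55 and 80 calories in an egg.",
    "Egg whites are very low in calories.",
    "Most of the calories come from the yolk.",
]
ranks = [
    {"corpus_id": 0, "score": 3.2},
    {"corpus_id": 2, "score": 1.7},
    {"corpus_id": 1, "score": -0.4},
]

# Sort by score, highest first (model.rank already returns results in this order)
ranked = sorted(ranks, key=lambda r: r["score"], reverse=True)
ordered_passages = [passages[r["corpus_id"]] for r in ranked]
print(ordered_passages)
```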

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Reranking

* Datasets: `NanoMSMARCO_R100`, `NanoNFCorpus_R100` and `NanoNQ_R100`
* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
  ```json
  {
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | NanoMSMARCO_R100     | NanoNFCorpus_R100    | NanoNQ_R100          |
|:------------|:---------------------|:---------------------|:---------------------|
| map         | 0.6768 (+0.1872)     | 0.3576 (+0.0966)     | 0.7134 (+0.2938)     |
| mrr@10      | 0.6690 (+0.1915)     | 0.5819 (+0.0820)     | 0.7402 (+0.3135)     |
| **ndcg@10** | **0.7251 (+0.1847)** | **0.4143 (+0.0892)** | **0.7594 (+0.2587)** |

#### Cross Encoder Nano BEIR

* Dataset: `NanoBEIR_R100_mean`
* Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ],
      "rerank_k": 100,
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | Value                |
|:------------|:---------------------|
| map         | 0.5826 (+0.1925)     |
| mrr@10      | 0.6637 (+0.1957)     |
| **ndcg@10** | **0.6329 (+0.1776)** |

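For reference, ndcg@10 and mrr@10 over a single ranked list can be computed directly from the graded relevances of the returned documents. A minimal sketch on a toy ranking (not tied to the evaluation results above):

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevances (position 0 = top)."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def mrr_at_k(relevances, k=10):
    """Reciprocal rank of the first relevant document within the top k."""
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            return 1.0 / (i + 1)
    return 0.0

# Toy ranking: relevant, irrelevant, relevant
print(round(ndcg_at_k([1, 0, 1]), 4))  # 0.9197
print(mrr_at_k([1, 0, 1]))             # 1.0
```

The evaluator reports these metrics per dataset, and the Nano BEIR score is their mean over the three datasets.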
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### msmarco

* Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [a0537b6](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/a0537b6c8669051b215b020183c276a1eb2027d5)
* Size: 399,282 training samples
* Columns: <code>query_id</code>, <code>doc_ids</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query_id | doc_ids | labels |
  |:--------|:---------|:--------|:-------|
  | type    | string   | list    | list   |
  | details | <ul><li>min: 6 characters</li><li>mean: 33.0 characters</li><li>max: 154 characters</li></ul> | <ul><li>min: 6 elements</li><li>mean: 13.23 elements</li><li>max: 20 elements</li></ul> | <ul><li>min: 6 elements</li><li>mean: 13.23 elements</li><li>max: 20 elements</li></ul> |
* Samples:
  | query_id | doc_ids | labels |
  |:---------|:--------|:-------|
  | <code>intel current gen core processors</code> | <code>["Identical or more capable versions of Core processors are also sold as Xeon processors for the server and workstation markets. As of 2017 the current lineup of Core processors included the Intel Core i7, Intel Core i5, and Intel Core i3, along with the Y - Series Intel Core CPU's.", "Most noticeably that Panasonic switched from Intel Core 2 Duo power to the latest Intel Core i3 and i5 processors. The three processors available in the new Toughbook 31, together with the new Mobile Intel QM57 Express chipset, are all part of Intel's Calpella platform.", 'The new 7th Gen Intel Core i7-7700HQ processor gives the 14-inch Razer Blade 2.8GHz of quad-core processing power and Turbo Boost speeds, which automatically increases the speed of active cores â\x80\x93 up to 3.8GHz.', 'Key difference: Intel Core i3 is a type of dual-core processor. i5 processors have 2 to 4 cores. A dual-core processor is a type of a central processing unit (CPU) that has two complete execution cores. Hence, it has t...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>renovation definition</code> | <code>['Renovation is the act of renewing or restoring something. If your kitchen is undergoing a renovation, thereâ\x80\x99s probably plaster and paint all over the place and you should probably get take-out.', 'NEW GALLERY SPACES OPENING IN 2017. In early 2017, our fourth floor will be transformed into a new destination for historical education and innovation. During the current renovation, objects from our permanent collection are on view throughout the Museum.', 'A same level house extension in Australia will cost approximately $60,000 to $200,000+. Adding a room or extending your living area on the ground floor are affordable ways of creating more space.Here are some key points to consider that will help you keep your renovation costs in check.RTICLE Stephanie Matheson. A same level house extension in Australia will cost approximately $60,000 to $200,000+. Adding a room or extending your living area on the ground floor are affordable ways of creating more space. Here are some key points...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what is a girasol</code> | <code>['Girasol definition, an opal that reflects light in a bright luminous glow. See more.', 'Also, a type of opal from Mexico, referred to as Mexican water opal, is a colorless opal which exhibits either a bluish or golden internal sheen. Girasol opal is a term sometimes mistakenly and improperly used to refer to fire opals, as well as a type of transparent to semitransparent type milky quartz from Madagascar which displays an asterism, or star effect, when cut properly.', 'What is the meaning of Girasol? How popular is the baby name Girasol? Learn the origin and popularity plus how to pronounce Girasol', 'There are 5 basic types of opal. These types are Peruvian Opal, Fire Opal, Girasol Opal, Common opal and Precious Opal. There are 5 basic types of opal. These types are Peruvian Opal, Fire Opal, Girasol Opal, Common opal and Precious Opal.', 'girasol (Ë\x88dÊ\x92ɪrÉ\x99Ë\x8csÉ\x92l; -Ë\x8csÉ\x99Ê\x8al) , girosol or girasole n (Jewellery) a type of opal that has a red or pink glow in br...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>LambdaLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#lambdaloss) with these parameters:
  ```json
  {
      "weighting_scheme": "sentence_transformers.cross_encoder.losses.LambdaLoss.NDCGLoss2PPScheme",
      "k": null,
      "sigma": 1.0,
      "eps": 1e-10,
      "reduction_log": "binary",
      "activation_fct": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 8
  }
  ```
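
The NDCGLoss2++ weighting scheme used here weights each misordered document pair by how much swapping that pair would change NDCG, so mistakes near the top of the ranking cost more. A simplified LambdaRank-style sketch of that core idea (not the exact NDCGLoss2PPScheme implementation; the helper names are made up):

```python
import math

def dcg(rels):
    # Discounted cumulative gain of a ranked relevance list (position 0 = top)
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels))

def delta_ndcg(rels, i, j):
    # |NDCG change| caused by swapping the documents at ranks i and j
    ideal = dcg(sorted(rels, reverse=True))
    swapped = rels[:]
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(dcg(swapped) - dcg(rels)) / ideal

def lambda_pairwise_loss(scores, rels, sigma=1.0):
    # Pairwise logistic loss over misordered pairs, each weighted by |delta NDCG|
    loss = 0.0
    for i in range(len(rels)):
        for j in range(len(rels)):
            if rels[i] > rels[j]:
                weight = delta_ndcg(rels, i, j)
                loss += weight * math.log2(1 + math.exp(-sigma * (scores[i] - scores[j])))
    return loss

# Scoring the relevant document higher yields a lower loss
print(lambda_pairwise_loss([2.0, 0.0], [1, 0]) < lambda_pairwise_loss([0.0, 2.0], [1, 0]))
```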

### Evaluation Dataset

#### msmarco

* Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [a0537b6](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/a0537b6c8669051b215b020183c276a1eb2027d5)
* Size: 1,000 evaluation samples
* Columns: <code>query_id</code>, <code>doc_ids</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query_id | doc_ids | labels |
  |:--------|:---------|:--------|:-------|
  | type    | string   | list    | list   |
  | details | <ul><li>min: 10 characters</li><li>mean: 33.63 characters</li><li>max: 137 characters</li></ul> | <ul><li>min: 3 elements</li><li>mean: 12.50 elements</li><li>max: 20 elements</li></ul> | <ul><li>min: 3 elements</li><li>mean: 12.50 elements</li><li>max: 20 elements</li></ul> |
* Samples:
  | query_id | doc_ids | labels |
  |:---------|:--------|:-------|
  | <code>can marijuana help dementia</code> | <code>["Cannabis 'could stop dementia in its tracks'. Cannabis may help keep Alzheimer's disease at bay. In experiments, a marijuana-based medicine triggered the formation of new brain cells and cut inflammation linked to dementia. The researchers say that using the information to create a pill suitable for people could help prevent or delay the onset of Alzheimer's.", 'Marijuana (cannabis): Marijuana in any form is not allowed on aircraft and is not allowed in the secure part of the airport (beyond the TSA screening areas). In addition it is illegal to import marijuana or marijuana-related items into the US.', 'Depakote and dementia - Can dementia be cured? Unfortunately, no. Dementia is a progressive disease. Even available treatments only slow progression or tame symptoms.', 'Marijuana Prices. The price of marijuana listed below is the typical price to buy marijuana on the black market in U.S. dollars. How much marijuana cost and the sale price of marijuana are based upon the United Natio...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what are carcinogen</code> | <code>['Written By: Carcinogen, any of a number of agents that can cause cancer in humans. They can be divided into three major categories: chemical carcinogens (including those from biological sources), physical carcinogens, and oncogenic (cancer-causing) viruses. 1 Most carcinogens, singly or in combination, produce cancer by interacting with DNA in cells and thereby interfering with normal cellular function.', 'Tarragon (Artemisia dracunculus) is a species of perennial herb in the sunflower family. It is widespread in the wild across much of Eurasia and North America, and is cultivated for culinary and medicinal purposes in many lands.One sub-species, Artemisia dracunculus var. sativa, is cultivated for use of the leaves as an aromatic culinary herb.arragon has an aromatic property reminiscent of anise, due to the presence of estragole, a known carcinogen and teratogen in mice. The European Union investigation revealed that the danger of estragole is minimal even at 100â\x80\x931,000 tim...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>who played ben geller in friends</code> | <code>["Noelle and Cali aren't the only twins to have played one child character in Friends. Double vision: Ross' cheeky son Ben (pictured), from his first marriage to Carol, was also played by twins, Dylan and Cole Sprouse, who are now 22.", 'Update 7/29/06: There are now three â\x80\x9cTeaching Pastorsâ\x80\x9d at Applegate Christian Fellowship, according to their web site. Jon Courson is now back at Applegate. The other two listed as Teaching Pastors are Jonâ\x80\x99s two sons: Peter John and Ben Courson.on Courson has been appreciated over the years by many people who are my friends and whom I respect. I believe that he preaches the real Jesus and the true Gospel, for which I rejoice. I also believe that his ministry and church organization is a reasonable example with which to examine important issues together.', 'Ben 10 (Reboot) Ben 10: Omniverse is the fourth iteration of the Ben 10 franchise, and it is the sequel of Ben 10: Ultimate Alien. Ben was all set to be a solo hero with his n...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>LambdaLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#lambdaloss) with these parameters:
  ```json
  {
      "weighting_scheme": "sentence_transformers.cross_encoder.losses.LambdaLoss.NDCGLoss2PPScheme",
      "k": null,
      "sigma": 1.0,
      "eps": 1e-10,
      "reduction_log": "binary",
      "activation_fct": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 8
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- eval_strategy: steps
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- load_best_model_at_end: True

#### All Hyperparameters
<details><summary>Click to expand</summary>

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters: 
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch  | Step  | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
|:-------|:------|:--------------|:----------------|:-------------------------|:--------------------------|:--------------------|:---------------------------|
| -1     | -1    | -             | -               | 0.0234 (-0.5170)         | 0.3412 (+0.0161)          | 0.0321 (-0.4686)    | 0.1322 (-0.3231)           |
| 0.0000 | 1     | 0.8349        | -               | -                        | -                         | -                   | -                          |
| 0.0040 | 200   | 0.8417        | -               | -                        | -                         | -                   | -                          |
| 0.0080 | 400   | 0.8371        | -               | -                        | -                         | -                   | -                          |
| 0.0120 | 600   | 0.8288        | -               | -                        | -                         | -                   | -                          |
| 0.0160 | 800   | 0.8076        | -               | -                        | -                         | -                   | -                          |
| 0.0200 | 1000  | 0.7802        | 0.7316          | 0.2004 (-0.3400)         | 0.3110 (-0.0140)          | 0.2594 (-0.2413)    | 0.2569 (-0.1984)           |
| 0.0240 | 1200  | 0.6988        | -               | -                        | -                         | -                   | -                          |
| 0.0280 | 1400  | 0.4688        | -               | -                        | -                         | -                   | -                          |
| 0.0321 | 1600  | 0.3742        | -               | -                        | -                         | -                   | -                          |
| 0.0361 | 1800  | 0.3441        | -               | -                        | -                         | -                   | -                          |
| 0.0401 | 2000  | 0.3058        | 0.1975          | 0.6091 (+0.0687)         | 0.3978 (+0.0727)          | 0.6645 (+0.1639)    | 0.5571 (+0.1018)           |
| 0.0441 | 2200  | 0.2812        | -               | -                        | -                         | -                   | -                          |
| 0.0481 | 2400  | 0.2748        | -               | -                        | -                         | -                   | -                          |
| 0.0521 | 2600  | 0.2518        | -               | -                        | -                         | -                   | -                          |
| 0.0561 | 2800  | 0.2591        | -               | -                        | -                         | -                   | -                          |
| 0.0601 | 3000  | 0.2508        | 0.1673          | 0.7137 (+0.1733)         | 0.3980 (+0.0730)          | 0.7471 (+0.2464)    | 0.6196 (+0.1642)           |
| 0.0641 | 3200  | 0.2446        | -               | -                        | -                         | -                   | -                          |
| 0.0681 | 3400  | 0.2385        | -               | -                        | -                         | -                   | -                          |
| 0.0721 | 3600  | 0.2381        | -               | -                        | -                         | -                   | -                          |
| 0.0761 | 3800  | 0.2204        | -               | -                        | -                         | -                   | -                          |
| 0.0801 | 4000  | 0.221         | 0.1757          | 0.6321 (+0.0916)         | 0.3937 (+0.0687)          | 0.7029 (+0.2023)    | 0.5762 (+0.1209)           |
| 0.0841 | 4200  | 0.2131        | -               | -                        | -                         | -                   | -                          |
| 0.0882 | 4400  | 0.2222        | -               | -                        | -                         | -                   | -                          |
| 0.0922 | 4600  | 0.2307        | -               | -                        | -                         | -                   | -                          |
| 0.0962 | 4800  | 0.2104        | -               | -                        | -                         | -                   | -                          |
| 0.1002 | 5000  | 0.2151        | 0.1697          | 0.6388 (+0.0984)         | 0.3846 (+0.0595)          | 0.6659 (+0.1653)    | 0.5631 (+0.1077)           |
| 0.1042 | 5200  | 0.208         | -               | -                        | -                         | -                   | -                          |
| 0.1082 | 5400  | 0.2147        | -               | -                        | -                         | -                   | -                          |
| 0.1122 | 5600  | 0.2114        | -               | -                        | -                         | -                   | -                          |
| 0.1162 | 5800  | 0.2224        | -               | -                        | -                         | -                   | -                          |
| 0.1202 | 6000  | 0.2094        | 0.1583          | 0.6165 (+0.0761)         | 0.3969 (+0.0718)          | 0.6968 (+0.1961)    | 0.5700 (+0.1147)           |
| 0.1242 | 6200  | 0.2065        | -               | -                        | -                         | -                   | -                          |
| 0.1282 | 6400  | 0.2191        | -               | -                        | -                         | -                   | -                          |
| 0.1322 | 6600  | 0.2108        | -               | -                        | -                         | -                   | -                          |
| 0.1362 | 6800  | 0.2067        | -               | -                        | -                         | -                   | -                          |
| 0.1402 | 7000  | 0.2055        | 0.1554          | 0.6295 (+0.0891)         | 0.3968 (+0.0718)          | 0.6862 (+0.1855)    | 0.5708 (+0.1155)           |
| 0.1443 | 7200  | 0.1994        | -               | -                        | -                         | -                   | -                          |
| 0.1483 | 7400  | 0.2067        | -               | -                        | -                         | -                   | -                          |
| 0.1523 | 7600  | 0.1933        | -               | -                        | -                         | -                   | -                          |
| 0.1563 | 7800  | 0.1903        | -               | -                        | -                         | -                   | -                          |
| 0.1603 | 8000  | 0.1837        | 0.1569          | 0.6236 (+0.0831)         | 0.4196 (+0.0946)          | 0.6927 (+0.1920)    | 0.5786 (+0.1232)           |
| 0.1643 | 8200  | 0.1968        | -               | -                        | -                         | -                   | -                          |
| 0.1683 | 8400  | 0.2037        | -               | -                        | -                         | -                   | -                          |
| 0.1723 | 8600  | 0.2052        | -               | -                        | -                         | -                   | -                          |
| 0.1763 | 8800  | 0.2007        | -               | -                        | -                         | -                   | -                          |
| 0.1803 | 9000  | 0.1771        | 0.1642          | 0.6579 (+0.1175)         | 0.3949 (+0.0699)          | 0.6931 (+0.1924)    | 0.5820 (+0.1266)           |
| 0.1843 | 9200  | 0.1828        | -               | -                        | -                         | -                   | -                          |
| 0.1883 | 9400  | 0.195         | -               | -                        | -                         | -                   | -                          |
| 0.1923 | 9600  | 0.1992        | -               | -                        | -                         | -                   | -                          |
| 0.1963 | 9800  | 0.1859        | -               | -                        | -                         | -                   | -                          |
| 0.2004 | 10000 | 0.1934        | 0.1514          | 0.6756 (+0.1351)         | 0.4280 (+0.1029)          | 0.7235 (+0.2228)    | 0.6090 (+0.1536)           |
| 0.2044 | 10200 | 0.1828        | -               | -                        | -                         | -                   | -                          |
| 0.2084 | 10400 | 0.1749        | -               | -                        | -                         | -                   | -                          |
| 0.2124 | 10600 | 0.1908        | -               | -                        | -                         | -                   | -                          |
| 0.2164 | 10800 | 0.1837        | -               | -                        | -                         | -                   | -                          |
| 0.2204 | 11000 | 0.1726        | 0.1469          | 0.6427 (+0.1023)         | 0.4170 (+0.0920)          | 0.7408 (+0.2402)    | 0.6002 (+0.1448)           |
| 0.2244 | 11200 | 0.1922        | -               | -                        | -                         | -                   | -                          |
| 0.2284 | 11400 | 0.1853        | -               | -                        | -                         | -                   | -                          |
| 0.2324 | 11600 | 0.1856        | -               | -                        | -                         | -                   | -                          |
| 0.2364 | 11800 | 0.1797        | -               | -                        | -                         | -                   | -                          |
| 0.2404 | 12000 | 0.1631        | 0.1508          | 0.6758 (+0.1354)         | 0.4076 (+0.0825)          | 0.7316 (+0.2310)    | 0.6050 (+0.1496)           |
| 0.2444 | 12200 | 0.1778        | -               | -                        | -                         | -                   | -                          |
| 0.2484 | 12400 | 0.174         | -               | -                        | -                         | -                   | -                          |
| 0.2524 | 12600 | 0.159         | -               | -                        | -                         | -                   | -                          |
| 0.2565 | 12800 | 0.1744        | -               | -                        | -                         | -                   | -                          |
| 0.2605 | 13000 | 0.1828        | 0.1524          | 0.6696 (+0.1291)         | 0.4039 (+0.0788)          | 0.7001 (+0.1994)    | 0.5912 (+0.1358)           |
| 0.2645 | 13200 | 0.1726        | -               | -                        | -                         | -                   | -                          |
| 0.2685 | 13400 | 0.1947        | -               | -                        | -                         | -                   | -                          |
| 0.2725 | 13600 | 0.1697        | -               | -                        | -                         | -                   | -                          |
| 0.2765 | 13800 | 0.1958        | -               | -                        | -                         | -                   | -                          |
| 0.2805 | 14000 | 0.1917        | 0.1442          | 0.6612 (+0.1208)         | 0.4091 (+0.0841)          | 0.6987 (+0.1980)    | 0.5897 (+0.1343)           |
| 0.2845 | 14200 | 0.1863        | -               | -                        | -                         | -                   | -                          |
| 0.2885 | 14400 | 0.1844        | -               | -                        | -                         | -                   | -                          |
| 0.2925 | 14600 | 0.1764        | -               | -                        | -                         | -                   | -                          |
| 0.2965 | 14800 | 0.1719        | -               | -                        | -                         | -                   | -                          |
| 0.3005 | 15000 | 0.1844        | 0.1481          | 0.6572 (+0.1168)         | 0.3984 (+0.0733)          | 0.7382 (+0.2376)    | 0.5979 (+0.1426)           |
| 0.3045 | 15200 | 0.176         | -               | -                        | -                         | -                   | -                          |
| 0.3085 | 15400 | 0.1724        | -               | -                        | -                         | -                   | -                          |
| 0.3126 | 15600 | 0.1747        | -               | -                        | -                         | -                   | -                          |
| 0.3166 | 15800 | 0.1649        | -               | -                        | -                         | -                   | -                          |
| 0.3206 | 16000 | 0.1779        | 0.1450          | 0.6168 (+0.0763)         | 0.4096 (+0.0846)          | 0.7118 (+0.2112)    | 0.5794 (+0.1240)           |
| 0.3246 | 16200 | 0.1755        | -               | -                        | -                         | -                   | -                          |
| 0.3286 | 16400 | 0.1567        | -               | -                        | -                         | -                   | -                          |
| 0.3326 | 16600 | 0.1749        | -               | -                        | -                         | -                   | -                          |
| 0.3366 | 16800 | 0.1827        | -               | -                        | -                         | -                   | -                          |
| 0.3406 | 17000 | 0.1773        | 0.1394          | 0.6868 (+0.1464)         | 0.3943 (+0.0693)          | 0.7007 (+0.2001)    | 0.5940 (+0.1386)           |
| 0.3446 | 17200 | 0.1747        | -               | -                        | -                         | -                   | -                          |
| 0.3486 | 17400 | 0.1805        | -               | -                        | -                         | -                   | -                          |
| 0.3526 | 17600 | 0.1688        | -               | -                        | -                         | -                   | -                          |
| 0.3566 | 17800 | 0.1649        | -               | -                        | -                         | -                   | -                          |
| 0.3606 | 18000 | 0.1747        | 0.1405          | 0.6390 (+0.0986)         | 0.3952 (+0.0701)          | 0.7370 (+0.2364)    | 0.5904 (+0.1350)           |
| 0.3646 | 18200 | 0.1797        | -               | -                        | -                         | -                   | -                          |
| 0.3687 | 18400 | 0.1557        | -               | -                        | -                         | -                   | -                          |
| 0.3727 | 18600 | 0.1644        | -               | -                        | -                         | -                   | -                          |
| 0.3767 | 18800 | 0.1701        | -               | -                        | -                         | -                   | -                          |
| 0.3807 | 19000 | 0.1673        | 0.1433          | 0.6799 (+0.1395)         | 0.4012 (+0.0762)          | 0.7286 (+0.2279)    | 0.6032 (+0.1479)           |
| 0.3847 | 19200 | 0.1736        | -               | -                        | -                         | -                   | -                          |
| 0.3887 | 19400 | 0.1767        | -               | -                        | -                         | -                   | -                          |
| 0.3927 | 19600 | 0.1735        | -               | -                        | -                         | -                   | -                          |
| 0.3967 | 19800 | 0.1758        | -               | -                        | -                         | -                   | -                          |
| 0.4007 | 20000 | 0.1711        | 0.1380          | 0.6773 (+0.1369)         | 0.4149 (+0.0898)          | 0.7166 (+0.2159)    | 0.6029 (+0.1476)           |
| 0.4047 | 20200 | 0.1704        | -               | -                        | -                         | -                   | -                          |
| 0.4087 | 20400 | 0.1637        | -               | -                        | -                         | -                   | -                          |
| 0.4127 | 20600 | 0.1783        | -               | -                        | -                         | -                   | -                          |
| 0.4167 | 20800 | 0.1585        | -               | -                        | -                         | -                   | -                          |
| 0.4207 | 21000 | 0.1769        | 0.1399          | 0.6832 (+0.1428)         | 0.4254 (+0.1003)          | 0.6977 (+0.1970)    | 0.6021 (+0.1467)           |
| 0.4248 | 21200 | 0.1644        | -               | -                        | -                         | -                   | -                          |
| 0.4288 | 21400 | 0.1693        | -               | -                        | -                         | -                   | -                          |
| 0.4328 | 21600 | 0.1604        | -               | -                        | -                         | -                   | -                          |
| 0.4368 | 21800 | 0.1714        | -               | -                        | -                         | -                   | -                          |
| 0.4408 | 22000 | 0.1577        | 0.1392          | 0.6715 (+0.1311)         | 0.4199 (+0.0948)          | 0.7038 (+0.2032)    | 0.5984 (+0.1430)           |
| 0.4448 | 22200 | 0.1742        | -               | -                        | -                         | -                   | -                          |
| 0.4488 | 22400 | 0.1744        | -               | -                        | -                         | -                   | -                          |
| 0.4528 | 22600 | 0.1682        | -               | -                        | -                         | -                   | -                          |
| 0.4568 | 22800 | 0.1597        | -               | -                        | -                         | -                   | -                          |
545
+ | 0.4608 | 23000 | 0.1626 | 0.1364 | 0.6698 (+0.1294) | 0.4191 (+0.0941) | 0.7255 (+0.2249) | 0.6048 (+0.1494) |
546
+ | 0.4648 | 23200 | 0.1543 | - | - | - | - | - |
547
+ | 0.4688 | 23400 | 0.1571 | - | - | - | - | - |
548
+ | 0.4728 | 23600 | 0.1576 | - | - | - | - | - |
549
+ | 0.4768 | 23800 | 0.1644 | - | - | - | - | - |
550
+ | 0.4809 | 24000 | 0.1542 | 0.1444 | 0.6618 (+0.1213) | 0.4095 (+0.0844) | 0.7442 (+0.2436) | 0.6052 (+0.1498) |
551
+ | 0.4849 | 24200 | 0.1826 | - | - | - | - | - |
552
+ | 0.4889 | 24400 | 0.1649 | - | - | - | - | - |
553
+ | 0.4929 | 24600 | 0.154 | - | - | - | - | - |
554
+ | 0.4969 | 24800 | 0.1779 | - | - | - | - | - |
555
+ | 0.5009 | 25000 | 0.1615 | 0.1373 | 0.6506 (+0.1102) | 0.3971 (+0.0721) | 0.7165 (+0.2159) | 0.5881 (+0.1327) |
556
+ | 0.5049 | 25200 | 0.1558 | - | - | - | - | - |
557
+ | 0.5089 | 25400 | 0.1741 | - | - | - | - | - |
558
+ | 0.5129 | 25600 | 0.151 | - | - | - | - | - |
559
+ | 0.5169 | 25800 | 0.1654 | - | - | - | - | - |
560
+ | 0.5209 | 26000 | 0.1656 | 0.1368 | 0.6631 (+0.1226) | 0.3888 (+0.0638) | 0.7092 (+0.2085) | 0.5870 (+0.1317) |
561
+ | 0.5249 | 26200 | 0.1603 | - | - | - | - | - |
562
+ | 0.5289 | 26400 | 0.1547 | - | - | - | - | - |
563
+ | 0.5329 | 26600 | 0.1782 | - | - | - | - | - |
564
+ | 0.5370 | 26800 | 0.1571 | - | - | - | - | - |
565
+ | 0.5410 | 27000 | 0.1595 | 0.1376 | 0.6352 (+0.0948) | 0.3960 (+0.0710) | 0.7081 (+0.2074) | 0.5798 (+0.1244) |
566
+ | 0.5450 | 27200 | 0.1764 | - | - | - | - | - |
567
+ | 0.5490 | 27400 | 0.1672 | - | - | - | - | - |
568
+ | 0.5530 | 27600 | 0.1669 | - | - | - | - | - |
569
+ | 0.5570 | 27800 | 0.1719 | - | - | - | - | - |
570
+ | 0.5610 | 28000 | 0.1759 | 0.1355 | 0.6629 (+0.1225) | 0.4013 (+0.0762) | 0.7671 (+0.2665) | 0.6104 (+0.1551) |
571
+ | 0.5650 | 28200 | 0.1595 | - | - | - | - | - |
572
+ | 0.5690 | 28400 | 0.1558 | - | - | - | - | - |
573
+ | 0.5730 | 28600 | 0.1617 | - | - | - | - | - |
574
+ | 0.5770 | 28800 | 0.1669 | - | - | - | - | - |
575
+ | 0.5810 | 29000 | 0.1481 | 0.1363 | 0.6613 (+0.1208) | 0.3961 (+0.0710) | 0.7413 (+0.2406) | 0.5995 (+0.1442) |
576
+ | 0.5850 | 29200 | 0.1584 | - | - | - | - | - |
577
+ | 0.5890 | 29400 | 0.1654 | - | - | - | - | - |
578
+ | 0.5931 | 29600 | 0.1659 | - | - | - | - | - |
579
+ | 0.5971 | 29800 | 0.1653 | - | - | - | - | - |
580
+ | 0.6011 | 30000 | 0.1606 | 0.1368 | 0.6554 (+0.1150) | 0.3927 (+0.0676) | 0.7139 (+0.2132) | 0.5873 (+0.1320) |
581
+ | 0.6051 | 30200 | 0.1625 | - | - | - | - | - |
582
+ | 0.6091 | 30400 | 0.1581 | - | - | - | - | - |
583
+ | 0.6131 | 30600 | 0.145 | - | - | - | - | - |
584
+ | 0.6171 | 30800 | 0.1584 | - | - | - | - | - |
585
+ | 0.6211 | 31000 | 0.1566 | 0.1325 | 0.6680 (+0.1275) | 0.3978 (+0.0728) | 0.7372 (+0.2365) | 0.6010 (+0.1456) |
586
+ | 0.6251 | 31200 | 0.1611 | - | - | - | - | - |
587
+ | 0.6291 | 31400 | 0.1724 | - | - | - | - | - |
588
+ | 0.6331 | 31600 | 0.1609 | - | - | - | - | - |
589
+ | 0.6371 | 31800 | 0.1621 | - | - | - | - | - |
590
+ | 0.6411 | 32000 | 0.1537 | 0.1300 | 0.6615 (+0.1211) | 0.4063 (+0.0813) | 0.7697 (+0.2691) | 0.6125 (+0.1571) |
591
+ | 0.6451 | 32200 | 0.1641 | - | - | - | - | - |
592
+ | 0.6492 | 32400 | 0.1487 | - | - | - | - | - |
593
+ | 0.6532 | 32600 | 0.1456 | - | - | - | - | - |
594
+ | 0.6572 | 32800 | 0.1514 | - | - | - | - | - |
595
+ | 0.6612 | 33000 | 0.158 | 0.1309 | 0.6556 (+0.1152) | 0.4125 (+0.0875) | 0.7479 (+0.2473) | 0.6053 (+0.1500) |
596
+ | 0.6652 | 33200 | 0.1451 | - | - | - | - | - |
597
+ | 0.6692 | 33400 | 0.1495 | - | - | - | - | - |
598
+ | 0.6732 | 33600 | 0.1467 | - | - | - | - | - |
599
+ | 0.6772 | 33800 | 0.143 | - | - | - | - | - |
600
+ | 0.6812 | 34000 | 0.1639 | 0.1334 | 0.6769 (+0.1365) | 0.4002 (+0.0752) | 0.7420 (+0.2414) | 0.6064 (+0.1510) |
601
+ | 0.6852 | 34200 | 0.1542 | - | - | - | - | - |
602
+ | 0.6892 | 34400 | 0.1592 | - | - | - | - | - |
603
+ | 0.6932 | 34600 | 0.1452 | - | - | - | - | - |
604
+ | 0.6972 | 34800 | 0.1569 | - | - | - | - | - |
605
+ | 0.7012 | 35000 | 0.1502 | 0.1299 | 0.6648 (+0.1243) | 0.3834 (+0.0583) | 0.7684 (+0.2678) | 0.6055 (+0.1501) |
606
+ | 0.7053 | 35200 | 0.1564 | - | - | - | - | - |
607
+ | 0.7093 | 35400 | 0.1509 | - | - | - | - | - |
608
+ | 0.7133 | 35600 | 0.156 | - | - | - | - | - |
609
+ | 0.7173 | 35800 | 0.1547 | - | - | - | - | - |
610
+ | 0.7213 | 36000 | 0.1595 | 0.1297 | 0.6521 (+0.1117) | 0.3916 (+0.0665) | 0.7318 (+0.2311) | 0.5918 (+0.1364) |
611
+ | 0.7253 | 36200 | 0.1457 | - | - | - | - | - |
612
+ | 0.7293 | 36400 | 0.1615 | - | - | - | - | - |
613
+ | 0.7333 | 36600 | 0.1508 | - | - | - | - | - |
614
+ | 0.7373 | 36800 | 0.1478 | - | - | - | - | - |
615
+ | 0.7413 | 37000 | 0.1455 | 0.1322 | 0.6614 (+0.1210) | 0.4132 (+0.0882) | 0.7656 (+0.2650) | 0.6134 (+0.1581) |
616
+ | 0.7453 | 37200 | 0.1526 | - | - | - | - | - |
617
+ | 0.7493 | 37400 | 0.1571 | - | - | - | - | - |
618
+ | 0.7533 | 37600 | 0.141 | - | - | - | - | - |
619
+ | 0.7573 | 37800 | 0.1418 | - | - | - | - | - |
620
+ | 0.7614 | 38000 | 0.1597 | 0.1347 | 0.6707 (+0.1302) | 0.4175 (+0.0925) | 0.7568 (+0.2561) | 0.6150 (+0.1596) |
621
+ | 0.7654 | 38200 | 0.1512 | - | - | - | - | - |
622
+ | 0.7694 | 38400 | 0.1424 | - | - | - | - | - |
623
+ | 0.7734 | 38600 | 0.1601 | - | - | - | - | - |
624
+ | 0.7774 | 38800 | 0.13 | - | - | - | - | - |
625
+ | 0.7814 | 39000 | 0.1508 | 0.1322 | 0.6960 (+0.1556) | 0.4032 (+0.0781) | 0.7585 (+0.2579) | 0.6192 (+0.1639) |
626
+ | 0.7854 | 39200 | 0.1456 | - | - | - | - | - |
627
+ | 0.7894 | 39400 | 0.1502 | - | - | - | - | - |
628
+ | 0.7934 | 39600 | 0.1507 | - | - | - | - | - |
629
+ | 0.7974 | 39800 | 0.1696 | - | - | - | - | - |
630
+ | **0.8014** | **40000** | **0.1381** | **0.1289** | **0.7251 (+0.1847)** | **0.4143 (+0.0892)** | **0.7594 (+0.2587)** | **0.6329 (+0.1776)** |
631
+ | 0.8054 | 40200 | 0.1544 | - | - | - | - | - |
632
+ | 0.8094 | 40400 | 0.1541 | - | - | - | - | - |
633
+ | 0.8134 | 40600 | 0.1458 | - | - | - | - | - |
634
+ | 0.8175 | 40800 | 0.1411 | - | - | - | - | - |
635
+ | 0.8215 | 41000 | 0.1495 | 0.1280 | 0.7051 (+0.1646) | 0.4102 (+0.0851) | 0.7520 (+0.2514) | 0.6224 (+0.1670) |
636
+ | 0.8255 | 41200 | 0.1465 | - | - | - | - | - |
637
+ | 0.8295 | 41400 | 0.1577 | - | - | - | - | - |
638
+ | 0.8335 | 41600 | 0.1489 | - | - | - | - | - |
639
+ | 0.8375 | 41800 | 0.1481 | - | - | - | - | - |
640
+ | 0.8415 | 42000 | 0.148 | 0.1304 | 0.6944 (+0.1539) | 0.4023 (+0.0772) | 0.7440 (+0.2433) | 0.6135 (+0.1582) |
641
+ | 0.8455 | 42200 | 0.1529 | - | - | - | - | - |
642
+ | 0.8495 | 42400 | 0.1522 | - | - | - | - | - |
643
+ | 0.8535 | 42600 | 0.1455 | - | - | - | - | - |
644
+ | 0.8575 | 42800 | 0.1567 | - | - | - | - | - |
645
+ | 0.8615 | 43000 | 0.1435 | 0.1304 | 0.6710 (+0.1306) | 0.4130 (+0.0880) | 0.7493 (+0.2486) | 0.6111 (+0.1557) |
646
+ | 0.8655 | 43200 | 0.1426 | - | - | - | - | - |
647
+ | 0.8695 | 43400 | 0.1527 | - | - | - | - | - |
648
+ | 0.8736 | 43600 | 0.1431 | - | - | - | - | - |
649
+ | 0.8776 | 43800 | 0.1382 | - | - | - | - | - |
650
+ | 0.8816 | 44000 | 0.1554 | 0.1288 | 0.6842 (+0.1437) | 0.3996 (+0.0746) | 0.7535 (+0.2529) | 0.6124 (+0.1571) |
651
+ | 0.8856 | 44200 | 0.1491 | - | - | - | - | - |
652
+ | 0.8896 | 44400 | 0.1626 | - | - | - | - | - |
653
+ | 0.8936 | 44600 | 0.1471 | - | - | - | - | - |
654
+ | 0.8976 | 44800 | 0.1459 | - | - | - | - | - |
655
+ | 0.9016 | 45000 | 0.1501 | 0.1284 | 0.6995 (+0.1590) | 0.4051 (+0.0801) | 0.7608 (+0.2602) | 0.6218 (+0.1664) |
656
+ | 0.9056 | 45200 | 0.1513 | - | - | - | - | - |
657
+ | 0.9096 | 45400 | 0.1521 | - | - | - | - | - |
658
+ | 0.9136 | 45600 | 0.1417 | - | - | - | - | - |
659
+ | 0.9176 | 45800 | 0.1452 | - | - | - | - | - |
660
+ | 0.9216 | 46000 | 0.1591 | 0.1254 | 0.7086 (+0.1682) | 0.3940 (+0.0690) | 0.7567 (+0.2561) | 0.6198 (+0.1644) |
661
+ | 0.9256 | 46200 | 0.1473 | - | - | - | - | - |
662
+ | 0.9297 | 46400 | 0.1329 | - | - | - | - | - |
663
+ | 0.9337 | 46600 | 0.1523 | - | - | - | - | - |
664
+ | 0.9377 | 46800 | 0.1385 | - | - | - | - | - |
665
+ | 0.9417 | 47000 | 0.1393 | 0.1267 | 0.7161 (+0.1756) | 0.3941 (+0.0690) | 0.7662 (+0.2656) | 0.6255 (+0.1701) |
666
+ | 0.9457 | 47200 | 0.1421 | - | - | - | - | - |
667
+ | 0.9497 | 47400 | 0.1509 | - | - | - | - | - |
668
+ | 0.9537 | 47600 | 0.1587 | - | - | - | - | - |
669
+ | 0.9577 | 47800 | 0.1402 | - | - | - | - | - |
670
+ | 0.9617 | 48000 | 0.1355 | 0.1278 | 0.6976 (+0.1571) | 0.3958 (+0.0708) | 0.7538 (+0.2531) | 0.6157 (+0.1603) |
671
+ | 0.9657 | 48200 | 0.1518 | - | - | - | - | - |
672
+ | 0.9697 | 48400 | 0.1369 | - | - | - | - | - |
673
+ | 0.9737 | 48600 | 0.1475 | - | - | - | - | - |
674
+ | 0.9777 | 48800 | 0.1495 | - | - | - | - | - |
675
+ | 0.9817 | 49000 | 0.1402 | 0.1275 | 0.6973 (+0.1568) | 0.3990 (+0.0740) | 0.7534 (+0.2528) | 0.6166 (+0.1612) |
676
+ | 0.9858 | 49200 | 0.1527 | - | - | - | - | - |
677
+ | 0.9898 | 49400 | 0.143 | - | - | - | - | - |
678
+ | 0.9938 | 49600 | 0.1619 | - | - | - | - | - |
679
+ | 0.9978 | 49800 | 0.1422 | - | - | - | - | - |
680
+ | -1 | -1 | - | - | 0.7251 (+0.1847) | 0.4143 (+0.0892) | 0.7594 (+0.2587) | 0.6329 (+0.1776) |
681
+
682
+ * The bold row denotes the saved checkpoint.
683
+ </details>
684
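The `Epoch` column in the log above is simply the optimizer step divided by the number of steps in one epoch. A minimal sketch of that relation; the per-epoch step count is inferred from the logged rows (e.g. epoch 0.8014 at step 40000), not a logged hyperparameter:

```python
# Steps per epoch inferred from the table above (~49,910), NOT a value
# recorded in the training config -- treat it as an approximation.
STEPS_PER_EPOCH = 49910

def epoch_at(step: int) -> float:
    """Epoch fraction the trainer reports at a given optimizer step."""
    return round(step / STEPS_PER_EPOCH, 4)

print(epoch_at(11000))  # 0.2204 -- matches the first row shown
print(epoch_at(40000))  # 0.8014 -- matches the saved-checkpoint row
```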
+
+ ### Framework Versions
+ - Python: 3.11.10
+ - Sentence Transformers: 3.5.0.dev0
+ - Transformers: 4.49.0
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.2.0
+ - Datasets: 2.21.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### LambdaLoss
+ ```bibtex
+ @inproceedings{wang2018lambdaloss,
+     title={The lambdaloss framework for ranking metric optimization},
+     author={Wang, Xuanhui and Li, Cheng and Golbandi, Nadav and Bendersky, Michael and Najork, Marc},
+     booktitle={Proceedings of the 27th ACM international conference on information and knowledge management},
+     pages={1313--1322},
+     year={2018}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,56 @@
+ {
+   "_name_or_path": "answerdotai/ModernBERT-base",
+   "architectures": [
+     "ModernBertForSequenceClassification"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.1,
+   "bos_token_id": 50281,
+   "classifier_activation": "gelu",
+   "classifier_bias": false,
+   "classifier_dropout": 0.1,
+   "classifier_pooling": "mean",
+   "cls_token_id": 50281,
+   "decoder_bias": true,
+   "deterministic_flash_attn": false,
+   "embedding_dropout": 0.1,
+   "eos_token_id": 50282,
+   "global_attn_every_n_layers": 3,
+   "global_rope_theta": 160000.0,
+   "gradient_checkpointing": false,
+   "hidden_activation": "gelu",
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_cutoff_factor": 2.0,
+   "initializer_range": 0.02,
+   "intermediate_size": 1152,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-05,
+   "local_attention": 128,
+   "local_rope_theta": 10000.0,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "mlp_dropout": 0.1,
+   "model_type": "modernbert",
+   "norm_bias": false,
+   "norm_eps": 1e-05,
+   "num_attention_heads": 12,
+   "num_hidden_layers": 22,
+   "pad_token_id": 50283,
+   "position_embedding_type": "absolute",
+   "reference_compile": true,
+   "repad_logits_with_grad": false,
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.activation.Sigmoid"
+   },
+   "sep_token_id": 50282,
+   "sparse_pred_ignore_index": -100,
+   "sparse_prediction": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.49.0",
+   "vocab_size": 50368
+ }
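The `sentence_transformers.activation_fn` entry above means the model's single `LABEL_0` logit is passed through a sigmoid at inference time, so predicted relevance scores land in (0, 1). A stdlib-only sketch of that mapping (the helper name is ours, not part of the config or library):

```python
import math

def sigmoid(logit: float) -> float:
    """Map a raw classifier logit to a (0, 1) relevance score,
    mirroring what torch.nn.Sigmoid does to the LABEL_0 output."""
    return 1.0 / (1.0 + math.exp(-logit))

print(sigmoid(0.0))  # 0.5 -- an undecided logit maps to mid-scale
```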
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b5c0d9f8a2857f833ae60d38d66ba08f30359868092b6f403a403ecc787707c1
+ size 598436708
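The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the object's SHA-256 and byte size, and the real `model.safetensors` is fetched from LFS storage on clone. A small illustrative parser for such a pointer (the helper function is ours, not part of any tool):

```python
# The pointer file exactly as committed above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b5c0d9f8a2857f833ae60d38d66ba08f30359868092b6f403a403ecc787707c1
size 598436708
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    return dict(line.split(" ", 1) for line in text.strip().splitlines())

fields = parse_lfs_pointer(POINTER)
print(int(fields["size"]))  # 598436708
```

At 4 bytes per float32 parameter (the config declares `"torch_dtype": "float32"`), 598,436,708 bytes corresponds to roughly 149.6 M parameters, consistent with a ModernBERT-base backbone plus a classification head.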
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "|||IP_ADDRESS|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "1": {
+       "content": "<|padding|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50254": {
+       "content": "                        ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50255": {
+       "content": "                       ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50256": {
+       "content": "                      ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50257": {
+       "content": "                     ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50258": {
+       "content": "                    ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50259": {
+       "content": "                   ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50260": {
+       "content": "                  ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50261": {
+       "content": "                 ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50262": {
+       "content": "                ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50263": {
+       "content": "               ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50264": {
+       "content": "              ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50265": {
+       "content": "             ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50266": {
+       "content": "            ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50267": {
+       "content": "           ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50268": {
+       "content": "          ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50269": {
+       "content": "         ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50270": {
+       "content": "        ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50271": {
+       "content": "       ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50272": {
+       "content": "      ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50273": {
+       "content": "     ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50274": {
+       "content": "    ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50275": {
+       "content": "   ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50276": {
+       "content": "  ",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50277": {
+       "content": "|||EMAIL_ADDRESS|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50278": {
+       "content": "|||PHONE_NUMBER|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50279": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50280": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50281": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50282": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50283": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50284": {
+       "content": "[MASK]",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50285": {
+       "content": "[unused0]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50286": {
+       "content": "[unused1]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50287": {
+       "content": "[unused2]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50288": {
+       "content": "[unused3]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50289": {
+       "content": "[unused4]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50290": {
+       "content": "[unused5]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50291": {
+       "content": "[unused6]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50292": {
+       "content": "[unused7]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50293": {
+       "content": "[unused8]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50294": {
+       "content": "[unused9]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50295": {
+       "content": "[unused10]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50296": {
+       "content": "[unused11]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50297": {
+       "content": "[unused12]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50298": {
+       "content": "[unused13]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50299": {
+       "content": "[unused14]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50300": {
+       "content": "[unused15]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50301": {
+       "content": "[unused16]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50302": {
+       "content": "[unused17]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50303": {
+       "content": "[unused18]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50304": {
+       "content": "[unused19]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50305": {
+       "content": "[unused20]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50306": {
+       "content": "[unused21]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50307": {
+       "content": "[unused22]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50308": {
+       "content": "[unused23]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50309": {
+       "content": "[unused24]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50310": {
+       "content": "[unused25]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50311": {
+       "content": "[unused26]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50312": {
+       "content": "[unused27]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50313": {
+       "content": "[unused28]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50314": {
+       "content": "[unused29]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50315": {
+       "content": "[unused30]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50316": {
+       "content": "[unused31]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50317": {
+       "content": "[unused32]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50318": {
+       "content": "[unused33]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50319": {
+       "content": "[unused34]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50320": {
+       "content": "[unused35]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50321": {
+       "content": "[unused36]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50322": {
+       "content": "[unused37]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50323": {
+       "content": "[unused38]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50324": {
+       "content": "[unused39]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50325": {
+       "content": "[unused40]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50326": {
+       "content": "[unused41]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50327": {
+       "content": "[unused42]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50328": {
+       "content": "[unused43]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50329": {
+       "content": "[unused44]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50330": {
+       "content": "[unused45]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50331": {
+       "content": "[unused46]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50332": {
+       "content": "[unused47]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50333": {
+       "content": "[unused48]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50334": {
+       "content": "[unused49]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50335": {
+       "content": "[unused50]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50336": {
+       "content": "[unused51]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50337": {
+       "content": "[unused52]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50338": {
+       "content": "[unused53]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50339": {
+       "content": "[unused54]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50340": {
+       "content": "[unused55]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50341": {
+       "content": "[unused56]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50342": {
+       "content": "[unused57]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50343": {
+       "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizer",
944
+ "unk_token": "[UNK]"
945
+ }