SentenceTransformer based on cl-nagoya/sup-simcse-ja-base

This is a sentence-transformers model finetuned from cl-nagoya/sup-simcse-ja-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: cl-nagoya/sup-simcse-ja-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
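
The similarity function is plain cosine similarity over the 768-dimensional embeddings, so the scores returned by model.similarity can be reproduced by L2-normalizing the vectors and taking dot products. A minimal sketch (the sentences are reused from the Usage example below):

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_6")
embeddings = model.encode([
    '科目:タイル。名称:床磁器質タイル。',
    '科目:ユニット及びその他。名称:案内スタンドサイン。',
])

# Cosine similarity by hand: L2-normalize each 768-dimensional vector, then take dot products.
normalized = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(normalized @ normalized.T)

# Should match the built-in similarity (up to float precision).
print(model.similarity(embeddings, embeddings))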

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
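
The architecture above is a BERT encoder followed by CLS-token pooling. If you need to rebuild the same pipeline from its modules (for example, to experiment with a different pooling strategy), a minimal sketch using the sentence-transformers models API looks like this; note that it starts from the base checkpoint's weights, not the fine-tuned ones, and the variable names are illustrative:

from sentence_transformers import SentenceTransformer, models

# BERT encoder from the base checkpoint, truncating inputs at 512 tokens.
transformer = models.Transformer("cl-nagoya/sup-simcse-ja-base", max_seq_length=512)

# CLS-token pooling over the 768-dimensional word embeddings, matching the architecture above.
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),
    pooling_mode="cls",
)

model = SentenceTransformer(modules=[transformer, pooling])
print(model)  # should mirror the architecture printed above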

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_6")
# Run inference; each example encodes a category (科目) and an item name (名称)
sentences = [
    '科目:タイル。名称:床磁器質タイル。',
    '科目:ユニット及びその他。名称:#救助袋サイン(ガラス面)。',
    '科目:ユニット及びその他。名称:案内スタンドサイン。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
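
Beyond pairwise similarity, the same embeddings can be used for semantic search over a corpus of standard names. A minimal sketch reusing the model loaded above; the corpus and query are taken from the example sentences and are purely illustrative:

import torch

corpus = [
    '科目:タイル。名称:床磁器質タイル。',
    '科目:ユニット及びその他。名称:#救助袋サイン(ガラス面)。',
    '科目:ユニット及びその他。名称:案内スタンドサイン。',
]
query = '科目:タイル。名称:床磁器質タイル。'

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Rank corpus entries by cosine similarity to the query.
scores = model.similarity(query_embedding, corpus_embeddings)[0]
for idx in torch.argsort(scores, descending=True).tolist():
    print(f"{scores[idx].item():.4f}  {corpus[idx]}")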

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,054 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    • sentence: type string; min: 11 tokens, mean: 17.79 tokens, max: 32 tokens
    • label: type int; distribution over 285 classes (0–284):
    • 0: ~0.20%
    • 1: ~0.30%
    • 2: ~0.30%
    • 3: ~0.30%
    • 4: ~0.20%
    • 5: ~0.20%
    • 6: ~0.20%
    • 7: ~0.20%
    • 8: ~0.20%
    • 9: ~0.20%
    • 10: ~0.30%
    • 11: ~0.20%
    • 12: ~0.20%
    • 13: ~0.20%
    • 14: ~0.20%
    • 15: ~0.20%
    • 16: ~0.20%
    • 17: ~0.40%
    • 18: ~0.20%
    • 19: ~0.20%
    • 20: ~0.20%
    • 21: ~0.20%
    • 22: ~0.20%
    • 23: ~0.20%
    • 24: ~0.20%
    • 25: ~0.20%
    • 26: ~0.20%
    • 27: ~0.20%
    • 28: ~0.20%
    • 29: ~0.20%
    • 30: ~0.20%
    • 31: ~0.20%
    • 32: ~0.20%
    • 33: ~0.20%
    • 34: ~0.20%
    • 35: ~0.20%
    • 36: ~0.20%
    • 37: ~0.20%
    • 38: ~0.20%
    • 39: ~0.20%
    • 40: ~0.20%
    • 41: ~0.20%
    • 42: ~0.20%
    • 43: ~0.60%
    • 44: ~0.70%
    • 45: ~0.20%
    • 46: ~0.20%
    • 47: ~0.20%
    • 48: ~0.20%
    • 49: ~0.20%
    • 50: ~0.30%
    • 51: ~0.20%
    • 52: ~0.20%
    • 53: ~0.20%
    • 54: ~0.20%
    • 55: ~0.30%
    • 56: ~0.40%
    • 57: ~0.30%
    • 58: ~0.20%
    • 59: ~0.20%
    • 60: ~0.20%
    • 61: ~0.20%
    • 62: ~0.20%
    • 63: ~0.30%
    • 64: ~0.20%
    • 65: ~0.20%
    • 66: ~0.20%
    • 67: ~0.20%
    • 68: ~0.40%
    • 69: ~0.40%
    • 70: ~0.20%
    • 71: ~0.60%
    • 72: ~0.20%
    • 73: ~0.20%
    • 74: ~0.20%
    • 75: ~0.20%
    • 76: ~0.20%
    • 77: ~0.30%
    • 78: ~0.20%
    • 79: ~0.40%
    • 80: ~0.20%
    • 81: ~0.20%
    • 82: ~0.50%
    • 83: ~0.30%
    • 84: ~0.60%
    • 85: ~0.20%
    • 86: ~0.30%
    • 87: ~0.20%
    • 88: ~0.20%
    • 89: ~0.20%
    • 90: ~0.20%
    • 91: ~1.10%
    • 92: ~1.70%
    • 93: ~2.20%
    • 94: ~0.50%
    • 95: ~0.20%
    • 96: ~0.20%
    • 97: ~1.50%
    • 98: ~0.20%
    • 99: ~0.20%
    • 100: ~0.20%
    • 101: ~0.20%
    • 102: ~0.30%
    • 103: ~1.70%
    • 104: ~0.20%
    • 105: ~0.20%
    • 106: ~0.40%
    • 107: ~0.40%
    • 108: ~0.20%
    • 109: ~0.20%
    • 110: ~0.20%
    • 111: ~1.10%
    • 112: ~0.20%
    • 113: ~0.50%
    • 114: ~0.50%
    • 115: ~0.20%
    • 116: ~0.20%
    • 117: ~0.20%
    • 118: ~0.20%
    • 119: ~0.50%
    • 120: ~0.20%
    • 121: ~0.20%
    • 122: ~0.20%
    • 123: ~0.20%
    • 124: ~0.20%
    • 125: ~0.20%
    • 126: ~0.30%
    • 127: ~0.20%
    • 128: ~0.20%
    • 129: ~0.20%
    • 130: ~0.50%
    • 131: ~0.20%
    • 132: ~0.20%
    • 133: ~0.20%
    • 134: ~0.20%
    • 135: ~0.20%
    • 136: ~0.20%
    • 137: ~0.20%
    • 138: ~0.20%
    • 139: ~0.20%
    • 140: ~0.20%
    • 141: ~1.80%
    • 142: ~0.20%
    • 143: ~0.20%
    • 144: ~1.70%
    • 145: ~0.30%
    • 146: ~0.30%
    • 147: ~0.50%
    • 148: ~0.50%
    • 149: ~0.50%
    • 150: ~0.20%
    • 151: ~0.20%
    • 152: ~0.20%
    • 153: ~0.20%
    • 154: ~0.20%
    • 155: ~0.20%
    • 156: ~0.20%
    • 157: ~0.20%
    • 158: ~0.60%
    • 159: ~0.20%
    • 160: ~0.20%
    • 161: ~0.20%
    • 162: ~0.20%
    • 163: ~0.20%
    • 164: ~0.50%
    • 165: ~0.20%
    • 166: ~0.20%
    • 167: ~0.20%
    • 168: ~0.20%
    • 169: ~0.20%
    • 170: ~0.30%
    • 171: ~0.30%
    • 172: ~0.20%
    • 173: ~0.20%
    • 174: ~0.20%
    • 175: ~0.20%
    • 176: ~0.20%
    • 177: ~0.60%
    • 178: ~0.20%
    • 179: ~0.20%
    • 180: ~0.20%
    • 181: ~0.20%
    • 182: ~0.20%
    • 183: ~0.40%
    • 184: ~0.20%
    • 185: ~0.20%
    • 186: ~0.30%
    • 187: ~0.20%
    • 188: ~0.90%
    • 189: ~0.30%
    • 190: ~0.30%
    • 191: ~0.20%
    • 192: ~0.30%
    • 193: ~0.20%
    • 194: ~0.80%
    • 195: ~0.20%
    • 196: ~0.20%
    • 197: ~0.30%
    • 198: ~0.20%
    • 199: ~0.20%
    • 200: ~0.20%
    • 201: ~0.20%
    • 202: ~0.20%
    • 203: ~1.20%
    • 204: ~0.40%
    • 205: ~0.20%
    • 206: ~0.20%
    • 207: ~0.20%
    • 208: ~0.20%
    • 209: ~1.00%
    • 210: ~0.20%
    • 211: ~0.30%
    • 212: ~0.20%
    • 213: ~1.10%
    • 214: ~0.30%
    • 215: ~0.20%
    • 216: ~0.20%
    • 217: ~0.20%
    • 218: ~0.20%
    • 219: ~0.20%
    • 220: ~0.20%
    • 221: ~0.20%
    • 222: ~0.30%
    • 223: ~0.20%
    • 224: ~0.90%
    • 225: ~4.70%
    • 226: ~0.20%
    • 227: ~0.20%
    • 228: ~0.20%
    • 229: ~0.70%
    • 230: ~0.20%
    • 231: ~0.80%
    • 232: ~0.20%
    • 233: ~0.40%
    • 234: ~0.30%
    • 235: ~0.40%
    • 236: ~0.20%
    • 237: ~0.30%
    • 238: ~0.50%
    • 239: ~0.30%
    • 240: ~0.20%
    • 241: ~0.20%
    • 242: ~0.30%
    • 243: ~0.30%
    • 244: ~0.30%
    • 245: ~0.60%
    • 246: ~0.20%
    • 247: ~0.20%
    • 248: ~0.20%
    • 249: ~0.30%
    • 250: ~0.30%
    • 251: ~1.90%
    • 252: ~0.20%
    • 253: ~0.20%
    • 254: ~0.20%
    • 255: ~0.20%
    • 256: ~0.20%
    • 257: ~0.50%
    • 258: ~0.20%
    • 259: ~0.30%
    • 260: ~0.20%
    • 261: ~0.20%
    • 262: ~1.00%
    • 263: ~0.20%
    • 264: ~0.20%
    • 265: ~0.20%
    • 266: ~0.40%
    • 267: ~0.20%
    • 268: ~0.20%
    • 269: ~0.20%
    • 270: ~0.20%
    • 271: ~0.20%
    • 272: ~0.20%
    • 273: ~0.20%
    • 274: ~3.60%
    • 275: ~0.20%
    • 276: ~0.20%
    • 277: ~0.40%
    • 278: ~0.20%
    • 279: ~0.20%
    • 280: ~0.90%
    • 281: ~0.40%
    • 282: ~0.20%
    • 283: ~2.30%
    • 284: ~0.30%
  • Samples (sentence → label):
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    • 科目:コンクリート。名称:コンクリートポンプ圧送。 → 1
  • Loss: BatchAllTripletLoss
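
The samples are single sentences paired with an integer class label, which is the input format BatchAllTripletLoss expects. A minimal sketch of how such a dataset and loss could be wired up (the example rows are taken from the samples above, and the column names follow the card); the trainer configuration is sketched after the hyperparameter list below:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("cl-nagoya/sup-simcse-ja-base")

# Columns must be named "sentence" and "label" to match the dataset described above.
train_dataset = Dataset.from_dict({
    "sentence": [
        "科目:コンクリート。名称:免震基礎天端グラウト注入。",
        "科目:コンクリート。名称:コンクリートポンプ圧送。",
    ],
    "label": [0, 1],
})

# BatchAllTripletLoss mines all valid (anchor, positive, negative) triplets within each batch,
# so batches need several examples per label (hence the group_by_label batch sampler).
loss = losses.BatchAllTripletLoss(model)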

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 500
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: group_by_label
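
These non-default values map directly onto SentenceTransformerTrainingArguments. A minimal sketch combining them with the dataset and loss sketched above; output_dir is illustrative:

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/standard-name-model",  # illustrative path
    num_train_epochs=500,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    warmup_ratio=0.1,
    fp16=True,
    # group_by_label keeps several samples of the same class in each batch,
    # which BatchAllTripletLoss needs in order to form in-batch triplets.
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()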

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 500
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
2.8889 50 0.7963
5.8333 100 0.7067
8.7778 150 0.6532
11.7222 200 0.6806
14.6667 250 0.652
17.6111 300 0.6508
20.5556 350 0.6566
23.5 400 0.6237
26.4444 450 0.6363
29.3889 500 0.6554
32.3333 550 0.6007
35.2778 600 0.6016
38.2222 650 0.5687
2.8889 50 0.5655
5.8333 100 0.6139
8.7778 150 0.514
11.7222 200 0.5867
14.6667 250 0.5699
17.6111 300 0.5472
20.5556 350 0.5793
23.5 400 0.5196
26.4444 450 0.5572
29.3889 500 0.5279
32.3333 550 0.5095
35.2778 600 0.4488
38.2222 650 0.4189
41.1667 700 0.5164
44.1111 750 0.591
47.0556 800 0.52
49.9444 850 0.5235
52.8889 900 0.5317
55.8333 950 0.5517
58.7778 1000 0.5618
61.7222 1050 0.5318
64.6667 1100 0.4685
67.6111 1150 0.4836
70.5556 1200 0.5426
73.5 1250 0.5356
76.4444 1300 0.4231
79.3889 1350 0.5104
82.3333 1400 0.4944
85.2778 1450 0.5301
88.2222 1500 0.4499
91.1667 1550 0.4745
94.1111 1600 0.4432
97.0556 1650 0.3892
99.9444 1700 0.4429
102.8889 1750 0.4973
105.8333 1800 0.5222
108.7778 1850 0.4502
111.7222 1900 0.4073
114.6667 1950 0.408
117.6111 2000 0.403
120.5556 2050 0.4122
123.5 2100 0.4357
126.4444 2150 0.4765
129.3889 2200 0.4069
132.3333 2250 0.388
135.2778 2300 0.341
138.2222 2350 0.333
141.1667 2400 0.4587
144.1111 2450 0.355
147.0556 2500 0.3552
149.9444 2550 0.3804
152.8889 2600 0.3692
155.8333 2650 0.3367
158.7778 2700 0.3662
161.7222 2750 0.3089
164.6667 2800 0.3016
167.6111 2850 0.3252
170.5556 2900 0.3409
173.5 2950 0.3128
176.4444 3000 0.3287
179.3889 3050 0.3148
182.3333 3100 0.3843
185.2778 3150 0.2281
188.2222 3200 0.2973
191.1667 3250 0.2891
194.1111 3300 0.3623
197.0556 3350 0.3626
199.9444 3400 0.2931
202.8889 3450 0.2755
205.8333 3500 0.2849
208.7778 3550 0.2608
211.7222 3600 0.3081
214.6667 3650 0.2724
217.6111 3700 0.2583
220.5556 3750 0.3132
223.5 3800 0.196
226.4444 3850 0.2554
229.3889 3900 0.2
232.3333 3950 0.2936
235.2778 4000 0.2326
238.2222 4050 0.2031
241.1667 4100 0.2492
244.1111 4150 0.2234
247.0556 4200 0.3034
249.9444 4250 0.2325
252.8889 4300 0.2453
255.8333 4350 0.2848
258.7778 4400 0.2447
261.7222 4450 0.2599
264.6667 4500 0.2073
267.6111 4550 0.2134
270.5556 4600 0.1886
273.5 4650 0.1229
276.4444 4700 0.2147
279.3889 4750 0.1993
282.3333 4800 0.1814
285.2778 4850 0.202
288.2222 4900 0.1947
291.1667 4950 0.14
294.1111 5000 0.2394
297.0556 5050 0.1798
299.9444 5100 0.1534
302.8889 5150 0.2622
305.8333 5200 0.1636
308.7778 5250 0.1966
311.7222 5300 0.1365
314.6667 5350 0.1501
317.6111 5400 0.1494
320.5556 5450 0.1341
323.5 5500 0.1791
326.4444 5550 0.1609
329.3889 5600 0.2268
332.3333 5650 0.2145
335.2778 5700 0.095
338.2222 5750 0.1161
341.1667 5800 0.1615
344.1111 5850 0.1261
347.0556 5900 0.2022
349.9444 5950 0.1503
352.8889 6000 0.1473
355.8333 6050 0.1703
358.7778 6100 0.1441
361.7222 6150 0.1439
364.6667 6200 0.1192
367.6111 6250 0.1312
370.5556 6300 0.0933
373.5 6350 0.1281
376.4444 6400 0.1516
379.3889 6450 0.1819
382.3333 6500 0.1877
385.2778 6550 0.1372
388.2222 6600 0.1551
391.1667 6650 0.1343
394.1111 6700 0.2394
397.0556 6750 0.1882
399.9444 6800 0.1786
402.8889 6850 0.125
405.8333 6900 0.1059
408.7778 6950 0.1414
411.7222 7000 0.0593
414.6667 7050 0.1037
417.6111 7100 0.098
420.5556 7150 0.1457
423.5 7200 0.1193
426.4444 7250 0.1061
429.3889 7300 0.1305
432.3333 7350 0.1416
435.2778 7400 0.1117
438.2222 7450 0.1003
441.1667 7500 0.1217
444.1111 7550 0.0872
447.0556 7600 0.1219
449.9444 7650 0.1061
452.8889 7700 0.1559
455.8333 7750 0.1599
458.7778 7800 0.1436
461.7222 7850 0.1207
464.6667 7900 0.1272
467.6111 7950 0.1048
470.5556 8000 0.1216
473.5 8050 0.133
476.4444 8100 0.0971
479.3889 8150 0.154
482.3333 8200 0.0697
485.2778 8250 0.136
488.2222 8300 0.1315
491.1667 8350 0.1103
494.1111 8400 0.1065
497.0556 8450 0.0784
499.9444 8500 0.134

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

BatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}