SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the gooaq dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: gooaq
  • Language: en

Model Sources

  • Documentation: Sentence Transformers Documentation
  • Repository: Sentence Transformers on GitHub
  • Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
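
The pipeline above amounts to three steps: run the BERT encoder, mean-pool the token embeddings over non-padding positions, and L2-normalize the result. Below is a minimal sketch of the same computation with the plain transformers API, shown on the base checkpoint for illustration; loading the finetuned repo via SentenceTransformer (see Usage) is the supported path.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

batch = tokenizer(["what is usb c ss?"], padding=True, truncation=True,
                  max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (1, seq_len, 384)

# (1) Pooling: mean over non-padding tokens only
mask = batch["attention_mask"].unsqueeze(-1).float()
pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

# (2) Normalize: unit-length vectors, so a dot product equals cosine similarity
embedding = F.normalize(pooled, p=2, dim=1)  # (1, 384)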

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/emb-all-MiniLM-L6-v2-gooaq-6-epochs")
# Run inference
sentences = [
    'what is usb c ss?',
    'The USB Type-C specification is pretty confusing. ... The standard USB logo to identify USB 2.0 ports or slower. "SS" markings, which stand for SuperSpeed, to identify USB 3.0 ports, otherwise known as USB 3.1 gen 1. "10" markings, which stand for 10 Gbps, to identify USB 3.1 gen 2 ports with ultra-fast connectivity.',
    '“Global warming” refers to the rise in global temperatures due mainly to the increasing concentrations of greenhouse gases in the atmosphere. “Climate change” refers to the increasing changes in the measures of climate over a long period of time – including precipitation, temperature, and wind patterns.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
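
Because the Normalize module makes every embedding unit-length, cosine similarity reduces to a dot product and the model drops straight into semantic search: embed a query and a corpus, then rank by similarity. A small sketch with a made-up two-document corpus:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ayushexel/emb-all-MiniLM-L6-v2-gooaq-6-epochs")

query_embedding = model.encode(["how fast is usb 3.0?"])
corpus_embeddings = model.encode([
    "USB 3.0, also known as USB 3.1 gen 1, carries the SuperSpeed 'SS' marking.",
    "Climate change refers to long-term shifts in temperature and weather patterns.",
])

# Rank corpus entries by cosine similarity to the query
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 2]
best = scores.argmax().item()
print(best, scores[0, best].item())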

Evaluation

Metrics

Triplet

Metric           Value
cosine_accuracy  0.6018
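
The accuracy above is a triplet metric: the fraction of triplets where the anchor lands closer to its positive than to its negative under cosine similarity. A comparable check can be run with TripletEvaluator; the triplets below are toy placeholders, not the gooaq dev split that produced the reported 0.6018.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("ayushexel/emb-all-MiniLM-L6-v2-gooaq-6-epochs")

# Placeholder triplets for illustration only
evaluator = TripletEvaluator(
    anchors=["what is usb c ss?"],
    positives=["'SS' markings stand for SuperSpeed and identify USB 3.0 ports."],
    negatives=["Global warming refers to the rise in global temperatures."],
    name="toy-dev",
)
print(evaluator(model))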

Training Details

Training Dataset

gooaq

  • Dataset: gooaq at b089f72
  • Size: 1,995,000 training samples
  • Columns: question and answer
  • Approximate statistics based on the first 1000 samples:
                 question              answer
    type         string                string
    details      min: 8 tokens         min: 14 tokens
                 mean: 11.86 tokens    mean: 60.74 tokens
                 max: 23 tokens        max: 133 tokens
  • Samples:
    question: can twine be a noun?
    answer: noun. a strong thread or string composed of two or more strands twisted together. an act of twining, twisting, or interweaving.

    question: what is bo id in nsdl?
    answer: The demat account number allotted to the beneficiary holder(s) by DP is known as the BO-ID. In CDSL it is 16 digits number. It is an intermediary (an institution) between the investor and the depository.

    question: how much does it cost to run an electric fan all night?
    answer: The average indoor ceiling fan costs around 0.13c to 1.29c per hour to run, or between $1.90 and $18.85 each year. This will depend on the fan's speed settings, how frequently it's used, and the rate you pay on electricity. Like most electrical appliances, a ceiling fan's power is measured in watts.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
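
MultipleNegativesRankingLoss treats every other answer in the batch as an in-batch negative: cosine similarities between all question/answer pairs are scaled (here by 20.0) and scored with cross-entropy, with the true pair on the diagonal. A minimal sketch of that computation:

import torch
import torch.nn.functional as F

def mnr_loss(question_emb: torch.Tensor, answer_emb: torch.Tensor,
             scale: float = 20.0) -> torch.Tensor:
    # Cosine similarity between every question and every answer in the batch
    q = F.normalize(question_emb, dim=-1)
    a = F.normalize(answer_emb, dim=-1)
    scores = scale * (q @ a.T)  # (batch, batch)
    # The matching answer for question i sits on the diagonal
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

This is also why the no_duplicates batch sampler listed under Training Hyperparameters matters: a duplicate answer appearing twice in a batch would otherwise act as a false negative.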
    

Evaluation Dataset

gooaq

  • Dataset: gooaq at b089f72
  • Size: 5,000 evaluation samples
  • Columns: question and answer
  • Approximate statistics based on the first 1000 samples:
                 question              answer
    type         string                string
    details      min: 8 tokens         min: 14 tokens
                 mean: 11.8 tokens     mean: 60.68 tokens
                 max: 21 tokens        max: 123 tokens
  • Samples:
    question: how much water should a person drink in 8 hours?
    answer: Health authorities commonly recommend eight 8-ounce glasses, which equals about 2 liters, or half a gallon. This is called the 8×8 rule and is very easy to remember. However, some health gurus believe that you need to sip on water constantly throughout the day, even when you're not thirsty.

    question: what does this mean in excel #name?
    answer: Important: The #NAME? error signifies that something needs to be corrected in the syntax, so when you see the error in your formula, resolve it. Do not use any error-handling functions such as IFERROR to mask the error. To avoid typos in formula names, use the Formula Wizard in Excel.

    question: are hydroflask good for the environment?
    answer: Hydro Flasks are a new fad among many students and adults to help minimize plastic waste in the oceans. Hydro Flasks are great because they use a type of metal called TempShield, which keeps your beverage or food either hot for up to six hours or cold for up to twenty-four hours.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • num_train_epochs: 6
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
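
As a sketch, the non-default settings above map onto SentenceTransformerTrainingArguments roughly as follows; output_dir is a placeholder.

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    num_train_epochs=6,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)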

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss gooqa-dev_cosine_accuracy
-1 -1 - - 0.5378
0.0128 100 0.0766 - -
0.0257 200 0.0733 - -
0.0385 300 0.0697 - -
0.0513 400 0.0689 - -
0.0642 500 0.0693 - -
0.0770 600 0.07 - -
0.0898 700 0.0672 - -
0.1027 800 0.0636 - -
0.1155 900 0.0664 - -
0.1283 1000 0.0656 0.0454 0.5484
0.1412 1100 0.0648 - -
0.1540 1200 0.0643 - -
0.1668 1300 0.0613 - -
0.1796 1400 0.0617 - -
0.1925 1500 0.0609 - -
0.2053 1600 0.0622 - -
0.2181 1700 0.0628 - -
0.2310 1800 0.0616 - -
0.2438 1900 0.0632 - -
0.2566 2000 0.0614 0.0405 0.5446
0.2695 2100 0.0617 - -
0.2823 2200 0.0577 - -
0.2951 2300 0.0579 - -
0.3080 2400 0.0599 - -
0.3208 2500 0.056 - -
0.3336 2600 0.0574 - -
0.3465 2700 0.0595 - -
0.3593 2800 0.0581 - -
0.3721 2900 0.0588 - -
0.3850 3000 0.0576 0.0412 0.5386
0.3978 3100 0.0613 - -
0.4106 3200 0.0556 - -
0.4235 3300 0.0575 - -
0.4363 3400 0.0595 - -
0.4491 3500 0.0559 - -
0.4620 3600 0.0558 - -
0.4748 3700 0.0544 - -
0.4876 3800 0.0554 - -
0.5004 3900 0.0581 - -
0.5133 4000 0.0546 0.0406 0.5462
0.5261 4100 0.0559 - -
0.5389 4200 0.0535 - -
0.5518 4300 0.0533 - -
0.5646 4400 0.0565 - -
0.5774 4500 0.0562 - -
0.5903 4600 0.0555 - -
0.6031 4700 0.055 - -
0.6159 4800 0.0548 - -
0.6288 4900 0.0567 - -
0.6416 5000 0.0564 0.0374 0.5444
0.6544 5100 0.0541 - -
0.6673 5200 0.0538 - -
0.6801 5300 0.0541 - -
0.6929 5400 0.0585 - -
0.7058 5500 0.0539 - -
0.7186 5600 0.0551 - -
0.7314 5700 0.0534 - -
0.7443 5800 0.0517 - -
0.7571 5900 0.0525 - -
0.7699 6000 0.0503 0.0380 0.5482
0.7828 6100 0.0541 - -
0.7956 6200 0.0526 - -
0.8084 6300 0.0498 - -
0.8212 6400 0.0544 - -
0.8341 6500 0.0511 - -
0.8469 6600 0.0539 - -
0.8597 6700 0.05 - -
0.8726 6800 0.05 - -
0.8854 6900 0.0508 - -
0.8982 7000 0.0498 0.0347 0.5558
0.9111 7100 0.0481 - -
0.9239 7200 0.0509 - -
0.9367 7300 0.0481 - -
0.9496 7400 0.0487 - -
0.9624 7500 0.0508 - -
0.9752 7600 0.0477 - -
0.9881 7700 0.0482 - -
1.0009 7800 0.0469 - -
1.0137 7900 0.0436 - -
1.0266 8000 0.042 0.0333 0.5560
1.0394 8100 0.0408 - -
1.0522 8200 0.0417 - -
1.0651 8300 0.0449 - -
1.0779 8400 0.042 - -
1.0907 8500 0.0425 - -
1.1036 8600 0.043 - -
1.1164 8700 0.0389 - -
1.1292 8800 0.043 - -
1.1421 8900 0.0419 - -
1.1549 9000 0.0418 0.0336 0.5556
1.1677 9100 0.0435 - -
1.1805 9200 0.0404 - -
1.1934 9300 0.0406 - -
1.2062 9400 0.0406 - -
1.2190 9500 0.0431 - -
1.2319 9600 0.0385 - -
1.2447 9700 0.0428 - -
1.2575 9800 0.0424 - -
1.2704 9900 0.0412 - -
1.2832 10000 0.0389 0.0336 0.5688
1.2960 10100 0.0382 - -
1.3089 10200 0.0423 - -
1.3217 10300 0.0423 - -
1.3345 10400 0.039 - -
1.3474 10500 0.0404 - -
1.3602 10600 0.0383 - -
1.3730 10700 0.0416 - -
1.3859 10800 0.0403 - -
1.3987 10900 0.0408 - -
1.4115 11000 0.0388 0.0301 0.5752
1.4244 11100 0.039 - -
1.4372 11200 0.0391 - -
1.4500 11300 0.0399 - -
1.4629 11400 0.0387 - -
1.4757 11500 0.0387 - -
1.4885 11600 0.0395 - -
1.5013 11700 0.0424 - -
1.5142 11800 0.0382 - -
1.5270 11900 0.0393 - -
1.5398 12000 0.04 0.0320 0.5664
1.5527 12100 0.0369 - -
1.5655 12200 0.0385 - -
1.5783 12300 0.0414 - -
1.5912 12400 0.0406 - -
1.6040 12500 0.0407 - -
1.6168 12600 0.0378 - -
1.6297 12700 0.0393 - -
1.6425 12800 0.0389 - -
1.6553 12900 0.0407 - -
1.6682 13000 0.0413 0.0300 0.5674
1.6810 13100 0.0383 - -
1.6938 13200 0.0374 - -
1.7067 13300 0.0388 - -
1.7195 13400 0.0395 - -
1.7323 13500 0.0381 - -
1.7452 13600 0.039 - -
1.7580 13700 0.0403 - -
1.7708 13800 0.0381 - -
1.7837 13900 0.0389 - -
1.7965 14000 0.0387 0.0289 0.5726
1.8093 14100 0.0402 - -
1.8221 14200 0.0375 - -
1.8350 14300 0.0355 - -
1.8478 14400 0.0383 - -
1.8606 14500 0.037 - -
1.8735 14600 0.0363 - -
1.8863 14700 0.0361 - -
1.8991 14800 0.0375 - -
1.9120 14900 0.0381 - -
1.9248 15000 0.0387 0.0292 0.5776
1.9376 15100 0.0381 - -
1.9505 15200 0.0385 - -
1.9633 15300 0.0359 - -
1.9761 15400 0.039 - -
1.9890 15500 0.0379 - -
2.0018 15600 0.0364 - -
2.0146 15700 0.0302 - -
2.0275 15800 0.033 - -
2.0403 15900 0.0317 - -
2.0531 16000 0.0303 0.0276 0.5712
2.0660 16100 0.0311 - -
2.0788 16200 0.0328 - -
2.0916 16300 0.032 - -
2.1045 16400 0.0304 - -
2.1173 16500 0.0292 - -
2.1301 16600 0.0302 - -
2.1429 16700 0.0326 - -
2.1558 16800 0.0303 - -
2.1686 16900 0.0309 - -
2.1814 17000 0.0315 0.0279 0.5760
2.1943 17100 0.0329 - -
2.2071 17200 0.0303 - -
2.2199 17300 0.0318 - -
2.2328 17400 0.0312 - -
2.2456 17500 0.0321 - -
2.2584 17600 0.0303 - -
2.2713 17700 0.0314 - -
2.2841 17800 0.0297 - -
2.2969 17900 0.03 - -
2.3098 18000 0.0317 0.0268 0.5898
2.3226 18100 0.0333 - -
2.3354 18200 0.0286 - -
2.3483 18300 0.032 - -
2.3611 18400 0.0311 - -
2.3739 18500 0.0298 - -
2.3868 18600 0.0296 - -
2.3996 18700 0.032 - -
2.4124 18800 0.0301 - -
2.4253 18900 0.0303 - -
2.4381 19000 0.0296 0.0252 0.5816
2.4509 19100 0.0304 - -
2.4637 19200 0.0311 - -
2.4766 19300 0.0306 - -
2.4894 19400 0.0309 - -
2.5022 19500 0.0312 - -
2.5151 19600 0.0284 - -
2.5279 19700 0.0296 - -
2.5407 19800 0.0309 - -
2.5536 19900 0.0305 - -
2.5664 20000 0.0289 0.0264 0.5888
2.5792 20100 0.03 - -
2.5921 20200 0.0274 - -
2.6049 20300 0.0296 - -
2.6177 20400 0.0297 - -
2.6306 20500 0.0303 - -
2.6434 20600 0.0324 - -
2.6562 20700 0.0309 - -
2.6691 20800 0.031 - -
2.6819 20900 0.0286 - -
2.6947 21000 0.0286 0.0258 0.5854
2.7076 21100 0.0289 - -
2.7204 21200 0.0294 - -
2.7332 21300 0.0301 - -
2.7461 21400 0.0289 - -
2.7589 21500 0.0318 - -
2.7717 21600 0.0288 - -
2.7846 21700 0.0291 - -
2.7974 21800 0.0298 - -
2.8102 21900 0.0297 - -
2.8230 22000 0.0284 0.0266 0.5864
2.8359 22100 0.0294 - -
2.8487 22200 0.0284 - -
2.8615 22300 0.031 - -
2.8744 22400 0.0294 - -
2.8872 22500 0.0301 - -
2.9000 22600 0.0293 - -
2.9129 22700 0.0296 - -
2.9257 22800 0.029 - -
2.9385 22900 0.0292 - -
2.9514 23000 0.0305 0.0256 0.5876
2.9642 23100 0.0296 - -
2.9770 23200 0.0299 - -
2.9899 23300 0.0316 - -
3.0027 23400 0.0273 - -
3.0155 23500 0.0252 - -
3.0284 23600 0.0249 - -
3.0412 23700 0.0237 - -
3.0540 23800 0.0242 - -
3.0669 23900 0.0256 - -
3.0797 24000 0.0245 0.0245 0.5834
3.0925 24100 0.0248 - -
3.1054 24200 0.0251 - -
3.1182 24300 0.026 - -
3.1310 24400 0.0261 - -
3.1438 24500 0.0243 - -
3.1567 24600 0.0243 - -
3.1695 24700 0.0256 - -
3.1823 24800 0.0262 - -
3.1952 24900 0.0261 - -
3.2080 25000 0.0265 0.0249 0.5816
3.2208 25100 0.0276 - -
3.2337 25200 0.0237 - -
3.2465 25300 0.0285 - -
3.2593 25400 0.0234 - -
3.2722 25500 0.0252 - -
3.2850 25600 0.0249 - -
3.2978 25700 0.0245 - -
3.3107 25800 0.0254 - -
3.3235 25900 0.025 - -
3.3363 26000 0.0273 0.0243 0.5914
3.3492 26100 0.0234 - -
3.3620 26200 0.0247 - -
3.3748 26300 0.0244 - -
3.3877 26400 0.0242 - -
3.4005 26500 0.0264 - -
3.4133 26600 0.026 - -
3.4262 26700 0.0259 - -
3.4390 26800 0.0253 - -
3.4518 26900 0.0251 - -
3.4646 27000 0.0249 0.0234 0.5950
3.4775 27100 0.0257 - -
3.4903 27200 0.0259 - -
3.5031 27300 0.0262 - -
3.5160 27400 0.0252 - -
3.5288 27500 0.0268 - -
3.5416 27600 0.0245 - -
3.5545 27700 0.026 - -
3.5673 27800 0.0237 - -
3.5801 27900 0.0261 - -
3.5930 28000 0.0256 0.0226 0.5944
3.6058 28100 0.0256 - -
3.6186 28200 0.0243 - -
3.6315 28300 0.0262 - -
3.6443 28400 0.0265 - -
3.6571 28500 0.0237 - -
3.6700 28600 0.0243 - -
3.6828 28700 0.0247 - -
3.6956 28800 0.024 - -
3.7085 28900 0.0259 - -
3.7213 29000 0.0268 0.0238 0.5942
3.7341 29100 0.0255 - -
3.7470 29200 0.0249 - -
3.7598 29300 0.0244 - -
3.7726 29400 0.0258 - -
3.7854 29500 0.0253 - -
3.7983 29600 0.0246 - -
3.8111 29700 0.0245 - -
3.8239 29800 0.0269 - -
3.8368 29900 0.0256 - -
3.8496 30000 0.0242 0.0236 0.5976
3.8624 30100 0.0235 - -
3.8753 30200 0.025 - -
3.8881 30300 0.024 - -
3.9009 30400 0.0247 - -
3.9138 30500 0.0261 - -
3.9266 30600 0.0251 - -
3.9394 30700 0.0233 - -
3.9523 30800 0.0239 - -
3.9651 30900 0.0252 - -
3.9779 31000 0.0252 0.0228 0.5956
3.9908 31100 0.0236 - -
4.0036 31200 0.0253 - -
4.0164 31300 0.0223 - -
4.0293 31400 0.0224 - -
4.0421 31500 0.0229 - -
4.0549 31600 0.0234 - -
4.0678 31700 0.0223 - -
4.0806 31800 0.0225 - -
4.0934 31900 0.0239 - -
4.1062 32000 0.0221 0.0241 0.5892
4.1191 32100 0.0227 - -
4.1319 32200 0.0232 - -
4.1447 32300 0.0214 - -
4.1576 32400 0.0224 - -
4.1704 32500 0.0229 - -
4.1832 32600 0.0214 - -
4.1961 32700 0.0237 - -
4.2089 32800 0.0213 - -
4.2217 32900 0.0212 - -
4.2346 33000 0.0213 0.0230 0.5938
4.2474 33100 0.0225 - -
4.2602 33200 0.0237 - -
4.2731 33300 0.0236 - -
4.2859 33400 0.0217 - -
4.2987 33500 0.0212 - -
4.3116 33600 0.0227 - -
4.3244 33700 0.0223 - -
4.3372 33800 0.0211 - -
4.3501 33900 0.0224 - -
4.3629 34000 0.0231 0.0223 0.6000
4.3757 34100 0.0218 - -
4.3886 34200 0.0227 - -
4.4014 34300 0.0223 - -
4.4142 34400 0.0227 - -
4.4270 34500 0.022 - -
4.4399 34600 0.021 - -
4.4527 34700 0.0203 - -
4.4655 34800 0.0219 - -
4.4784 34900 0.0221 - -
4.4912 35000 0.0213 0.0219 0.5924
4.5040 35100 0.0229 - -
4.5169 35200 0.0217 - -
4.5297 35300 0.0207 - -
4.5425 35400 0.022 - -
4.5554 35500 0.0228 - -
4.5682 35600 0.0214 - -
4.5810 35700 0.0225 - -
4.5939 35800 0.0209 - -
4.6067 35900 0.0214 - -
4.6195 36000 0.0208 0.0218 0.5994
4.6324 36100 0.021 - -
4.6452 36200 0.0198 - -
4.6580 36300 0.021 - -
4.6709 36400 0.022 - -
4.6837 36500 0.0209 - -
4.6965 36600 0.0207 - -
4.7094 36700 0.0212 - -
4.7222 36800 0.0219 - -
4.7350 36900 0.0218 - -
4.7479 37000 0.0233 0.0217 0.5984
4.7607 37100 0.0227 - -
4.7735 37200 0.0218 - -
4.7863 37300 0.023 - -
4.7992 37400 0.0228 - -
4.8120 37500 0.0217 - -
4.8248 37600 0.0214 - -
4.8377 37700 0.0225 - -
4.8505 37800 0.0214 - -
4.8633 37900 0.0194 - -
4.8762 38000 0.0217 0.0217 0.6020
4.8890 38100 0.0218 - -
4.9018 38200 0.0225 - -
4.9147 38300 0.0218 - -
4.9275 38400 0.021 - -
4.9403 38500 0.0221 - -
4.9532 38600 0.0239 - -
4.9660 38700 0.0213 - -
4.9788 38800 0.0218 - -
4.9917 38900 0.0219 - -
5.0045 39000 0.0214 0.0206 0.5936
5.0173 39100 0.02 - -
5.0302 39200 0.0202 - -
5.0430 39300 0.02 - -
5.0558 39400 0.0198 - -
5.0687 39500 0.0189 - -
5.0815 39600 0.0207 - -
5.0943 39700 0.0201 - -
5.1071 39800 0.0211 - -
5.1200 39900 0.0209 - -
5.1328 40000 0.0206 0.0213 0.6004
5.1456 40100 0.0194 - -
5.1585 40200 0.0205 - -
5.1713 40300 0.0195 - -
5.1841 40400 0.0208 - -
5.1970 40500 0.0196 - -
5.2098 40600 0.0211 - -
5.2226 40700 0.019 - -
5.2355 40800 0.0208 - -
5.2483 40900 0.02 - -
5.2611 41000 0.0213 0.0209 0.6006
5.2740 41100 0.0195 - -
5.2868 41200 0.0208 - -
5.2996 41300 0.0215 - -
5.3125 41400 0.0195 - -
5.3253 41500 0.0194 - -
5.3381 41600 0.0201 - -
5.3510 41700 0.0208 - -
5.3638 41800 0.0192 - -
5.3766 41900 0.0205 - -
5.3895 42000 0.0201 0.0211 0.6012
5.4023 42100 0.0199 - -
5.4151 42200 0.0183 - -
5.4279 42300 0.0184 - -
5.4408 42400 0.0196 - -
5.4536 42500 0.0191 - -
5.4664 42600 0.0193 - -
5.4793 42700 0.019 - -
5.4921 42800 0.0205 - -
5.5049 42900 0.0193 - -
5.5178 43000 0.0189 0.0210 0.6028
5.5306 43100 0.0213 - -
5.5434 43200 0.0203 - -
5.5563 43300 0.0206 - -
5.5691 43400 0.0193 - -
5.5819 43500 0.019 - -
5.5948 43600 0.0186 - -
5.6076 43700 0.0192 - -
5.6204 43800 0.0198 - -
5.6333 43900 0.0207 - -
5.6461 44000 0.0208 0.0208 0.6038
5.6589 44100 0.0188 - -
5.6718 44200 0.0182 - -
5.6846 44300 0.0196 - -
5.6974 44400 0.0198 - -
5.7103 44500 0.0206 - -
5.7231 44600 0.0205 - -
5.7359 44700 0.0195 - -
5.7487 44800 0.0191 - -
5.7616 44900 0.0198 - -
5.7744 45000 0.0202 0.0208 0.6054
5.7872 45100 0.0198 - -
5.8001 45200 0.018 - -
5.8129 45300 0.0199 - -
5.8257 45400 0.0191 - -
5.8386 45500 0.0202 - -
5.8514 45600 0.0195 - -
5.8642 45700 0.0194 - -
5.8771 45800 0.0195 - -
5.8899 45900 0.0189 - -
5.9027 46000 0.0199 0.0208 0.6044
5.9156 46100 0.0192 - -
5.9284 46200 0.0204 - -
5.9412 46300 0.0187 - -
5.9541 46400 0.0184 - -
5.9669 46500 0.0202 - -
5.9797 46600 0.0191 - -
5.9926 46700 0.0196 - -
-1 -1 - - 0.6018

Framework Versions

  • Python: 3.11.0
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}