gist_small_ft_gooaq / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:200
  - loss:CoSENTLoss
base_model: avsolatorio/GIST-small-Embedding-v0
widget:
  - source_sentence: who is imf chief economist?
    sentences:
      - >-
        Metoprolol succinate is also known by the brand name Toprol XL. It is
        the extended-release form of metoprolol. Metoprolol succinate is
        approved to treat high blood pressure, chronic chest pain, and
        congestive heart failure.
      - >-
        He wants to confirm if he is talking to Priya or Angel Priya (I.e., if
        he is really talking to a girl or just a guy with fake profile) They are
        talking to you and want to see how you look. I found it normal but would
        say, be careful about whom do you share your picture with as they might
        misuse it. I hate this one.
      - >-
        A Dependent Care Flexible Spending Account, or “FSA,” is a pre-tax
        benefit account used to pay for dependent care services while you are at
        work. The money you contribute to a Dependent Care FSA is not subject to
        payroll taxes, so you end up paying less in taxes and taking home more
        of your paycheck.
  - source_sentence: is it possible to get a false negative flu test?
    sentences:
      - >-
        The saying "a piece of cake" means something that's simple to
        accomplish. If a school assignment is a piece of cake, it's so easy that
        you will barely have to think about it. Other ways to say "it's a piece
        of cake" include no problem or it's a breeze.
      - >-
        This variation in ability to detect viruses can result in some people
        who are infected with the flu having a negative rapid test result. (This
        situation is called a false negative test result.)
      - >-
        Unstable Wi-Fi is often caused by wireless congestion. Congestion
        problems are common in apartment complexes or densely packed
        neighborhoods. The more people using the internet, the greater the
        instability. When many people in the same area are working from home,
        connectivity suffers.
  - source_sentence: what are the requirements to become a health inspector?
    sentences:
      - >-
        You'll need an accredited health and safety qualification to become a
        health and safety inspector. Many recruiters ask for a NEBOSH diploma as
        it's accredited by the Institution of Occupational Health and Safety.
        This is a degree-level course that you can study at a variety of
        institutions, as well as online.
      - >-
        ['Open a PDF file in Acrobat DC.', 'Click on the “Export PDF” tool in
        the right pane.', 'Choose Microsoft Word as your export format, and then
        choose “Word Document.”', 'Click “Export.” If your PDF contains scanned
        text, the Acrobat Word converter will run text recognition
        automatically.']
      - >-
        ['Remain calm. ... ', "Don't take it personally. ... ", 'Use your best
        listening skills. ... ', 'Actively sympathize. ... ', 'Apologize
        gracefully. ... ', 'Find a solution. ... ', 'Take a few minutes on your
        own.']
  - source_sentence: is toprol xl the same as metoprolol?
    sentences:
      - >-
        Carbs: 35 grams. Fiber: 11 grams. Folate: 88% of the DV. Copper: 50% of
        the DV.
      - >-
        A Dependent Care Flexible Spending Account, or “FSA,” is a pre-tax
        benefit account used to pay for dependent care services while you are at
        work. The money you contribute to a Dependent Care FSA is not subject to
        payroll taxes, so you end up paying less in taxes and taking home more
        of your paycheck.
      - >-
        Metoprolol succinate is also known by the brand name Toprol XL. It is
        the extended-release form of metoprolol. Metoprolol succinate is
        approved to treat high blood pressure, chronic chest pain, and
        congestive heart failure.
  - source_sentence: how can i get copy of marriage license?
    sentences:
      - >-
        Probiotics can help with digestion Without probiotics, antibiotics can
        sometimes wipe out the protective gut bacteria, which is no good for
        your digestive system. Probiotics are thought to directly kill or
        inhibit the growth of harmful bacteria, stopping them from producing
        toxic substances that can make you ill.
      - >-
        Order in person You can order a certificate in person from Monday to
        Friday between 9am and 5pm. Please come to the register office at 45
        Beavor Lane, Hammersmith, London W6 9AR.
      - >-
        Worms and ants are more related because spiders contain hair and ants do
        not. Worms do not contain hair as well.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0

This is a sentence-transformers model fine-tuned from avsolatorio/GIST-small-Embedding-v0. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: avsolatorio/GIST-small-Embedding-v0
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
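
The three modules above can be read as a pipeline: the Transformer produces one vector per token, Pooling with `pooling_mode_cls_token` keeps only the first (`[CLS]`) token's vector, and Normalize rescales it to unit L2 length. The following is a minimal plain-Python sketch of the last two steps. It uses toy 4-dimensional vectors in place of the real 384-dimensional BERT outputs, and the function names `cls_pool` and `l2_normalize` are illustrative, not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector (the [CLS] position in BERT-style models)."""
    return list(token_embeddings[0])

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Toy token embeddings for one sentence: one row per token, [CLS] first.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],   # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, -0.5, 2.0, 0.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output is already unit length, the cosine similarity between two such embeddings reduces to their dot product.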

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("moshew/gist_small_ft_gooaq")
# Run inference
sentences = [
    'how can i get copy of marriage license?',
    'Order in person You can order a certificate in person from Monday to Friday between 9am and 5pm. Please come to the register office at 45 Beavor Lane, Hammersmith, London W6 9AR.',
    'Worms and ants are more related because spiders contain hair and ants do not. Worms do not contain hair as well.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
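
For semantic search, the usual next step is to rank candidate passages by their similarity to a query. The sketch below shows that ranking step in plain Python, under the assumption that embeddings are available as lists of floats. The toy 3-dimensional vectors stand in for real `model.encode` output, and `cosine_similarity` and `rank_passages` are hypothetical helper names, not library functions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumed values, not produced by the model).
query = [0.6, 0.8, 0.0]
passages = [
    [0.0, 0.0, 1.0],   # points in an unrelated direction
    [0.8, 0.6, 0.0],   # points close to the query
]

ranking = rank_passages(query, passages)
best_index, best_score = ranking[0]
print(best_index)  # 1
```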

Training Details

Training Dataset

Unnamed Dataset

  • Size: 200 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 200 samples:
    |         | sentence1 | sentence2 | label |
    |:--------|:----------|:----------|:------|
    | type    | string    | string    | float |
    | details | min: 8 tokens, mean: 11.8 tokens, max: 23 tokens | min: 18 tokens, mean: 61.8 tokens, max: 125 tokens | min: 0.0, mean: 0.5, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | how many days can i drive my car without mot? | If your car fails its MOT you can only continue to drive it if the previous year's MOT is still valid - which might occur if you submitted the car for its test two weeks early. You can still drive it away from the testing centre or garage if no 'dangerous' problems were identified during the MOT. | 1.0 |
    | how many days can i drive my car without mot? | Low-FODMAP vegetables include: Bean sprouts, capsicum, carrot, choy sum, eggplant, kale, tomato, spinach and zucchini ( 7 , 8 ). Summary: Vegetables contain a diverse range of FODMAPs. However, many vegetables are naturally low in FODMAPs. | 0.0 |
    | what are underlying shares of stock? | Underlying Shares means the shares of Common Stock issued and issuable upon conversion of the Preferred Stock, upon exercise of the Warrants and issued and issuable in lieu of the cash payment of dividends on the Preferred Stock in accordance with the terms of the Certificate of Designation. | 1.0 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
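
As a rough illustration of what this loss computes, the sketch below follows the formula described in the CoSENT reference cited at the end of this card. For every two pairs in a batch where the first has a higher label than the second, it penalises the second pair's cosine similarity exceeding the first's, multiplied by `scale`. This is a plain-Python sketch for intuition, not the library's implementation; the function name `cosent_loss` and the toy similarity values are made up for the example.

```python
import math

def cosent_loss(cos_sims, labels, scale=20.0):
    """CoSENT-style ranking loss over a batch of sentence pairs.

    cos_sims[i] is the cosine similarity predicted for pair i and
    labels[i] is its gold score. The loss is
    log(1 + sum(exp(scale * (cos_sims[j] - cos_sims[i]))))
    over all i, j with labels[i] > labels[j].
    """
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the log
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                terms.append(scale * (cos_sims[j] - cos_sims[i]))
    # Numerically stable log-sum-exp.
    largest = max(terms)
    return largest + math.log(sum(math.exp(t - largest) for t in terms))

# Two pairs: a matching question/answer (label 1.0) and a mismatched one (label 0.0).
labels = [1.0, 0.0]
well_ordered = cosent_loss([0.9, 0.1], labels)  # positive pair scores higher: loss close to 0
mis_ordered = cosent_loss([0.1, 0.9], labels)   # negative pair scores higher: loss close to 16
print(well_ordered, mis_ordered)
```

Only the ordering of similarities relative to the labels matters here, so the loss does not push positive pairs toward any particular absolute similarity value.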
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • half_precision_backend: cpu_amp
  • dataloader_num_workers: 4
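
To make `warmup_ratio: 0.1` concrete, the sketch below combines it with the `learning_rate: 5e-05` and `lr_scheduler_type: linear` values listed under All Hyperparameters. It assumes 13 optimizer steps in total, that is ceil(200 / 16) for a single device with no gradient accumulation, and that warmup steps are the total step count times the ratio, rounded up. The function name `linear_schedule_lr` is illustrative; this is a sketch of a linear warmup and decay schedule, not code taken from the training run.

```python
import math

def linear_schedule_lr(step, total_steps=13, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 2 steps for 13 total
    if step < warmup_steps:
        multiplier = step / max(1, warmup_steps)
    else:
        remaining = max(0, total_steps - step)
        multiplier = remaining / max(1, total_steps - warmup_steps)
    return base_lr * multiplier

for step in (0, 1, 2, 13):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate climbs from zero to its 5e-05 peak over the first two steps and then decays linearly back to zero by step 13.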

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: cpu_amp
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

| Epoch  | Step | Training Loss |
|:-------|:-----|:--------------|
| 0.0769 | 1    | 0.2709        |
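
The epoch value logged at step 1 can be checked against the dataset and batch sizes reported in this card. The sketch below assumes a single device, `gradient_accumulation_steps: 1`, and `dataloader_drop_last: False`, so one epoch takes ceil(200 / 16) = 13 steps. The helper name `epoch_at_step` is made up for the example.

```python
import math

def epoch_at_step(step, num_samples, batch_size):
    """Fraction of an epoch completed after `step` optimizer steps."""
    steps_per_epoch = math.ceil(num_samples / batch_size)  # the last partial batch is kept
    return step / steps_per_epoch

print(round(epoch_at_step(1, num_samples=200, batch_size=16), 4))  # 0.0769
```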

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.1
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}