diff --git a/index.faiss b/index.faiss index 498f50a877a9001f92dd69e256ee2844cecce54e..57faf696f1bf59f4029f0ea30bf64b36e93848dd 100644 --- a/index.faiss +++ b/index.faiss @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:78a068ac98a5de614955c9c1e307b40f7b403bd46d315cf3b583f22466bf5e7a -size 3545133 +oid sha256:9c3f83a20b1e8774d976546f71114c9bd9195f65e2e101a24bf21a31b9945b0c +size 4187181 diff --git a/index_to_metadata.pkl b/index_to_metadata.pkl index 8cb94ccb77d189c08a76714e23fb550c0b595e8f..e72fbc4a72a31804e985cd2d71cbfc3353e75044 100644 --- a/index_to_metadata.pkl +++ b/index_to_metadata.pkl @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:fe076a7a46265654075c23b846b259d7beaef163f5a5c72d14246a0e3d73579f -size 530243 +oid sha256:febfb797520f91f6872504c1bdcbe58d5446caf4da8e9bdcd38c99bab4882e26 +size 689653 diff --git a/model_data_json/Alibaba-NLP_gte-modernbert-base.json b/model_data_json/Alibaba-NLP_gte-modernbert-base.json new file mode 100644 index 0000000000000000000000000000000000000000..63901e78b106dd55649ae3969248536cb0f3bd02 --- /dev/null +++ b/model_data_json/Alibaba-NLP_gte-modernbert-base.json @@ -0,0 +1,26 @@ +{ + "model_id": "Alibaba-NLP/gte-modernbert-base", + "downloads": 69419, + "tags": [ + "transformers", + "pytorch", + "onnx", + "safetensors", + "modernbert", + "feature-extraction", + "sentence-transformers", + "mteb", + "embedding", + "transformers.js", + "sentence-similarity", + "en", + "arxiv:2308.03281", + "base_model:answerdotai/ModernBERT-base", + "base_model:finetune:answerdotai/ModernBERT-base", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 language: - en base_model: - answerdotai/ModernBERT-base base_model_relation: finetune pipeline_tag: sentence-similarity library_name: transformers tags: - sentence-transformers - mteb - embedding - transformers.js --- # gte-modernbert-base We are excited to introduce the gte-modernbert series of models, built upon the latest ModernBERT pre-trained encoder-only foundation models. The series includes both text embedding models and reranking models. The models demonstrate competitive performance on several text embedding and text retrieval evaluation tasks when compared to similar-scale models from the current open-source community, including evaluations such as MTEB, LoCo, and CoIR. ## Model Overview - Developed by: Tongyi Lab, Alibaba Group - Model Type: Text Embedding - Primary Language: English - Model Size: 149M - Max Input Length: 8192 tokens - Output Dimension: 768 ### Model list | Models | Language | Model Type | Model Size | Max Seq. Length | Dimension | MTEB-en | BEIR | LoCo | CoIR | |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | gte-modernbert-base | English | text embedding | 149M | 8192 | 768 | 64.38 | 55.33 | 87.57 | 79.31 | | gte-reranker-modernbert-base | English | text reranker | 149M | 8192 | - | - | 56.19 | 90.68 | 79.99 | ## Usage > [!TIP] > For gte-modernbert-base and gte-reranker-modernbert-base, if your GPU supports it, the efficient Flash Attention 2 will be used automatically if you have flash_attn installed. It is not mandatory. The models can be used with transformers, sentence-transformers, or transformers.js; a minimal sketch is shown below. ## Training Details The gte-modernbert series follows the training scheme of the previous GTE models, with the only difference being that the pre-training language model base has been switched from GTE-MLM to ModernBERT.
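As referenced in the Usage section above, here is a minimal usage sketch for the embedding model. It assumes sentence-transformers is installed; the model id comes from this card, while the example sentences and the cosine-similarity helper are illustrative choices, not the card's original snippet.

```python
# Minimal sketch, assuming sentence-transformers is installed
# (pip install sentence-transformers). Example sentences are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

sentences = [
    "what is the capital of China?",
    "Beijing is the capital of China.",
    "how to implement quick sort in python?",
]

# Encode to 768-dimensional embeddings (inputs up to 8192 tokens).
embeddings = model.encode(sentences)

# Cosine similarity between the first sentence and the other two.
scores = util.cos_sim(embeddings[0], embeddings[1:])
print(scores)
```

The reranker variant, gte-reranker-modernbert-base, scores query-document pairs directly rather than producing embeddings.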
For more training details, please refer to our paper: mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval ## Evaluation ### MTEB The results of other models are retrieved from the MTEB leaderboard. Since all models in the series have fewer than 1B parameters, we focus exclusively on the results of models under 1B from the MTEB leaderboard. | Model Name | Param Size (M) | Dimension | Sequence Length | Average (56) | Class. (12) | Clust. (11) | Pair Class. (3) | Reran. (4) | Retr. (15) | STS (10) | Summ. (1) | |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | mxbai-embed-large-v1 | 335 | 1024 | 512 | 64.68 | 75.64 | 46.71 | 87.2 | 60.11 | 54.39 | 85 | 32.71 | | multilingual-e5-large-instruct | 560 | 1024 | 514 | 64.41 | 77.56 | 47.1 | 86.19 | 58.58 | 52.47 | 84.78 | 30.39 | | bge-large-en-v1.5 | 335 | 1024 | 512 | 64.23 | 75.97 | 46.08 | 87.12 | 60.03 | 54.29 | 83.11 | 31.61 | | gte-base-en-v1.5 | 137 | 768 | 8192 | 64.11 | 77.17 | 46.82 | 85.33 | 57.66 | 54.09 | 81.97 | 31.17 | | bge-base-en-v1.5 | 109 | 768 | 512 | 63.55 | 75.53 | 45.77 | 86.55 | 58.86 | 53.25 | 82.4 | 31.07 | | gte-large-en-v1.5 | 409 | 1024 | 8192 | 65.39 | 77.75 | 47.95 | 84.63 | 58.50 | 57.91 | 81.43 | 30.91 | | modernbert-embed-base | 149 | 768 | 8192 | 62.62 | 74.31 | 44.98 | 83.96 | 56.42 | 52.89 | 81.78 | 31.39 | | nomic-embed-text-v1.5 | | 768 | 8192 | 62.28 | 73.55 | 43.93 | 84.61 | 55.78 | 53.01 | 81.94 | 30.4 | | gte-multilingual-base | 305 | 768 | 8192 | 61.4 | 70.89 | 44.31 | 84.24 | 57.47 | 51.08 | 82.11 | 30.58 | | jina-embeddings-v3 | 572 | 1024 | 8192 | 65.51 | 82.58 | 45.21 | 84.01 | 58.13 | 53.88 | 85.81 | 29.71 | | **gte-modernbert-base** | 149 | 768 | 8192 | **64.38** | **76.99** | **46.47** | **85.93** | **59.24** | **55.33** | **81.57** | **30.68** | ### LoCo (Long Document Retrieval) (NDCG@10) | Model Name | Dimension | Sequence Length | Average (5) | QsmsumRetrieval | SummScreenRetrieval | QasperAbstractRetrieval | QasperTitleRetrieval | GovReportRetrieval | |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | gte-qwen1.5-7b | 4096 | 32768 | 87.57 | 49.37 | 93.10 | 99.67 | 97.54 | 98.21 | | gte-large-v1.5 | 1024 | 8192 | 86.71 | 44.55 | 92.61 | 99.82 | 97.81 | 98.74 | | gte-base-v1.5 | 768 | 8192 | 87.44 | 49.91 | 91.78 | 99.82 | 97.13 | 98.58 | | gte-modernbert-base | 768 | 8192 | 88.88 | 54.45 | 93.00 | 99.82 | 98.03 | 98.70 | | gte-reranker-modernbert-base | - | 8192 | 90.68 | 70.86 | 94.06 | 99.73 | 99.11 | 89.67 | ### CoIR (Code Retrieval Task) (NDCG@10) | Model Name | Dimension | Sequence Length | Average (20) | CodeSearchNet-ccr-go | CodeSearchNet-ccr-java | CodeSearchNet-ccr-javascript | CodeSearchNet-ccr-php | CodeSearchNet-ccr-python | CodeSearchNet-ccr-ruby | CodeSearchNet-go | CodeSearchNet-java | CodeSearchNet-javascript | CodeSearchNet-php | CodeSearchNet-python | CodeSearchNet-ruby | apps | codefeedback-mt | codefeedback-st | codetrans-contest | codetrans-dl | cosqa | stackoverflow-qa | synthetic-text2sql | |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | gte-modernbert-base | 768 | 8192 | 79.31 | 94.15 | 93.57 | 94.27 | 91.51 | 93.93 | 90.63 | 88.32 | 83.27 | 76.05 | 85.12 | 88.16 | 77.59 | 57.54 | 82.34 | 85.95
| 71.89 | 35.46 | 43.47 | 91.2 | 61.87 | | gte-reranker-modernbert-base | - | 8192 | 79.99 | 96.43 | 96.88 | 98.32 | 91.81 | 97.7 | 91.96 | 88.81 | 79.71 | 76.27 | 89.39 | 98.37 | 84.11 | 47.57 | 83.37 | 88.91 | 49.66 | 36.36 | 44.37 | 89.58 | 64.21 | ### BEIR (NDCG@10) | Model Name | Dimension | Sequence Length | Average (15) | ArguAna | ClimateFEVER | CQADupstackAndroidRetrieval | DBPedia | FEVER | FiQA2018 | HotpotQA | MSMARCO | NFCorpus | NQ | QuoraRetrieval | SCIDOCS | SciFact | Touche2020 | TRECCOVID | |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | gte-modernbert-base | 768 | 8192 | 55.33 | 72.68 | 37.74 | 42.63 | 41.79 | 91.03 | 48.81 | 69.47 | 40.9 | 36.44 | 57.62 | 88.55 | 21.29 | 77.4 | 21.68 | 81.95 | | gte-reranker-modernbert-base | - | 8192 | 56.73 | 69.03 | 37.79 | 44.68 | 47.23 | 94.54 | 49.81 | 78.16 | 45.38 | 30.69 | 64.57 | 87.77 | 20.60 | 73.57 | 27.36 | 79.89 | ## Hiring We have open positions for **Research Interns** and **Full-Time Researchers** to join our team at Tongyi Lab. We are seeking passionate individuals with expertise in representation learning, LLM-driven information retrieval, Retrieval-Augmented Generation (RAG), and agent-based systems. Our team is located in the vibrant cities of **Beijing** and **Hangzhou**. If you are driven by curiosity and eager to make a meaningful impact through your work, we would love to hear from you. Please submit your resume along with a brief introduction to dingkun.ldk@alibaba-inc.com. ## Citation If you find our paper or models helpful, please feel free to cite us.", + "model_explanation_gemini": "Generates English text embeddings for sentence similarity and retrieval tasks, achieving competitive performance in MTEB, LoCo, and CoIR evaluations compared to similar-scale models.
\n\n**Features:** \n- **Model Type:** Text Embedding \n- **Base Model:** ModernBERT-base (finetuned) \n- **Language:** English \n- **Size:** 149M parameters \n- **Max Input Length:** 8192 tokens \n- **Output Dimension:** 768 \n- **Key Metrics:** MT" +} \ No newline at end of file diff --git a/model_data_json/BAAI_bge-multilingual-gemma2.json b/model_data_json/BAAI_bge-multilingual-gemma2.json new file mode 100644 index 0000000000000000000000000000000000000000..8a7d6eca228577cf7e238f0f4d1b9c79c5dcc736 --- /dev/null +++ b/model_data_json/BAAI_bge-multilingual-gemma2.json @@ -0,0 +1,23 @@ +{ + "model_id": "BAAI/bge-multilingual-gemma2", + "downloads": 76038, + "tags": [ + "sentence-transformers", + "safetensors", + "gemma2", + "feature-extraction", + "sentence-similarity", + "transformers", + "mteb", + "arxiv:2402.03216", + "arxiv:2309.07597", + "license:gemma", + "model-index", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - feature-extraction - sentence-similarity - sentence-transformers - transformers - mteb license: gemma model-index: - name: bge-multilingual-gemma2 results: - task: type: Retrieval dataset: type: mteb/nfcorpus name: MTEB NFCorpus config: default split: test revision: ec0fa4fe99da2ff19ca1214b7966684033a58814 metrics: - type: main_score value: 38.11433513284057 - type: ndcg_at_1 value: 48.45201238390093 - type: ndcg_at_3 value: 44.451438575534574 - type: ndcg_at_5 value: 41.13929990797894 - type: ndcg_at_10 value: 38.11433513284057 - type: ndcg_at_100 value: 35.36065387898559 - type: ndcg_at_1000 value: 44.01125752781003 - type: map_at_1 value: 5.638004398054564 - type: map_at_3 value: 10.375632572339333 - type: map_at_5 value: 11.820531148202422 - type: map_at_10 value: 14.087436978063389 - type: map_at_100 value: 18.25397463114958 - type: map_at_1000 value: 19.868440221606203 - type: precision_at_1 value: 49.84520123839009 - type: precision_at_3 value: 41.89886480908153 - type: precision_at_5 value: 35.356037151702814 - type: precision_at_10 value: 28.513931888544857 - type: precision_at_100 value: 9.337461300309604 - type: precision_at_1000 value: 2.210216718266251 - type: recall_at_1 value: 5.638004398054564 - type: recall_at_3 value: 11.938154656310312 - type: recall_at_5 value: 14.06183119422843 - type: recall_at_10 value: 18.506397834147705 - type: recall_at_100 value: 35.96995569451433 - type: recall_at_1000 value: 68.31771509404795 - task: type: Retrieval dataset: type: mteb/msmarco name: MTEB MSMARCO config: default split: dev revision: c5a29a104738b98a9e76336939199e264163d4a0 metrics: - type: main_score value: 45.70688915742828 - type: ndcg_at_1 value: 26.002865329512893 - type: ndcg_at_3 value: 37.49665652114275 - type: ndcg_at_5 value: 41.684045067615834 - type: ndcg_at_10 value: 45.70688915742828 - type: ndcg_at_100 value: 51.08932609519671 - type: ndcg_at_1000 value: 51.98806137292924 - type: map_at_1 value: 25.35219675262655 - type: map_at_3 value: 34.39549506526583 - type: map_at_5 value: 36.74936326010824 - type: map_at_10 value: 38.44429852488596 - type: map_at_100 value: 39.60260286311527 - type: map_at_1000 value: 39.64076154054021 - type: precision_at_1 value: 26.002865329512893 - type: precision_at_3 value: 15.840496657115954 - type: precision_at_5 value: 11.647564469914684 - type: precision_at_10 value: 7.1275071633243705 - type: precision_at_100 value: 0.9782234957019871 - type: precision_at_1000 value: 0.10565902578797497 - type: recall_at_1 value: 25.35219675262655 
- type: recall_at_3 value: 45.78438395415474 - type: recall_at_5 value: 55.83213944603631 - type: recall_at_10 value: 68.08500477554918 - type: recall_at_100 value: 92.55133715377269 - type: recall_at_1000 value: 99.29083094555875 - task: type: Retrieval dataset: type: mteb/fiqa name: MTEB FiQA2018 config: default split: test revision: 27a168819829fe9bcd655c2df245fb19452e8e06 metrics: - type: main_score value: 60.04205769404706 - type: ndcg_at_1 value: 59.25925925925925 - type: ndcg_at_3 value: 55.96637679199298 - type: ndcg_at_5 value: 56.937223390223956 - type: ndcg_at_10 value: 60.04205769404706 - type: ndcg_at_100 value: 66.01619664462949 - type: ndcg_at_1000 value: 67.59651529720728 - type: map_at_1 value: 31.5081163692275 - type: map_at_3 value: 45.7486689836227 - type: map_at_5 value: 48.944906602314 - type: map_at_10 value: 51.85427043799874 - type: map_at_100 value: 53.92920237379484 - type: map_at_1000 value: 54.04694438963671 - type: precision_at_1 value: 59.25925925925925 - type: precision_at_3 value: 37.44855967078195 - type: precision_at_5 value: 26.913580246913547 - type: precision_at_10 value: 16.52777777777774 - type: precision_at_100 value: 2.2962962962962754 - type: precision_at_1000 value: 0.2566358024691334 - type: recall_at_1 value: 31.5081163692275 - type: recall_at_3 value: 50.71759045138676 - type: recall_at_5 value: 57.49321152098932 - type: recall_at_10 value: 67.36356750245642 - type: recall_at_100 value: 88.67335767798735 - type: recall_at_1000 value: 97.83069725199356 - task: type: Retrieval dataset: type: mteb/scidocs name: MTEB SCIDOCS config: default split: test revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88 metrics: - type: main_score value: 26.93150756480961 - type: ndcg_at_1 value: 30.8 - type: ndcg_at_3 value: 25.048085553386628 - type: ndcg_at_5 value: 22.351207380852305 - type: ndcg_at_10 value: 26.93150756480961 - type: ndcg_at_100 value: 37.965486832874014 - type: ndcg_at_1000 value: 43.346046425140244 - type: map_at_1 value: 6.238333333333366 - type: map_at_3 value: 11.479166666666679 - type: map_at_5 value: 14.215999999999983 - type: map_at_10 value: 16.774632936507945 - type: map_at_100 value: 20.148869158557293 - type: map_at_1000 value: 20.528644104490823 - type: precision_at_1 value: 30.8 - type: precision_at_3 value: 23.466666666666736 - type: precision_at_5 value: 19.899999999999967 - type: precision_at_10 value: 14.069999999999938 - type: precision_at_100 value: 2.9770000000000065 - type: precision_at_1000 value: 0.42569999999999486 - type: recall_at_1 value: 6.238333333333366 - type: recall_at_3 value: 14.29333333333338 - type: recall_at_5 value: 20.206666666666628 - type: recall_at_10 value: 28.573333333333224 - type: recall_at_100 value: 60.43666666666675 - type: recall_at_1000 value: 86.3649999999997 - task: type: Retrieval dataset: type: mteb/fever name: MTEB FEVER config: default split: test revision: bea83ef9e8fb933d90a2f1d5515737465d613e12 metrics: - type: main_score value: 90.38165339181239 - type: ndcg_at_1 value: 84.86348634863486 - type: ndcg_at_3 value: 88.98667069230609 - type: ndcg_at_5 value: 89.86028996734895 - type: ndcg_at_10 value: 90.38165339181239 - type: ndcg_at_100 value: 90.99655378684439 - type: ndcg_at_1000 value: 91.15536362599602 - type: map_at_1 value: 78.8556296105801 - type: map_at_3 value: 86.24061810942983 - type: map_at_5 value: 86.94776680048933 - type: map_at_10 value: 87.26956235873007 - type: map_at_100 value: 87.47986397174834 - type: map_at_1000 value: 87.4897076664281 - type: precision_at_1 
value: 84.86348634863486 - type: precision_at_3 value: 34.02340234023296 - type: precision_at_5 value: 21.10411041104359 - type: precision_at_10 value: 10.828082808282083 - type: precision_at_100 value: 1.1381638163816703 - type: precision_at_1000 value: 0.11662166216622569 - type: recall_at_1 value: 78.8556296105801 - type: recall_at_3 value: 92.34465708475605 - type: recall_at_5 value: 94.58010682020583 - type: recall_at_10 value: 96.10713452297611 - type: recall_at_100 value: 98.31672452959585 - type: recall_at_1000 value: 99.25967001462051 - task: type: Retrieval dataset: type: mteb/arguana name: MTEB ArguAna config: default split: test revision: c22ab2a51041ffd869aaddef7af8d8215647e41a metrics: - type: main_score value: 77.36555747844541 - type: ndcg_at_1 value: 57.681365576102415 - type: ndcg_at_3 value: 72.01664798084765 - type: ndcg_at_5 value: 75.26345973082836 - type: ndcg_at_10 value: 77.36555747844541 - type: ndcg_at_100 value: 78.15567833673768 - type: ndcg_at_1000 value: 78.16528851292641 - type: map_at_1 value: 57.681365576102415 - type: map_at_3 value: 68.59886201991475 - type: map_at_5 value: 70.38051209103858 - type: map_at_10 value: 71.26684955632336 - type: map_at_100 value: 71.4637216600468 - type: map_at_1000 value: 71.46414501573332 - type: precision_at_1 value: 57.681365576102415 - type: precision_at_3 value: 27.287814129919084 - type: precision_at_5 value: 17.965860597439132 - type: precision_at_10 value: 9.623044096728066 - type: precision_at_100 value: 0.995732574679925 - type: precision_at_1000 value: 0.09964438122332549 - type: recall_at_1 value: 57.681365576102415 - type: recall_at_3 value: 81.86344238975818 - type: recall_at_5 value: 89.82930298719772 - type: recall_at_10 value: 96.23044096728307 - type: recall_at_100 value: 99.57325746799431 - type: recall_at_1000 value: 99.6443812233286 - task: type: Retrieval dataset: type: mteb/scifact name: MTEB SciFact config: default split: test revision: 0228b52cf27578f30900b9e5271d331663a030d7 metrics: - type: main_score value: 72.0465439956427 - type: ndcg_at_1 value: 58.666666666666664 - type: ndcg_at_3 value: 66.84566274610046 - type: ndcg_at_5 value: 69.46578881873717 - type: ndcg_at_10 value: 72.0465439956427 - type: ndcg_at_100 value: 74.25705461923272 - type: ndcg_at_1000 value: 74.63689058493014 - type: map_at_1 value: 55.59444444444445 - type: map_at_3 value: 63.71851851851852 - type: map_at_5 value: 65.5362962962963 - type: map_at_10 value: 66.84112433862435 - type: map_at_100 value: 67.36269426417417 - type: map_at_1000 value: 67.37568665562833 - type: precision_at_1 value: 58.666666666666664 - type: precision_at_3 value: 26.444444444444425 - type: precision_at_5 value: 17.66666666666672 - type: precision_at_10 value: 9.866666666666706 - type: precision_at_100 value: 1.0966666666666596 - type: precision_at_1000 value: 0.11266666666666675 - type: recall_at_1 value: 55.59444444444445 - type: recall_at_3 value: 72.72777777777777 - type: recall_at_5 value: 79.31666666666666 - type: recall_at_10 value: 86.75 - type: recall_at_100 value: 96.66666666666667 - type: recall_at_1000 value: 99.66666666666667 - task: type: Retrieval dataset: type: mteb/trec-covid name: MTEB TRECCOVID config: default split: test revision: bb9466bac8153a0349341eb1b22e06409e78ef4e metrics: - type: main_score value: 64.26928884606035 - type: ndcg_at_1 value: 63.0 - type: ndcg_at_3 value: 64.18432764386345 - type: ndcg_at_5 value: 64.73235515799435 - type: ndcg_at_10 value: 64.26928884606035 - type: ndcg_at_100 value: 52.39807133285409 - 
type: ndcg_at_1000 value: 52.19937563361241 - type: map_at_1 value: 0.18483494997310454 - type: map_at_3 value: 0.5139705769331114 - type: map_at_5 value: 0.8245601222717243 - type: map_at_10 value: 1.5832530269558573 - type: map_at_100 value: 9.664760850102393 - type: map_at_1000 value: 25.568347406468334 - type: precision_at_1 value: 70.0 - type: precision_at_3 value: 71.33333333333333 - type: precision_at_5 value: 71.60000000000001 - type: precision_at_10 value: 70.99999999999996 - type: precision_at_100 value: 55.140000000000015 - type: precision_at_1000 value: 23.857999999999997 - type: recall_at_1 value: 0.18483494997310454 - type: recall_at_3 value: 0.5584287301859913 - type: recall_at_5 value: 0.9489025953807098 - type: recall_at_10 value: 1.9023711039425688 - type: recall_at_100 value: 13.596810701594226 - type: recall_at_1000 value: 50.92058432920189 - task: type: Retrieval dataset: type: mteb/climate-fever name: MTEB ClimateFEVER config: default split: test revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380 metrics: - type: main_score value: 39.37204193531481 - type: ndcg_at_1 value: 35.11400651465798 - type: ndcg_at_3 value: 32.36672790229743 - type: ndcg_at_5 value: 34.79369234162357 - type: ndcg_at_10 value: 39.37204193531481 - type: ndcg_at_100 value: 47.544500439419124 - type: ndcg_at_1000 value: 50.305733346049855 - type: map_at_1 value: 15.516829533116216 - type: map_at_3 value: 23.73669923995656 - type: map_at_5 value: 26.43208469055373 - type: map_at_10 value: 28.912036175309773 - type: map_at_100 value: 31.413762299240894 - type: map_at_1000 value: 31.596796093997014 - type: precision_at_1 value: 35.11400651465798 - type: precision_at_3 value: 24.994571118349487 - type: precision_at_5 value: 19.231270358305956 - type: precision_at_10 value: 12.690553745928165 - type: precision_at_100 value: 2.1576547231270466 - type: precision_at_1000 value: 0.2676221498371306 - type: recall_at_1 value: 15.516829533116216 - type: recall_at_3 value: 29.994571118349512 - type: recall_at_5 value: 37.14223669923993 - type: recall_at_10 value: 47.29207383279043 - type: recall_at_100 value: 74.37133550488598 - type: recall_at_1000 value: 89.41585233441913 - task: type: Retrieval dataset: type: mteb/hotpotqa name: MTEB HotpotQA config: default split: test revision: ab518f4d6fcca38d87c25209f94beba119d02014 metrics: - type: main_score value: 83.26282954330777 - type: ndcg_at_1 value: 87.5489534098582 - type: ndcg_at_3 value: 78.7646435855166 - type: ndcg_at_5 value: 81.41629077444277 - type: ndcg_at_10 value: 83.26282954330777 - type: ndcg_at_100 value: 85.2771369900158 - type: ndcg_at_1000 value: 85.77519303747493 - type: map_at_1 value: 43.7744767049291 - type: map_at_3 value: 73.4661264911093 - type: map_at_5 value: 75.7169705154168 - type: map_at_10 value: 76.89183627536043 - type: map_at_100 value: 77.53680315727078 - type: map_at_1000 value: 77.5649311522075 - type: precision_at_1 value: 87.5489534098582 - type: precision_at_3 value: 51.74881836596788 - type: precision_at_5 value: 33.13977042539127 - type: precision_at_10 value: 17.492234976369023 - type: precision_at_100 value: 1.9030384875084312 - type: precision_at_1000 value: 0.19679945982446267 - type: recall_at_1 value: 43.7744767049291 - type: recall_at_3 value: 77.62322754895341 - type: recall_at_5 value: 82.84942606347063 - type: recall_at_10 value: 87.4611748818366 - type: recall_at_100 value: 95.15192437542201 - type: recall_at_1000 value: 98.39972991222147 - task: type: Retrieval dataset: type: mteb/nq name: MTEB NQ config: 
default split: test revision: b774495ed302d8c44a3a7ea25c90dbce03968f31 metrics: - type: main_score value: 71.44670934705796 - type: ndcg_at_1 value: 54.026651216685984 - type: ndcg_at_3 value: 65.1267452491225 - type: ndcg_at_5 value: 68.6696802020747 - type: ndcg_at_10 value: 71.44670934705796 - type: ndcg_at_100 value: 73.74642927386503 - type: ndcg_at_1000 value: 73.90908268307331 - type: map_at_1 value: 48.50086906141366 - type: map_at_3 value: 61.07691193510995 - type: map_at_5 value: 63.36580243337187 - type: map_at_10 value: 64.74485498782997 - type: map_at_100 value: 65.34329174534082 - type: map_at_1000 value: 65.35107870745652 - type: precision_at_1 value: 54.026651216685984 - type: precision_at_3 value: 28.437620702974996 - type: precision_at_5 value: 19.20625724217861 - type: precision_at_10 value: 10.67207415990753 - type: precision_at_100 value: 1.1987253765932955 - type: precision_at_1000 value: 0.12143684820393259 - type: recall_at_1 value: 48.50086906141366 - type: recall_at_3 value: 73.19428350714561 - type: recall_at_5 value: 81.19689069138664 - type: recall_at_10 value: 89.04741212823485 - type: recall_at_100 value: 98.58053302433372 - type: recall_at_1000 value: 99.75376593279258 - task: type: Retrieval dataset: type: mteb/quora name: MTEB QuoraRetrieval config: default split: test revision: e4e08e0b7dbe3c8700f0daef558ff32256715259 metrics: - type: main_score value: 90.03760323006117 - type: ndcg_at_1 value: 83.53 - type: ndcg_at_3 value: 87.53800795646302 - type: ndcg_at_5 value: 88.92909168525203 - type: ndcg_at_10 value: 90.03760323006117 - type: ndcg_at_100 value: 91.08558507332712 - type: ndcg_at_1000 value: 91.1430039358834 - type: map_at_1 value: 72.61760432018744 - type: map_at_3 value: 83.8457060028347 - type: map_at_5 value: 85.6228412692169 - type: map_at_10 value: 86.67700531365115 - type: map_at_100 value: 87.29851728827602 - type: map_at_1000 value: 87.31014621733333 - type: precision_at_1 value: 83.53 - type: precision_at_3 value: 38.33666666667159 - type: precision_at_5 value: 25.12599999999881 - type: precision_at_10 value: 13.629999999998683 - type: precision_at_100 value: 1.5431999999999773 - type: precision_at_1000 value: 0.15671999999997974 - type: recall_at_1 value: 72.61760432018744 - type: recall_at_3 value: 89.06736052932686 - type: recall_at_5 value: 93.09634203522849 - type: recall_at_10 value: 96.35128012894234 - type: recall_at_100 value: 99.7740237858541 - type: recall_at_1000 value: 99.99690476190477 - task: type: Retrieval dataset: type: mteb/webis-touche2020 name: MTEB Touche2020 config: default split: test revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f metrics: - type: main_score value: 30.2563523019649 - type: ndcg_at_1 value: 37.755102040816325 - type: ndcg_at_3 value: 34.45349994459905 - type: ndcg_at_5 value: 32.508805919063086 - type: ndcg_at_10 value: 30.2563523019649 - type: ndcg_at_100 value: 40.538336664503746 - type: ndcg_at_1000 value: 52.2066951614923 - type: map_at_1 value: 2.75537988273998 - type: map_at_3 value: 6.011397290504469 - type: map_at_5 value: 8.666495836494098 - type: map_at_10 value: 12.17701515007822 - type: map_at_100 value: 18.789086471205852 - type: map_at_1000 value: 20.42972375502502 - type: precision_at_1 value: 40.816326530612244 - type: precision_at_3 value: 35.37414965986394 - type: precision_at_5 value: 32.244897959183675 - type: precision_at_10 value: 26.93877551020408 - type: precision_at_100 value: 8.163265306122451 - type: precision_at_1000 value: 1.5979591836734703 - type: recall_at_1 
value: 2.75537988273998 - type: recall_at_3 value: 7.254270324385098 - type: recall_at_5 value: 11.580137100328589 - type: recall_at_10 value: 18.745232816450553 - type: recall_at_100 value: 50.196809658622755 - type: recall_at_1000 value: 85.87317364148332 - task: type: Retrieval dataset: type: mteb/dbpedia name: MTEB DBPedia config: default split: test revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 metrics: - type: main_score value: 51.36940792375597 - type: ndcg_at_1 value: 65.125 - type: ndcg_at_3 value: 55.3967569049025 - type: ndcg_at_5 value: 53.09668587926677 - type: ndcg_at_10 value: 51.36940792375597 - type: ndcg_at_100 value: 56.69623269243084 - type: ndcg_at_1000 value: 63.481061270842 - type: map_at_1 value: 10.265595545755545 - type: map_at_3 value: 16.776544233350698 - type: map_at_5 value: 20.184523605272798 - type: map_at_10 value: 24.772797659849264 - type: map_at_100 value: 36.72689012514183 - type: map_at_1000 value: 38.73869985105569 - type: precision_at_1 value: 77.5 - type: precision_at_3 value: 59.75000000000003 - type: precision_at_5 value: 52.849999999999994 - type: precision_at_10 value: 42.47499999999995 - type: precision_at_100 value: 13.614999999999993 - type: precision_at_1000 value: 2.500749999999998 - type: recall_at_1 value: 10.265595545755545 - type: recall_at_3 value: 17.819804963534246 - type: recall_at_5 value: 22.46124219601634 - type: recall_at_10 value: 30.44583516613163 - type: recall_at_100 value: 63.84118006287797 - type: recall_at_1000 value: 85.06450356093833 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4 metrics: - type: main_score value: 47.93921415959017 - type: ndcg_at_1 value: 36.526219490536015 - type: ndcg_at_3 value: 42.35099043224295 - type: ndcg_at_5 value: 44.989685312964156 - type: ndcg_at_10 value: 47.93921415959017 - type: ndcg_at_100 value: 53.05390282389675 - type: ndcg_at_1000 value: 54.776052731794266 - type: map_at_1 value: 30.818605279548184 - type: map_at_3 value: 38.363350019087974 - type: map_at_5 value: 40.295203936887226 - type: map_at_10 value: 41.81978941662592 - type: map_at_100 value: 43.13300727554278 - type: map_at_1000 value: 43.2351061120207 - type: precision_at_1 value: 36.526219490536015 - type: precision_at_3 value: 19.550515857206346 - type: precision_at_5 value: 13.958783060831967 - type: precision_at_10 value: 8.498592395773393 - type: precision_at_100 value: 1.3024888941713948 - type: precision_at_1000 value: 0.1630253057414617 - type: recall_at_1 value: 30.818605279548184 - type: recall_at_3 value: 45.9132085981904 - type: recall_at_5 value: 52.6851323959227 - type: recall_at_10 value: 61.39718618970463 - type: recall_at_100 value: 83.30757187969981 - type: recall_at_1000 value: 94.9192024147964 - dataset: config: en name: MTEB AmazonCounterfactualClassification (en) revision: e8379541af4e31359cca9fbcf4b00f2671dba205 split: test type: mteb/amazon_counterfactual metrics: - type: accuracy value: 89.47761194029852 - type: accuracy_stderr value: 1.6502495811564162 - type: ap value: 62.20813715457866 - type: ap_stderr value: 3.7902166647587854 - type: f1 value: 84.91493292274734 - type: f1_stderr value: 1.9572239640276208 - type: main_score value: 89.47761194029852 task: type: Classification - dataset: config: default name: MTEB AmazonPolarityClassification revision: e2d317d38cd51312af73b3d32a06d1a08b442046 split: test type: mteb/amazon_polarity metrics: - type: accuracy value: 
96.89569999999999 - type: accuracy_stderr value: 0.6886368582206464 - type: ap value: 95.38531339207739 - type: ap_stderr value: 0.9009257949898158 - type: f1 value: 96.8941935264779 - type: f1_stderr value: 0.6908609132985931 - type: main_score value: 96.89569999999999 task: type: Classification - dataset: config: en name: MTEB AmazonReviewsClassification (en) revision: 1399c76144fd37290681b995c656ef9b2e06e26d split: test type: mteb/amazon_reviews_multi metrics: - type: accuracy value: 61.602000000000004 - type: accuracy_stderr value: 1.4532019818318436 - type: f1 value: 60.96100449021481 - type: f1_stderr value: 1.8031398419765765 - type: main_score value: 61.602000000000004 task: type: Classification - dataset: config: default name: MTEB ArxivClusteringP2P revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d split: test type: mteb/arxiv-clustering-p2p metrics: - type: main_score value: 54.906319409992 - type: v_measure value: 54.906319409992 - type: v_measure_std value: 14.382682652951683 task: type: Clustering - dataset: config: default name: MTEB ArxivClusteringS2S revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 split: test type: mteb/arxiv-clustering-s2s metrics: - type: main_score value: 50.27779516565727 - type: v_measure value: 50.27779516565727 - type: v_measure_std value: 14.463711418590636 task: type: Clustering - dataset: config: default name: MTEB AskUbuntuDupQuestions revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 split: test type: mteb/askubuntudupquestions-reranking metrics: - type: map value: 64.59457317979604 - type: mrr value: 78.05214791364376 - type: main_score value: 64.59457317979604 task: type: Reranking - dataset: config: default name: MTEB BIOSSES revision: d3fb88f8f02e40887cd149695127462bbcf29b4a split: test type: mteb/biosses-sts metrics: - type: cosine_pearson value: 86.5833945335644 - type: cosine_spearman value: 85.74472483606 - type: manhattan_pearson value: 85.07748703871708 - type: manhattan_spearman value: 85.1459160110718 - type: euclidean_pearson value: 85.14704290043478 - type: euclidean_spearman value: 85.10073425868336 - type: main_score value: 85.74472483606 task: type: STS - dataset: config: default name: MTEB Banking77Classification revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 split: test type: mteb/banking77 metrics: - type: accuracy value: 92.53246753246755 - type: accuracy_stderr value: 0.5488837781559508 - type: f1 value: 92.5143182074032 - type: f1_stderr value: 0.5657577980223147 - type: main_score value: 92.53246753246755 task: type: Classification - dataset: config: default name: MTEB BiorxivClusteringP2P revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 split: test type: mteb/biorxiv-clustering-p2p metrics: - type: main_score value: 52.64099497480452 - type: v_measure value: 52.64099497480452 - type: v_measure_std value: 1.081892399559334 task: type: Clustering - dataset: config: default name: MTEB BiorxivClusteringS2S revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 split: test type: mteb/biorxiv-clustering-s2s metrics: - type: main_score value: 49.1972734308178 - type: v_measure value: 49.1972734308178 - type: v_measure_std value: 0.9081245477708283 task: type: Clustering - dataset: config: default name: MTEB EmotionClassification revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 split: test type: mteb/emotion metrics: - type: accuracy value: 92.975 - type: accuracy_stderr value: 0.5287958017987677 - type: f1 value: 89.29755895896542 - type: f1_stderr value: 0.6485027046025079 - type: main_score value: 92.975 task: type: 
Classification - dataset: config: default name: MTEB ImdbClassification revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 split: test type: mteb/imdb metrics: - type: accuracy value: 96.66480000000001 - type: accuracy_stderr value: 0.45673204398202666 - type: ap value: 95.33843919456118 - type: ap_stderr value: 0.6449846039754393 - type: f1 value: 96.6637668164617 - type: f1_stderr value: 0.45793673051468287 - type: main_score value: 96.66480000000001 task: type: Classification - dataset: config: en name: MTEB MTOPDomainClassification (en) revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf split: test type: mteb/mtop_domain metrics: - type: accuracy value: 98.61149110807114 - type: accuracy_stderr value: 0.469748178253266 - type: f1 value: 98.4685511007568 - type: f1_stderr value: 0.51636776728259 - type: main_score value: 98.61149110807114 task: type: Classification - dataset: config: en name: MTEB MTOPIntentClassification (en) revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba split: test type: mteb/mtop_intent metrics: - type: accuracy value: 95.51299589603283 - type: accuracy_stderr value: 0.3591676911539482 - type: f1 value: 85.2464691439773 - type: f1_stderr value: 0.9234502856695337 - type: main_score value: 95.51299589603283 task: type: Classification - dataset: config: en name: MTEB MassiveIntentClassification (en) revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 82.04774714189644 - type: accuracy_stderr value: 0.7288818520309376 - type: f1 value: 79.28060657840692 - type: f1_stderr value: 0.6872008571781982 - type: main_score value: 82.04774714189644 task: type: Classification - dataset: config: en name: MTEB MassiveScenarioClassification (en) revision: 7d571f92784cd94a019292a1f45445077d0ef634 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 84.40147948890383 - type: accuracy_stderr value: 1.2939587629143627 - type: f1 value: 83.97779287582267 - type: f1_stderr value: 0.9970599222060901 - type: main_score value: 84.40147948890383 task: type: Classification - dataset: config: default name: MTEB MedrxivClusteringP2P revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 split: test type: mteb/medrxiv-clustering-p2p metrics: - type: main_score value: 45.80879120838561 - type: v_measure value: 45.80879120838561 - type: v_measure_std value: 1.257800489264564 task: type: Clustering - dataset: config: default name: MTEB MedrxivClusteringS2S revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 split: test type: mteb/medrxiv-clustering-s2s metrics: - type: main_score value: 44.106849261042505 - type: v_measure value: 44.106849261042505 - type: v_measure_std value: 1.4347344477874981 task: type: Clustering - dataset: config: default name: MTEB MindSmallReranking revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 split: test type: mteb/mind_small metrics: - type: map value: 31.794062752995345 - type: mrr value: 32.98581714772614 - type: main_score value: 31.794062752995345 task: type: Reranking - dataset: config: default name: MTEB RedditClustering revision: 24640382cdbf8abc73003fb0fa6d111a705499eb split: test type: mteb/reddit-clustering metrics: - type: main_score value: 56.03342473834434 - type: v_measure value: 56.03342473834434 - type: v_measure_std value: 5.972192613803461 task: type: Clustering - dataset: config: default name: MTEB RedditClusteringP2P revision: 282350215ef01743dc01b456c7f5241fa8937f16 split: test type: mteb/reddit-clustering-p2p metrics: - type: main_score value: 
65.83156688381274 - type: v_measure value: 65.83156688381274 - type: v_measure_std value: 14.180225112120162 task: type: Clustering - dataset: config: default name: MTEB SICK-R revision: a6ea5a8cab320b040a23452cc28066d9beae2cee split: test type: mteb/sickr-sts metrics: - type: cosine_pearson value: 84.15759544348467 - type: cosine_spearman value: 82.66085892322664 - type: manhattan_pearson value: 82.27257241990692 - type: manhattan_spearman value: 82.57752467555896 - type: euclidean_pearson value: 82.20795646456065 - type: euclidean_spearman value: 82.51008729416401 - type: main_score value: 82.66085892322664 task: type: STS - dataset: config: default name: MTEB STS12 revision: a0d554a64d88156834ff5ae9920b964011b16384 split: test type: mteb/sts12-sts metrics: - type: cosine_pearson value: 84.3406321391237 - type: cosine_spearman value: 77.71091257651071 - type: manhattan_pearson value: 81.25784268400994 - type: manhattan_spearman value: 77.98426383345507 - type: euclidean_pearson value: 81.25641851462917 - type: euclidean_spearman value: 77.93254971878063 - type: main_score value: 77.71091257651071 task: type: STS - dataset: config: default name: MTEB STS13 revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca split: test type: mteb/sts13-sts metrics: - type: cosine_pearson value: 86.1528398894769 - type: cosine_spearman value: 87.44662352358895 - type: manhattan_pearson value: 86.92164570802663 - type: manhattan_spearman value: 86.9132692625668 - type: euclidean_pearson value: 87.00156426580821 - type: euclidean_spearman value: 86.98750068631274 - type: main_score value: 87.44662352358895 task: type: STS - dataset: config: default name: MTEB STS14 revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 split: test type: mteb/sts14-sts metrics: - type: cosine_pearson value: 83.32782491176253 - type: cosine_spearman value: 83.48313793311584 - type: manhattan_pearson value: 82.60528063429948 - type: manhattan_spearman value: 83.10434862310481 - type: euclidean_pearson value: 82.68016090104034 - type: euclidean_spearman value: 83.14418662406631 - type: main_score value: 83.48313793311584 task: type: STS - dataset: config: default name: MTEB STS15 revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 split: test type: mteb/sts15-sts metrics: - type: cosine_pearson value: 86.31535441436343 - type: cosine_spearman value: 87.63145141246594 - type: manhattan_pearson value: 86.95972711389149 - type: manhattan_spearman value: 86.9849824463052 - type: euclidean_pearson value: 86.95391575487379 - type: euclidean_spearman value: 86.97613682266213 - type: main_score value: 87.63145141246594 task: type: STS - dataset: config: default name: MTEB STS16 revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 split: test type: mteb/sts16-sts metrics: - type: cosine_pearson value: 83.43854397443079 - type: cosine_spearman value: 86.70176531845136 - type: manhattan_pearson value: 85.82302317064868 - type: manhattan_spearman value: 86.36561734213241 - type: euclidean_pearson value: 85.80127366135169 - type: euclidean_spearman value: 86.34803859754834 - type: main_score value: 86.70176531845136 task: type: STS - dataset: config: en-en name: MTEB STS17 (en-en) revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 90.38940955877999 - type: cosine_spearman value: 91.18282119920893 - type: manhattan_pearson value: 91.31823663739615 - type: manhattan_spearman value: 90.67257321731341 - type: euclidean_pearson value: 91.30318753138528 - type: 
euclidean_spearman value: 90.69044765693836 - type: main_score value: 91.18282119920893 task: type: STS - dataset: config: en name: MTEB STS22 (en) revision: eea2b4fe26a775864c896887d910b76a8098ad3f split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 69.33936467780947 - type: cosine_spearman value: 69.02345807358802 - type: manhattan_pearson value: 70.11799452953082 - type: manhattan_spearman value: 68.55450923481405 - type: euclidean_pearson value: 70.10857680491809 - type: euclidean_spearman value: 68.44610245708984 - type: main_score value: 69.02345807358802 task: type: STS - dataset: config: default name: MTEB STSBenchmark revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 split: test type: mteb/stsbenchmark-sts metrics: - type: cosine_pearson value: 85.97288135509513 - type: cosine_spearman value: 87.25208310840168 - type: manhattan_pearson value: 86.3786471501451 - type: manhattan_spearman value: 86.71177136523868 - type: euclidean_pearson value: 86.40522339296625 - type: euclidean_spearman value: 86.73930576508816 - type: main_score value: 87.25208310840168 task: type: STS - dataset: config: default name: MTEB SciDocsRR revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab split: test type: mteb/scidocs-reranking metrics: - type: map value: 87.60324164489178 - type: mrr value: 96.30331904841708 - type: main_score value: 87.60324164489178 task: type: Reranking - dataset: config: default name: MTEB SprintDuplicateQuestions revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 split: test type: mteb/sprintduplicatequestions-pairclassification metrics: - type: cos_sim_accuracy value: 99.6920792079208 - type: cos_sim_accuracy_threshold value: 90.36337347155474 - type: cos_sim_ap value: 90.93952679056765 - type: cos_sim_f1 value: 83.10700706137968 - type: cos_sim_f1_threshold value: 90.36337347155474 - type: cos_sim_precision value: 90.96313912009512 - type: cos_sim_recall value: 76.5 - type: dot_accuracy value: 99.54554455445545 - type: dot_accuracy_threshold value: 2876800.0 - type: dot_ap value: 84.01112287735286 - type: dot_f1 value: 75.7622739018088 - type: dot_f1_threshold value: 2820800.0 - type: dot_precision value: 78.39572192513369 - type: dot_recall value: 73.3 - type: euclidean_accuracy value: 99.6930693069307 - type: euclidean_accuracy_threshold value: 7718.054017089397 - type: euclidean_ap value: 91.1257568881301 - type: euclidean_f1 value: 83.09022150189087 - type: euclidean_f1_threshold value: 7817.08324628535 - type: euclidean_precision value: 90.36427732079906 - type: euclidean_recall value: 76.9 - type: manhattan_accuracy value: 99.6920792079208 - type: manhattan_accuracy_threshold value: 364735.19654273987 - type: manhattan_ap value: 91.2326885940691 - type: manhattan_f1 value: 83.36008560727663 - type: manhattan_f1_threshold value: 375395.8945572376 - type: manhattan_precision value: 89.64326812428078 - type: manhattan_recall value: 77.9 - type: max_accuracy value: 99.6930693069307 - type: max_ap value: 91.2326885940691 - type: max_f1 value: 83.36008560727663 task: type: PairClassification - dataset: config: default name: MTEB StackExchangeClustering revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 split: test type: mteb/stackexchange-clustering metrics: - type: main_score value: 66.2095300942637 - type: v_measure value: 66.2095300942637 - type: v_measure_std value: 3.214369679617631 task: type: Clustering - dataset: config: default name: MTEB StackExchangeClusteringP2P revision: 815ca46b2622cec33ccafc3735d572c266efdb44 split: test type: 
mteb/stackexchange-clustering-p2p metrics: - type: main_score value: 45.74307000935057 - type: v_measure value: 45.74307000935057 - type: v_measure_std value: 1.5352466748569888 task: type: Clustering - dataset: config: default name: MTEB StackOverflowDupQuestions revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 split: test type: mteb/stackoverflowdupquestions-reranking metrics: - type: map value: 54.90337951829123 - type: mrr value: 56.12889663441134 - type: main_score value: 54.90337951829123 task: type: Reranking - dataset: config: default name: MTEB SummEval revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c split: test type: mteb/summeval metrics: - type: cosine_pearson value: 31.0669308484832 - type: cosine_spearman value: 31.19637421540861 - type: dot_pearson value: 30.62326176666765 - type: dot_spearman value: 30.42135737502967 - type: main_score value: 31.19637421540861 task: type: Summarization - dataset: config: default name: MTEB ToxicConversationsClassification revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c split: test type: mteb/toxic_conversations_50k metrics: - type: accuracy value: 87.34339999999999 - type: accuracy_stderr value: 1.838245696309393 - type: ap value: 33.536584790435406 - type: ap_stderr value: 2.276373512492581 - type: f1 value: 72.47307082324448 - type: f1_stderr value: 1.9964640292072542 - type: main_score value: 87.34339999999999 task: type: Classification - dataset: config: default name: MTEB TweetSentimentExtractionClassification revision: d604517c81ca91fe16a244d1248fc021f9ecee7a split: test type: mteb/tweet_sentiment_extraction metrics: - type: accuracy value: 78.86247877758915 - type: accuracy_stderr value: 1.1273253738982443 - type: f1 value: 79.14666244848874 - type: f1_stderr value: 1.1532640958036497 - type: main_score value: 78.86247877758915 task: type: Classification - dataset: config: default name: MTEB TwentyNewsgroupsClustering revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 split: test type: mteb/twentynewsgroups-clustering metrics: - type: main_score value: 70.44270836680788 - type: v_measure value: 70.44270836680788 - type: v_measure_std value: 1.5185423698266132 task: type: Clustering - dataset: config: default name: MTEB TwitterSemEval2015 revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 split: test type: mteb/twittersemeval2015-pairclassification metrics: - type: cos_sim_accuracy value: 87.74512725755498 - type: cos_sim_accuracy_threshold value: 82.34941560483547 - type: cos_sim_ap value: 79.6389274210382 - type: cos_sim_f1 value: 71.76319176319176 - type: cos_sim_f1_threshold value: 80.1523829249257 - type: cos_sim_precision value: 70.0502512562814 - type: cos_sim_recall value: 73.56200527704485 - type: dot_accuracy value: 85.13441020444657 - type: dot_accuracy_threshold value: 2220800.0 - type: dot_ap value: 71.67080150823449 - type: dot_f1 value: 66.18984119287187 - type: dot_f1_threshold value: 2086400.0 - type: dot_precision value: 61.224489795918366 - type: dot_recall value: 72.0316622691293 - type: euclidean_accuracy value: 87.69148238660071 - type: euclidean_accuracy_threshold value: 9221.50036619459 - type: euclidean_ap value: 79.65326151280289 - type: euclidean_f1 value: 71.7903489983621 - type: euclidean_f1_threshold value: 10313.528386219872 - type: euclidean_precision value: 68.70026525198939 - type: euclidean_recall value: 75.17150395778364 - type: manhattan_accuracy value: 87.74512725755498 - type: manhattan_accuracy_threshold value: 444289.1119837761 - type: manhattan_ap value: 79.67744645365104 - type: 
manhattan_f1 value: 71.94423699278066 - type: manhattan_f1_threshold value: 491676.24004781246 - type: manhattan_precision value: 68.0961357210179 - type: manhattan_recall value: 76.2532981530343 - type: max_accuracy value: 87.74512725755498 - type: max_ap value: 79.67744645365104 - type: max_f1 value: 71.94423699278066 task: type: PairClassification - dataset: config: default name: MTEB TwitterURLCorpus revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf split: test type: mteb/twitterurlcorpus-pairclassification metrics: - type: cos_sim_accuracy value: 89.5544688943222 - type: cos_sim_accuracy_threshold value: 81.58909533293946 - type: cos_sim_ap value: 86.95174990178396 - type: cos_sim_f1 value: 79.1543756145526 - type: cos_sim_f1_threshold value: 80.08573448087095 - type: cos_sim_precision value: 77.78355879292404 - type: cos_sim_recall value: 80.5743763473976 - type: dot_accuracy value: 88.60752124810804 - type: dot_accuracy_threshold value: 2136000.0 - type: dot_ap value: 84.26724775947629 - type: dot_f1 value: 77.67666146985243 - type: dot_f1_threshold value: 2064000.0 - type: dot_precision value: 73.40505721921468 - type: dot_recall value: 82.47613181398214 - type: euclidean_accuracy value: 89.5370046959289 - type: euclidean_accuracy_threshold value: 9750.113991666478 - type: euclidean_ap value: 86.99393092403776 - type: euclidean_f1 value: 79.07167337207571 - type: euclidean_f1_threshold value: 10338.095928500366 - type: euclidean_precision value: 76.59497690531177 - type: euclidean_recall value: 81.71388974437943 - type: manhattan_accuracy value: 89.57581402569178 - type: manhattan_accuracy_threshold value: 463812.92815208435 - type: manhattan_ap value: 87.00849868076658 - type: manhattan_f1 value: 79.08583576933297 - type: manhattan_f1_threshold value: 482453.35128605366 - type: manhattan_precision value: 78.00494270950348 - type: manhattan_recall value: 80.19710502001848 - type: max_accuracy value: 89.57581402569178 - type: max_ap value: 87.00849868076658 - type: max_f1 value: 79.1543756145526 task: type: PairClassification - dataset: config: default name: MTEB AFQMC revision: b44c3b011063adb25877c13823db83bb193913c4 split: validation type: C-MTEB/AFQMC metrics: - type: cosine_pearson value: 45.108559635369325 - type: cosine_spearman value: 47.172833128216176 - type: manhattan_pearson value: 45.75443077564791 - type: manhattan_spearman value: 47.13974146235398 - type: euclidean_pearson value: 45.78921257223492 - type: euclidean_spearman value: 47.177095238278625 - type: main_score value: 47.172833128216176 task: type: STS - dataset: config: default name: MTEB ATEC revision: 0f319b1142f28d00e055a6770f3f726ae9b7d865 split: test type: C-MTEB/ATEC metrics: - type: cosine_pearson value: 48.304409578388466 - type: cosine_spearman value: 50.75006977697012 - type: manhattan_pearson value: 52.688818756177035 - type: manhattan_spearman value: 50.739214155741095 - type: euclidean_pearson value: 52.71788557204978 - type: euclidean_spearman value: 50.77895730336448 - type: main_score value: 50.75006977697012 task: type: STS - dataset: config: zh name: MTEB AmazonReviewsClassification (zh) revision: 1399c76144fd37290681b995c656ef9b2e06e26d split: test type: mteb/amazon_reviews_multi metrics: - type: accuracy value: 54.339999999999996 - type: accuracy_stderr value: 1.6518837731511269 - type: f1 value: 53.37316538790502 - type: f1_stderr value: 1.6112926272861336 - type: main_score value: 54.339999999999996 task: type: Classification - dataset: config: default name: MTEB BQ revision: 
e3dda5e115e487b39ec7e618c0c6a29137052a55 split: test type: C-MTEB/BQ metrics: - type: cosine_pearson value: 59.62831218167518 - type: cosine_spearman value: 62.02213472473759 - type: manhattan_pearson value: 61.122261197018176 - type: manhattan_spearman value: 62.208780520694454 - type: euclidean_pearson value: 61.17827629627213 - type: euclidean_spearman value: 62.266859648664244 - type: main_score value: 62.02213472473759 task: type: STS - dataset: config: default name: MTEB CLSClusteringP2P revision: 4b6227591c6c1a73bc76b1055f3b7f3588e72476 split: test type: C-MTEB/CLSClusteringP2P metrics: - type: main_score value: 54.64518394835408 - type: v_measure value: 54.64518394835408 - type: v_measure_std value: 1.2745946640208072 task: type: Clustering - dataset: config: default name: MTEB CLSClusteringS2S revision: e458b3f5414b62b7f9f83499ac1f5497ae2e869f split: test type: C-MTEB/CLSClusteringS2S metrics: - type: main_score value: 63.68323477729556 - type: v_measure value: 63.68323477729556 - type: v_measure_std value: 1.740918833098302 task: type: Clustering - dataset: config: default name: MTEB CMedQAv1 revision: 8d7f1e942507dac42dc58017c1a001c3717da7df split: test type: C-MTEB/CMedQAv1-reranking metrics: - type: map value: 84.61500884703916 - type: mrr value: 87.01424603174604 - type: main_score value: 84.61500884703916 task: type: Reranking - dataset: config: default name: MTEB CMedQAv2 revision: 23d186750531a14a0357ca22cd92d712fd512ea0 split: test type: C-MTEB/CMedQAv2-reranking metrics: - type: map value: 85.60137988993483 - type: mrr value: 87.96857142857142 - type: main_score value: 85.60137988993483 task: type: Reranking - dataset: config: default name: MTEB CmedqaRetrieval revision: cd540c506dae1cf9e9a59c3e06f42030d54e7301 split: dev type: C-MTEB/CmedqaRetrieval metrics: - type: map_at_1 value: 24.191 - type: map_at_10 value: 35.819 - type: map_at_100 value: 37.639 - type: map_at_1000 value: 37.775 - type: map_at_3 value: 32.045 - type: map_at_5 value: 34.008 - type: mrr_at_1 value: 36.684 - type: mrr_at_10 value: 44.769 - type: mrr_at_100 value: 45.754 - type: mrr_at_1000 value: 45.809 - type: mrr_at_3 value: 42.465 - type: mrr_at_5 value: 43.696 - type: ndcg_at_1 value: 36.834 - type: ndcg_at_10 value: 42.208 - type: ndcg_at_100 value: 49.507 - type: ndcg_at_1000 value: 51.834 - type: ndcg_at_3 value: 37.416 - type: ndcg_at_5 value: 39.152 - type: precision_at_1 value: 36.834 - type: precision_at_10 value: 9.357 - type: precision_at_100 value: 1.5310000000000001 - type: precision_at_1000 value: 0.183 - type: precision_at_3 value: 21.08 - type: precision_at_5 value: 15.068999999999999 - type: recall_at_1 value: 24.191 - type: recall_at_10 value: 52.078 - type: recall_at_100 value: 82.548 - type: recall_at_1000 value: 98.017 - type: recall_at_3 value: 37.484 - type: recall_at_5 value: 43.187 - type: main_score value: 42.208 task: type: Retrieval - dataset: config: default name: MTEB Cmnli revision: 41bc36f332156f7adc9e38f53777c959b2ae9766 split: validation type: C-MTEB/CMNLI metrics: - type: cos_sim_accuracy value: 81.98436560432953 - type: cos_sim_accuracy_threshold value: 67.33228049687503 - type: cos_sim_ap value: 90.13312662430796 - type: cos_sim_f1 value: 83.2163938077737 - type: cos_sim_f1_threshold value: 64.44945196171463 - type: cos_sim_precision value: 79.45555082943429 - type: cos_sim_recall value: 87.350946925415 - type: dot_accuracy value: 80.50511124473843 - type: dot_accuracy_threshold value: 1736000.0 - type: dot_ap value: 88.76136186445322 - type: dot_f1 value: 
81.75838631878973 - type: dot_f1_threshold value: 1681600.0 - type: dot_precision value: 76.96594427244582 - type: dot_recall value: 87.18728080430208 - type: euclidean_accuracy value: 82.21286831028262 - type: euclidean_accuracy_threshold value: 13240.938473272565 - type: euclidean_ap value: 90.14863232280865 - type: euclidean_f1 value: 83.277292086976 - type: euclidean_f1_threshold value: 13667.852165734186 - type: euclidean_precision value: 79.97847147470398 - type: euclidean_recall value: 86.85994856207621 - type: manhattan_accuracy value: 82.21286831028262 - type: manhattan_accuracy_threshold value: 629412.1389746666 - type: manhattan_ap value: 90.03868533208357 - type: manhattan_f1 value: 83.15683870248579 - type: manhattan_f1_threshold value: 649621.3114321232 - type: manhattan_precision value: 79.46314443971026 - type: manhattan_recall value: 87.21066167874679 - type: max_accuracy value: 82.21286831028262 - type: max_ap value: 90.14863232280865 - type: max_f1 value: 83.277292086976 task: type: PairClassification - dataset: config: default name: MTEB CovidRetrieval revision: 1271c7809071a13532e05f25fb53511ffce77117 split: dev type: C-MTEB/CovidRetrieval metrics: - type: map_at_1 value: 65.595 - type: map_at_10 value: 73.717 - type: map_at_100 value: 74.134 - type: map_at_1000 value: 74.143 - type: map_at_3 value: 71.97 - type: map_at_5 value: 73.11800000000001 - type: mrr_at_1 value: 65.648 - type: mrr_at_10 value: 73.618 - type: mrr_at_100 value: 74.02499999999999 - type: mrr_at_1000 value: 74.033 - type: mrr_at_3 value: 71.865 - type: mrr_at_5 value: 73.04 - type: ndcg_at_1 value: 65.753 - type: ndcg_at_10 value: 77.458 - type: ndcg_at_100 value: 79.46 - type: ndcg_at_1000 value: 79.666 - type: ndcg_at_3 value: 73.988 - type: ndcg_at_5 value: 76.038 - type: precision_at_1 value: 65.753 - type: precision_at_10 value: 8.999 - type: precision_at_100 value: 0.9939999999999999 - type: precision_at_1000 value: 0.101 - type: precision_at_3 value: 26.765 - type: precision_at_5 value: 17.092 - type: recall_at_1 value: 65.595 - type: recall_at_10 value: 89.041 - type: recall_at_100 value: 98.31400000000001 - type: recall_at_1000 value: 99.895 - type: recall_at_3 value: 79.768 - type: recall_at_5 value: 84.66799999999999 - type: main_score value: 77.458 task: type: Retrieval - dataset: config: default name: MTEB DuRetrieval revision: a1a333e290fe30b10f3f56498e3a0d911a693ced split: dev type: C-MTEB/DuRetrieval metrics: - type: map_at_1 value: 27.248 - type: map_at_10 value: 84.303 - type: map_at_100 value: 86.866 - type: map_at_1000 value: 86.888 - type: map_at_3 value: 58.658 - type: map_at_5 value: 74.265 - type: mrr_at_1 value: 92.2 - type: mrr_at_10 value: 94.733 - type: mrr_at_100 value: 94.767 - type: mrr_at_1000 value: 94.768 - type: mrr_at_3 value: 94.492 - type: mrr_at_5 value: 94.627 - type: ndcg_at_1 value: 92.2 - type: ndcg_at_10 value: 90.462 - type: ndcg_at_100 value: 92.562 - type: ndcg_at_1000 value: 92.757 - type: ndcg_at_3 value: 89.44800000000001 - type: ndcg_at_5 value: 88.683 - type: precision_at_1 value: 92.2 - type: precision_at_10 value: 42.980000000000004 - type: precision_at_100 value: 4.851 - type: precision_at_1000 value: 0.49 - type: precision_at_3 value: 80.233 - type: precision_at_5 value: 67.95 - type: recall_at_1 value: 27.248 - type: recall_at_10 value: 91.46600000000001 - type: recall_at_100 value: 98.566 - type: recall_at_1000 value: 99.557 - type: recall_at_3 value: 60.671 - type: recall_at_5 value: 78.363 - type: main_score value: 90.462 task: type: 
Retrieval - dataset: config: default name: MTEB EcomRetrieval revision: 687de13dc7294d6fd9be10c6945f9e8fec8166b9 split: dev type: C-MTEB/EcomRetrieval metrics: - type: map_at_1 value: 54.7 - type: map_at_10 value: 64.574 - type: map_at_100 value: 65.144 - type: map_at_1000 value: 65.156 - type: map_at_3 value: 62.333000000000006 - type: map_at_5 value: 63.63799999999999 - type: mrr_at_1 value: 54.7 - type: mrr_at_10 value: 64.603 - type: mrr_at_100 value: 65.172 - type: mrr_at_1000 value: 65.184 - type: mrr_at_3 value: 62.383 - type: mrr_at_5 value: 63.683 - type: ndcg_at_1 value: 54.7 - type: ndcg_at_10 value: 69.298 - type: ndcg_at_100 value: 71.81 - type: ndcg_at_1000 value: 72.117 - type: ndcg_at_3 value: 64.72099999999999 - type: ndcg_at_5 value: 67.071 - type: precision_at_1 value: 54.7 - type: precision_at_10 value: 8.41 - type: precision_at_100 value: 0.9530000000000001 - type: precision_at_1000 value: 0.098 - type: precision_at_3 value: 23.867 - type: precision_at_5 value: 15.459999999999999 - type: recall_at_1 value: 54.7 - type: recall_at_10 value: 84.1 - type: recall_at_100 value: 95.3 - type: recall_at_1000 value: 97.7 - type: recall_at_3 value: 71.6 - type: recall_at_5 value: 77.3 - type: main_score value: 69.298 task: type: Retrieval - dataset: config: default name: MTEB IFlyTek revision: 421605374b29664c5fc098418fe20ada9bd55f8a split: validation type: C-MTEB/IFlyTek-classification metrics: - type: accuracy value: 49.942285494420936 - type: accuracy_stderr value: 0.9218275144833329 - type: f1 value: 41.32381790374152 - type: f1_stderr value: 0.8291507105327707 - type: main_score value: 49.942285494420936 task: type: Classification - dataset: config: default name: MTEB JDReview revision: b7c64bd89eb87f8ded463478346f76731f07bf8b split: test type: C-MTEB/JDReview-classification metrics: - type: accuracy value: 88.91181988742964 - type: accuracy_stderr value: 1.952391767940518 - type: ap value: 60.18509628974178 - type: ap_stderr value: 4.273060966573582 - type: f1 value: 84.02722221827027 - type: f1_stderr value: 2.238197243395083 - type: main_score value: 88.91181988742964 task: type: Classification - dataset: config: default name: MTEB LCQMC revision: 17f9b096f80380fce5ed12a9be8be7784b337daf split: test type: C-MTEB/LCQMC metrics: - type: cosine_pearson value: 68.32691294171383 - type: cosine_spearman value: 75.95458618586729 - type: manhattan_pearson value: 74.37198807732018 - type: manhattan_spearman value: 75.99352157963375 - type: euclidean_pearson value: 74.36294627886716 - type: euclidean_spearman value: 75.98632511635132 - type: main_score value: 75.95458618586729 task: type: STS - dataset: config: default name: MTEB MMarcoReranking revision: 8e0c766dbe9e16e1d221116a3f36795fbade07f6 split: dev type: C-MTEB/Mmarco-reranking metrics: - type: map value: 35.4327533126161 - type: mrr value: 34.61507936507937 - type: main_score value: 35.4327533126161 task: type: Reranking - dataset: config: default name: MTEB MMarcoRetrieval revision: 539bbde593d947e2a124ba72651aafc09eb33fc2 split: dev type: C-MTEB/MMarcoRetrieval metrics: - type: map_at_1 value: 72.652 - type: map_at_10 value: 81.396 - type: map_at_100 value: 81.597 - type: map_at_1000 value: 81.60300000000001 - type: map_at_3 value: 79.757 - type: map_at_5 value: 80.798 - type: mrr_at_1 value: 75.01400000000001 - type: mrr_at_10 value: 81.842 - type: mrr_at_100 value: 82.025 - type: mrr_at_1000 value: 82.03099999999999 - type: mrr_at_3 value: 80.45400000000001 - type: mrr_at_5 value: 81.345 - type: ndcg_at_1 value: 
74.98599999999999 - type: ndcg_at_10 value: 84.70100000000001 - type: ndcg_at_100 value: 85.568 - type: ndcg_at_1000 value: 85.721 - type: ndcg_at_3 value: 81.64099999999999 - type: ndcg_at_5 value: 83.375 - type: precision_at_1 value: 74.98599999999999 - type: precision_at_10 value: 10.049 - type: precision_at_100 value: 1.047 - type: precision_at_1000 value: 0.106 - type: precision_at_3 value: 30.458000000000002 - type: precision_at_5 value: 19.206 - type: recall_at_1 value: 72.652 - type: recall_at_10 value: 94.40899999999999 - type: recall_at_100 value: 98.241 - type: recall_at_1000 value: 99.42 - type: recall_at_3 value: 86.354 - type: recall_at_5 value: 90.472 - type: main_score value: 84.70100000000001 task: type: Retrieval - dataset: config: zh-CN name: MTEB MassiveIntentClassification (zh-CN) revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 78.19098856758575 - type: accuracy_stderr value: 0.6325028678427684 - type: f1 value: 74.80611425574001 - type: f1_stderr value: 0.9021806207904779 - type: main_score value: 78.19098856758575 task: type: Classification - dataset: config: zh-CN name: MTEB MassiveScenarioClassification (zh-CN) revision: 7d571f92784cd94a019292a1f45445077d0ef634 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 82.58238063214526 - type: accuracy_stderr value: 1.0999970213165273 - type: f1 value: 81.94734854057064 - type: f1_stderr value: 1.248633855872851 - type: main_score value: 82.58238063214526 task: type: Classification - dataset: config: default name: MTEB MedicalRetrieval revision: 2039188fb5800a9803ba5048df7b76e6fb151fc6 split: dev type: C-MTEB/MedicalRetrieval metrics: - type: map_at_1 value: 53.7 - type: map_at_10 value: 59.184000000000005 - type: map_at_100 value: 59.754 - type: map_at_1000 value: 59.8 - type: map_at_3 value: 57.833 - type: map_at_5 value: 58.548 - type: mrr_at_1 value: 54.0 - type: mrr_at_10 value: 59.352000000000004 - type: mrr_at_100 value: 59.926 - type: mrr_at_1000 value: 59.971 - type: mrr_at_3 value: 57.99999999999999 - type: mrr_at_5 value: 58.714999999999996 - type: ndcg_at_1 value: 53.7 - type: ndcg_at_10 value: 62.022 - type: ndcg_at_100 value: 65.038 - type: ndcg_at_1000 value: 66.366 - type: ndcg_at_3 value: 59.209 - type: ndcg_at_5 value: 60.51299999999999 - type: precision_at_1 value: 53.7 - type: precision_at_10 value: 7.1 - type: precision_at_100 value: 0.856 - type: precision_at_1000 value: 0.096 - type: precision_at_3 value: 21.067 - type: precision_at_5 value: 13.28 - type: recall_at_1 value: 53.7 - type: recall_at_10 value: 71.0 - type: recall_at_100 value: 85.6 - type: recall_at_1000 value: 96.3 - type: recall_at_3 value: 63.2 - type: recall_at_5 value: 66.4 - type: main_score value: 62.022 task: type: Retrieval - dataset: config: default name: MTEB MultilingualSentiment revision: 46958b007a63fdbf239b7672c25d0bea67b5ea1a split: validation type: C-MTEB/MultilingualSentiment-classification metrics: - type: accuracy value: 78.91333333333334 - type: accuracy_stderr value: 1.0834307648494321 - type: f1 value: 78.881433228092 - type: f1_stderr value: 1.122457277013712 - type: main_score value: 78.91333333333334 task: type: Classification - dataset: config: default name: MTEB Ocnli revision: 66e76a618a34d6d565d5538088562851e6daa7ec split: validation type: C-MTEB/OCNLI metrics: - type: cos_sim_accuracy value: 76.39415268002165 - type: cos_sim_accuracy_threshold value: 68.98242139321592 - type: cos_sim_ap value: 
83.20687440058073 - type: cos_sim_f1 value: 78.4351145038168 - type: cos_sim_f1_threshold value: 65.47409929698304 - type: cos_sim_precision value: 71.54046997389034 - type: cos_sim_recall value: 86.80042238648363 - type: dot_accuracy value: 74.60747157552788 - type: dot_accuracy_threshold value: 1737600.0 - type: dot_ap value: 79.78938545919723 - type: dot_f1 value: 76.92307692307692 - type: dot_f1_threshold value: 1652800.0 - type: dot_precision value: 67.90622473726758 - type: dot_recall value: 88.70116156283 - type: euclidean_accuracy value: 76.34001082837032 - type: euclidean_accuracy_threshold value: 12597.299662420446 - type: euclidean_ap value: 83.60222701792158 - type: euclidean_f1 value: 78.77947295423024 - type: euclidean_f1_threshold value: 13639.653702639469 - type: euclidean_precision value: 70.06578947368422 - type: euclidean_recall value: 89.96832101372756 - type: manhattan_accuracy value: 76.23172712506768 - type: manhattan_accuracy_threshold value: 587601.2824743986 - type: manhattan_ap value: 83.51813426548178 - type: manhattan_f1 value: 78.6654135338346 - type: manhattan_f1_threshold value: 639711.1931562424 - type: manhattan_precision value: 70.87214225232854 - type: manhattan_recall value: 88.3843717001056 - type: max_accuracy value: 76.39415268002165 - type: max_ap value: 83.60222701792158 - type: max_f1 value: 78.77947295423024 task: type: PairClassification - dataset: config: default name: MTEB OnlineShopping revision: e610f2ebd179a8fda30ae534c3878750a96db120 split: test type: C-MTEB/OnlineShopping-classification metrics: - type: accuracy value: 94.59 - type: accuracy_stderr value: 0.8971621926942733 - type: ap value: 93.01229797205905 - type: ap_stderr value: 1.0519542956523058 - type: f1 value: 94.58077736915268 - type: f1_stderr value: 0.8954928292768671 - type: main_score value: 94.59 task: type: Classification - dataset: config: default name: MTEB PAWSX revision: 9c6a90e430ac22b5779fb019a23e820b11a8b5e1 split: test type: C-MTEB/PAWSX metrics: - type: cosine_pearson value: 24.341872875292857 - type: cosine_spearman value: 30.570037022875436 - type: manhattan_pearson value: 31.41015320258418 - type: manhattan_spearman value: 30.604526098895114 - type: euclidean_pearson value: 31.400038084432175 - type: euclidean_spearman value: 30.61062265273698 - type: main_score value: 30.570037022875436 task: type: STS - dataset: config: default name: MTEB QBQTC revision: 790b0510dc52b1553e8c49f3d2afb48c0e5c48b7 split: test type: C-MTEB/QBQTC metrics: - type: cosine_pearson value: 36.61757468091905 - type: cosine_spearman value: 38.981417359835504 - type: manhattan_pearson value: 37.971127169578764 - type: manhattan_spearman value: 39.55028286687854 - type: euclidean_pearson value: 37.96983777648438 - type: euclidean_spearman value: 39.542856511171784 - type: main_score value: 38.981417359835504 task: type: STS - dataset: config: zh name: MTEB STS22 (zh) revision: eea2b4fe26a775864c896887d910b76a8098ad3f split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 68.29834902017382 - type: cosine_spearman value: 68.6823378297782 - type: manhattan_pearson value: 68.47336169904406 - type: manhattan_spearman value: 69.08033223619941 - type: euclidean_pearson value: 68.38785956191622 - type: euclidean_spearman value: 68.97973814449657 - type: main_score value: 68.6823378297782 task: type: STS - dataset: config: default name: MTEB STSB revision: 0cde68302b3541bb8b3c340dc0644b0b745b3dc0 split: test type: C-MTEB/STSB metrics: - type: cosine_pearson value: 
80.60572958563593 - type: cosine_spearman value: 80.87063761195603 - type: manhattan_pearson value: 79.30174059269083 - type: manhattan_spearman value: 80.02203618135883 - type: euclidean_pearson value: 79.3314553444783 - type: euclidean_spearman value: 80.04556415585255 - type: main_score value: 80.87063761195603 task: type: STS - dataset: config: default name: MTEB T2Reranking revision: 76631901a18387f85eaa53e5450019b87ad58ef9 split: dev type: C-MTEB/T2Reranking metrics: - type: map value: 67.47921173708028 - type: mrr value: 77.9396513739777 - type: main_score value: 67.47921173708028 task: type: Reranking - dataset: config: default name: MTEB T2Retrieval revision: 8731a845f1bf500a4f111cf1070785c793d10e64 split: dev type: C-MTEB/T2Retrieval metrics: - type: map_at_1 value: 28.021 - type: map_at_10 value: 79.149 - type: map_at_100 value: 82.613 - type: map_at_1000 value: 82.67099999999999 - type: map_at_3 value: 55.665 - type: map_at_5 value: 68.46900000000001 - type: mrr_at_1 value: 91.106 - type: mrr_at_10 value: 93.372 - type: mrr_at_100 value: 93.44200000000001 - type: mrr_at_1000 value: 93.445 - type: mrr_at_3 value: 92.99300000000001 - type: mrr_at_5 value: 93.24900000000001 - type: ndcg_at_1 value: 91.106 - type: ndcg_at_10 value: 86.259 - type: ndcg_at_100 value: 89.46600000000001 - type: ndcg_at_1000 value: 90.012 - type: ndcg_at_3 value: 87.574 - type: ndcg_at_5 value: 86.283 - type: precision_at_1 value: 91.106 - type: precision_at_10 value: 42.742999999999995 - type: precision_at_100 value: 5.029999999999999 - type: precision_at_1000 value: 0.516 - type: precision_at_3 value: 76.593 - type: precision_at_5 value: 64.243 - type: recall_at_1 value: 28.021 - type: recall_at_10 value: 85.184 - type: recall_at_100 value: 95.79299999999999 - type: recall_at_1000 value: 98.547 - type: recall_at_3 value: 57.233000000000004 - type: recall_at_5 value: 71.628 - type: main_score value: 86.259 task: type: Retrieval - dataset: config: default name: MTEB TNews revision: 317f262bf1e6126357bbe89e875451e4b0938fe4 split: validation type: C-MTEB/TNews-classification metrics: - type: accuracy value: 50.255 - type: accuracy_stderr value: 0.9341868121526873 - type: f1 value: 48.65080322457893 - type: f1_stderr value: 0.9391547591179161 - type: main_score value: 50.255 task: type: Classification - dataset: config: default name: MTEB ThuNewsClusteringP2P revision: 5798586b105c0434e4f0fe5e767abe619442cf93 split: test type: C-MTEB/ThuNewsClusteringP2P metrics: - type: main_score value: 64.32076022871308 - type: v_measure value: 64.32076022871308 - type: v_measure_std value: 0.7190996709617924 task: type: Clustering - dataset: config: default name: MTEB ThuNewsClusteringS2S revision: 8a8b2caeda43f39e13c4bc5bea0f8a667896e10d split: test type: C-MTEB/ThuNewsClusteringS2S metrics: - type: main_score value: 54.57080911705562 - type: v_measure value: 54.57080911705562 - type: v_measure_std value: 1.5185826402845883 task: type: Clustering - dataset: config: default name: MTEB VideoRetrieval revision: 58c2597a5943a2ba48f4668c3b90d796283c5639 split: dev type: C-MTEB/VideoRetrieval metrics: - type: map_at_1 value: 63.1 - type: map_at_10 value: 73.137 - type: map_at_100 value: 73.539 - type: map_at_1000 value: 73.546 - type: map_at_3 value: 71.467 - type: map_at_5 value: 72.552 - type: mrr_at_1 value: 63.3 - type: mrr_at_10 value: 73.238 - type: mrr_at_100 value: 73.64 - type: mrr_at_1000 value: 73.64699999999999 - type: mrr_at_3 value: 71.56700000000001 - type: mrr_at_5 value: 72.652 - type: ndcg_at_1 value: 63.1 
- type: ndcg_at_10 value: 77.397 - type: ndcg_at_100 value: 79.11399999999999 - type: ndcg_at_1000 value: 79.305 - type: ndcg_at_3 value: 74.031 - type: ndcg_at_5 value: 75.976 - type: precision_at_1 value: 63.1 - type: precision_at_10 value: 9.049999999999999 - type: precision_at_100 value: 0.98 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 27.133000000000003 - type: precision_at_5 value: 17.22 - type: recall_at_1 value: 63.1 - type: recall_at_10 value: 90.5 - type: recall_at_100 value: 98.0 - type: recall_at_1000 value: 99.5 - type: recall_at_3 value: 81.39999999999999 - type: recall_at_5 value: 86.1 - type: main_score value: 77.397 task: type: Retrieval - dataset: config: default name: MTEB Waimai revision: 339287def212450dcaa9df8c22bf93e9980c7023 split: test type: C-MTEB/waimai-classification metrics: - type: accuracy value: 89.26 - type: accuracy_stderr value: 1.44651304867948 - type: ap value: 75.17154345788362 - type: ap_stderr value: 2.7356371110082565 - type: f1 value: 87.94016849813178 - type: f1_stderr value: 1.3897605039980534 - type: main_score value: 89.26 task: type: Classification - dataset: config: default name: MTEB AlloProfClusteringP2P revision: 392ba3f5bcc8c51f578786c1fc3dae648662cb9b split: test type: lyon-nlp/alloprof metrics: - type: main_score value: 71.20310003742769 - type: v_measure value: 71.20310003742769 - type: v_measure_std value: 2.3682783706448687 task: type: Clustering - dataset: config: default name: MTEB AlloProfClusteringS2S revision: 392ba3f5bcc8c51f578786c1fc3dae648662cb9b split: test type: lyon-nlp/alloprof metrics: - type: main_score value: 59.64232194434788 - type: v_measure value: 59.64232194434788 - type: v_measure_std value: 2.4292956011867557 task: type: Clustering - dataset: config: default name: MTEB AlloprofReranking revision: 65393d0d7a08a10b4e348135e824f385d420b0fd split: test type: lyon-nlp/mteb-fr-reranking-alloprof-s2p metrics: - type: main_score value: 78.62041803111894 - type: map value: 78.62041803111894 - type: mrr value: 79.82309057762426 - type: nAUC_map_diff1 value: 58.23586953459263 - type: nAUC_map_max value: 16.162821346484357 - type: nAUC_map_std value: 20.727030444422525 - type: nAUC_mrr_diff1 value: 57.89675675999501 - type: nAUC_mrr_max value: 17.188359535738417 - type: nAUC_mrr_std value: 20.121404571879598 task: type: Reranking - dataset: config: default name: MTEB AlloprofRetrieval revision: fcf295ea64c750f41fadbaa37b9b861558e1bfbd split: test type: lyon-nlp/alloprof metrics: - type: main_score value: 58.499 - type: map_at_1 value: 40.371 - type: map_at_10 value: 52.337 - type: map_at_100 value: 53.04 - type: map_at_1000 value: 53.065 - type: map_at_20 value: 52.772 - type: map_at_3 value: 49.201 - type: map_at_5 value: 51.025 - type: mrr_at_1 value: 40.3713298791019 - type: mrr_at_10 value: 52.322165337061755 - type: mrr_at_100 value: 53.02092832847133 - type: mrr_at_1000 value: 53.04594680215603 - type: mrr_at_20 value: 52.750849914358135 - type: mrr_at_3 value: 49.150834772596475 - type: mrr_at_5 value: 50.998848589522275 - type: nauc_map_at_1000_diff1 value: 44.71946249374932 - type: nauc_map_at_1000_max value: 28.074204125714193 - type: nauc_map_at_1000_std value: -5.1319087890196275 - type: nauc_map_at_100_diff1 value: 44.71140286780233 - type: nauc_map_at_100_max value: 28.09677884622645 - type: nauc_map_at_100_std value: -5.116353867480612 - type: nauc_map_at_10_diff1 value: 44.737968596047736 - type: nauc_map_at_10_max value: 28.103186472557184 - type: nauc_map_at_10_std value: 
-5.258817287329683 - type: nauc_map_at_1_diff1 value: 47.48389890056789 - type: nauc_map_at_1_max value: 24.803734709402654 - type: nauc_map_at_1_std value: -6.504759899363267 - type: nauc_map_at_20_diff1 value: 44.67268454863271 - type: nauc_map_at_20_max value: 28.068912295976933 - type: nauc_map_at_20_std value: -5.1971060419801836 - type: nauc_map_at_3_diff1 value: 44.59399231542881 - type: nauc_map_at_3_max value: 27.097806786915502 - type: nauc_map_at_3_std value: -5.957120508111229 - type: nauc_map_at_5_diff1 value: 44.549807218619236 - type: nauc_map_at_5_max value: 28.03902312965202 - type: nauc_map_at_5_std value: -5.279585300980128 - type: nauc_mrr_at_1000_diff1 value: 44.70183532803094 - type: nauc_mrr_at_1000_max value: 28.08833759937601 - type: nauc_mrr_at_1000_std value: -5.097929115475795 - type: nauc_mrr_at_100_diff1 value: 44.693824401340684 - type: nauc_mrr_at_100_max value: 28.110898009292296 - type: nauc_mrr_at_100_std value: -5.082401300601749 - type: nauc_mrr_at_10_diff1 value: 44.74052791862188 - type: nauc_mrr_at_10_max value: 28.125378341430725 - type: nauc_mrr_at_10_std value: -5.209767905428716 - type: nauc_mrr_at_1_diff1 value: 47.48389890056789 - type: nauc_mrr_at_1_max value: 24.803734709402654 - type: nauc_mrr_at_1_std value: -6.504759899363267 - type: nauc_mrr_at_20_diff1 value: 44.65204014980107 - type: nauc_mrr_at_20_max value: 28.071523791101487 - type: nauc_mrr_at_20_std value: -5.176680495032765 - type: nauc_mrr_at_3_diff1 value: 44.566371489967835 - type: nauc_mrr_at_3_max value: 27.138418179089243 - type: nauc_mrr_at_3_std value: -5.8860676927947715 - type: nauc_mrr_at_5_diff1 value: 44.513022796226025 - type: nauc_mrr_at_5_max value: 28.037968016529184 - type: nauc_mrr_at_5_std value: -5.286851060853457 - type: nauc_ndcg_at_1000_diff1 value: 44.31019947897497 - type: nauc_ndcg_at_1000_max value: 29.332844099450185 - type: nauc_ndcg_at_1000_std value: -4.185675731246788 - type: nauc_ndcg_at_100_diff1 value: 44.15415366286996 - type: nauc_ndcg_at_100_max value: 30.098413084162345 - type: nauc_ndcg_at_100_std value: -3.557438303045246 - type: nauc_ndcg_at_10_diff1 value: 44.117356815361376 - type: nauc_ndcg_at_10_max value: 30.090057186506147 - type: nauc_ndcg_at_10_std value: -4.294561567142078 - type: nauc_ndcg_at_1_diff1 value: 47.48389890056789 - type: nauc_ndcg_at_1_max value: 24.803734709402654 - type: nauc_ndcg_at_1_std value: -6.504759899363267 - type: nauc_ndcg_at_20_diff1 value: 43.868556983413285 - type: nauc_ndcg_at_20_max value: 30.06455269775592 - type: nauc_ndcg_at_20_std value: -3.9645560243946623 - type: nauc_ndcg_at_3_diff1 value: 43.71970793339256 - type: nauc_ndcg_at_3_max value: 28.057786581438034 - type: nauc_ndcg_at_3_std value: -5.597352364190012 - type: nauc_ndcg_at_5_diff1 value: 43.57692922989753 - type: nauc_ndcg_at_5_max value: 29.811975056854994 - type: nauc_ndcg_at_5_std value: -4.362865924703688 - type: nauc_precision_at_1000_diff1 value: 37.65255144893002 - type: nauc_precision_at_1000_max value: 88.70768683938714 - type: nauc_precision_at_1000_std value: 69.77642765639528 - type: nauc_precision_at_100_diff1 value: 38.99412121382678 - type: nauc_precision_at_100_max value: 61.57652450016459 - type: nauc_precision_at_100_std value: 24.826035139656348 - type: nauc_precision_at_10_diff1 value: 41.78189732924517 - type: nauc_precision_at_10_max value: 39.83536802453079 - type: nauc_precision_at_10_std value: 0.431964006091015 - type: nauc_precision_at_1_diff1 value: 47.48389890056789 - type: nauc_precision_at_1_max value: 
24.803734709402654 - type: nauc_precision_at_1_std value: -6.504759899363267 - type: nauc_precision_at_20_diff1 value: 39.33781305274886 - type: nauc_precision_at_20_max value: 43.00448814568695 - type: nauc_precision_at_20_std value: 4.5633424143661365 - type: nauc_precision_at_3_diff1 value: 40.99977742505519 - type: nauc_precision_at_3_max value: 31.14585236181214 - type: nauc_precision_at_3_std value: -4.404002104899136 - type: nauc_precision_at_5_diff1 value: 40.12130730401297 - type: nauc_precision_at_5_max value: 36.45000981581976 - type: nauc_precision_at_5_std value: -0.8603896798394983 - type: nauc_recall_at_1000_diff1 value: 37.652551448927504 - type: nauc_recall_at_1000_max value: 88.70768683938547 - type: nauc_recall_at_1000_std value: 69.77642765638893 - type: nauc_recall_at_100_diff1 value: 38.9941212138267 - type: nauc_recall_at_100_max value: 61.57652450016457 - type: nauc_recall_at_100_std value: 24.82603513965631 - type: nauc_recall_at_10_diff1 value: 41.781897329245105 - type: nauc_recall_at_10_max value: 39.83536802453082 - type: nauc_recall_at_10_std value: 0.4319640060909985 - type: nauc_recall_at_1_diff1 value: 47.48389890056789 - type: nauc_recall_at_1_max value: 24.803734709402654 - type: nauc_recall_at_1_std value: -6.504759899363267 - type: nauc_recall_at_20_diff1 value: 39.337813052748835 - type: nauc_recall_at_20_max value: 43.00448814568676 - type: nauc_recall_at_20_std value: 4.56334241436601 - type: nauc_recall_at_3_diff1 value: 40.99977742505522 - type: nauc_recall_at_3_max value: 31.14585236181218 - type: nauc_recall_at_3_std value: -4.404002104899084 - type: nauc_recall_at_5_diff1 value: 40.121307304013 - type: nauc_recall_at_5_max value: 36.450009815819726 - type: nauc_recall_at_5_std value: -0.8603896798395225 - type: ndcg_at_1 value: 40.371 - type: ndcg_at_10 value: 58.499 - type: ndcg_at_100 value: 61.958 - type: ndcg_at_1000 value: 62.638000000000005 - type: ndcg_at_20 value: 60.068 - type: ndcg_at_3 value: 52.079 - type: ndcg_at_5 value: 55.359 - type: precision_at_1 value: 40.371 - type: precision_at_10 value: 7.797999999999999 - type: precision_at_100 value: 0.943 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 4.208 - type: precision_at_3 value: 20.135 - type: precision_at_5 value: 13.669999999999998 - type: recall_at_1 value: 40.371 - type: recall_at_10 value: 77.979 - type: recall_at_100 value: 94.257 - type: recall_at_1000 value: 99.655 - type: recall_at_20 value: 84.154 - type: recall_at_3 value: 60.406000000000006 - type: recall_at_5 value: 68.351 task: type: Retrieval - dataset: config: fr name: MTEB AmazonReviewsClassification (fr) revision: 1399c76144fd37290681b995c656ef9b2e06e26d split: test type: mteb/amazon_reviews_multi metrics: - type: accuracy value: 55.186 - type: f1 value: 54.46705535013317 - type: f1_weighted value: 54.46705535013317 - type: main_score value: 55.186 task: type: Classification - dataset: config: default name: MTEB BSARDRetrieval revision: 5effa1b9b5fa3b0f9e12523e6e43e5f86a6e6d59 split: test type: maastrichtlawtech/bsard metrics: - type: main_score value: 65.766 - type: map_at_1 value: 17.116999999999997 - type: map_at_10 value: 24.2 - type: map_at_100 value: 25.196 - type: map_at_1000 value: 25.285999999999998 - type: map_at_20 value: 24.84 - type: map_at_3 value: 21.246000000000002 - type: map_at_5 value: 23.386000000000003 - type: mrr_at_1 value: 17.117117117117118 - type: mrr_at_10 value: 24.19955669955671 - type: mrr_at_100 value: 25.195531920335007 - type: mrr_at_1000 value: 
25.284600511909495 - type: mrr_at_20 value: 24.840254977638896 - type: mrr_at_3 value: 21.246246246246244 - type: mrr_at_5 value: 23.38588588588589 - type: nauc_map_at_1000_diff1 value: 10.81116818873305 - type: nauc_map_at_1000_max value: 18.081485212587296 - type: nauc_map_at_1000_std value: 15.55247182359811 - type: nauc_map_at_100_diff1 value: 10.769025561727476 - type: nauc_map_at_100_max value: 18.05422658310923 - type: nauc_map_at_100_std value: 15.5467718904851 - type: nauc_map_at_10_diff1 value: 10.683272018434048 - type: nauc_map_at_10_max value: 18.142476171157714 - type: nauc_map_at_10_std value: 15.160871943210017 - type: nauc_map_at_1_diff1 value: 15.136874216646229 - type: nauc_map_at_1_max value: 19.68585969419655 - type: nauc_map_at_1_std value: 15.169957564848444 - type: nauc_map_at_20_diff1 value: 11.04316522915875 - type: nauc_map_at_20_max value: 17.817024791267443 - type: nauc_map_at_20_std value: 15.071246935999893 - type: nauc_map_at_3_diff1 value: 8.893328353778843 - type: nauc_map_at_3_max value: 16.402408590507946 - type: nauc_map_at_3_std value: 14.631998787185735 - type: nauc_map_at_5_diff1 value: 9.802455874823172 - type: nauc_map_at_5_max value: 17.939476196078495 - type: nauc_map_at_5_std value: 14.130589132632698 - type: nauc_mrr_at_1000_diff1 value: 10.813072323683013 - type: nauc_mrr_at_1000_max value: 18.08332318614462 - type: nauc_mrr_at_1000_std value: 15.553043223942819 - type: nauc_mrr_at_100_diff1 value: 10.77091057430458 - type: nauc_mrr_at_100_max value: 18.055798185778123 - type: nauc_mrr_at_100_std value: 15.547068262312003 - type: nauc_mrr_at_10_diff1 value: 10.683272018434048 - type: nauc_mrr_at_10_max value: 18.142476171157714 - type: nauc_mrr_at_10_std value: 15.160871943210017 - type: nauc_mrr_at_1_diff1 value: 15.136874216646229 - type: nauc_mrr_at_1_max value: 19.68585969419655 - type: nauc_mrr_at_1_std value: 15.169957564848444 - type: nauc_mrr_at_20_diff1 value: 11.04316522915875 - type: nauc_mrr_at_20_max value: 17.817024791267443 - type: nauc_mrr_at_20_std value: 15.071246935999893 - type: nauc_mrr_at_3_diff1 value: 8.893328353778843 - type: nauc_mrr_at_3_max value: 16.402408590507946 - type: nauc_mrr_at_3_std value: 14.631998787185735 - type: nauc_mrr_at_5_diff1 value: 9.802455874823172 - type: nauc_mrr_at_5_max value: 17.939476196078495 - type: nauc_mrr_at_5_std value: 14.130589132632698 - type: nauc_ndcg_at_1000_diff1 value: 11.202853727201774 - type: nauc_ndcg_at_1000_max value: 19.0293189527563 - type: nauc_ndcg_at_1000_std value: 18.390388750658357 - type: nauc_ndcg_at_100_diff1 value: 10.087335018055228 - type: nauc_ndcg_at_100_max value: 18.78516003607274 - type: nauc_ndcg_at_100_std value: 18.780357674944415 - type: nauc_ndcg_at_10_diff1 value: 10.574953671198443 - type: nauc_ndcg_at_10_max value: 18.572291623672044 - type: nauc_ndcg_at_10_std value: 15.808055075116057 - type: nauc_ndcg_at_1_diff1 value: 15.136874216646229 - type: nauc_ndcg_at_1_max value: 19.68585969419655 - type: nauc_ndcg_at_1_std value: 15.169957564848444 - type: nauc_ndcg_at_20_diff1 value: 11.86104023461335 - type: nauc_ndcg_at_20_max value: 17.436985589044458 - type: nauc_ndcg_at_20_std value: 15.588720372098383 - type: nauc_ndcg_at_3_diff1 value: 7.212552449189805 - type: nauc_ndcg_at_3_max value: 15.573909877641508 - type: nauc_ndcg_at_3_std value: 14.53705493856145 - type: nauc_ndcg_at_5_diff1 value: 8.778923731622235 - type: nauc_ndcg_at_5_max value: 18.140995131168534 - type: nauc_ndcg_at_5_std value: 13.608313703781533 - type: 
nauc_precision_at_1000_diff1 value: 21.242679241621413 - type: nauc_precision_at_1000_max value: 28.358433127289924 - type: nauc_precision_at_1000_std value: 43.82822797432329 - type: nauc_precision_at_100_diff1 value: 6.627014646720404 - type: nauc_precision_at_100_max value: 22.40433487802035 - type: nauc_precision_at_100_std value: 34.933889742457595 - type: nauc_precision_at_10_diff1 value: 10.885683410075934 - type: nauc_precision_at_10_max value: 19.96889041019717 - type: nauc_precision_at_10_std value: 17.798863824564464 - type: nauc_precision_at_1_diff1 value: 15.136874216646229 - type: nauc_precision_at_1_max value: 19.68585969419655 - type: nauc_precision_at_1_std value: 15.169957564848444 - type: nauc_precision_at_20_diff1 value: 15.496066928172066 - type: nauc_precision_at_20_max value: 16.03026652303162 - type: nauc_precision_at_20_std value: 17.26605341902364 - type: nauc_precision_at_3_diff1 value: 2.968469300914268 - type: nauc_precision_at_3_max value: 13.49791571660617 - type: nauc_precision_at_3_std value: 14.311739399090806 - type: nauc_precision_at_5_diff1 value: 6.502154730668018 - type: nauc_precision_at_5_max value: 18.889080152631124 - type: nauc_precision_at_5_std value: 12.221319698087786 - type: nauc_recall_at_1000_diff1 value: 21.242679241621435 - type: nauc_recall_at_1000_max value: 28.358433127289974 - type: nauc_recall_at_1000_std value: 43.82822797432328 - type: nauc_recall_at_100_diff1 value: 6.62701464672039 - type: nauc_recall_at_100_max value: 22.404334878020286 - type: nauc_recall_at_100_std value: 34.93388974245755 - type: nauc_recall_at_10_diff1 value: 10.885683410075906 - type: nauc_recall_at_10_max value: 19.968890410197133 - type: nauc_recall_at_10_std value: 17.7988638245644 - type: nauc_recall_at_1_diff1 value: 15.136874216646229 - type: nauc_recall_at_1_max value: 19.68585969419655 - type: nauc_recall_at_1_std value: 15.169957564848444 - type: nauc_recall_at_20_diff1 value: 15.49606692817206 - type: nauc_recall_at_20_max value: 16.030266523031628 - type: nauc_recall_at_20_std value: 17.26605341902362 - type: nauc_recall_at_3_diff1 value: 2.968469300914263 - type: nauc_recall_at_3_max value: 13.497915716606142 - type: nauc_recall_at_3_std value: 14.31173939909079 - type: nauc_recall_at_5_diff1 value: 6.50215473066801 - type: nauc_recall_at_5_max value: 18.889080152631095 - type: nauc_recall_at_5_std value: 12.221319698087767 - type: ndcg_at_1 value: 17.116999999999997 - type: ndcg_at_10 value: 28.524 - type: ndcg_at_100 value: 33.476 - type: ndcg_at_1000 value: 36.012 - type: ndcg_at_20 value: 30.820999999999998 - type: ndcg_at_3 value: 22.721 - type: ndcg_at_5 value: 26.596999999999998 - type: precision_at_1 value: 17.116999999999997 - type: precision_at_10 value: 4.234 - type: precision_at_100 value: 0.658 - type: precision_at_1000 value: 0.086 - type: precision_at_20 value: 2.568 - type: precision_at_3 value: 9.009 - type: precision_at_5 value: 7.297 - type: recall_at_1 value: 17.116999999999997 - type: recall_at_10 value: 42.342 - type: recall_at_100 value: 65.766 - type: recall_at_1000 value: 86.036 - type: recall_at_20 value: 51.351 - type: recall_at_3 value: 27.027 - type: recall_at_5 value: 36.486000000000004 task: type: Retrieval - dataset: config: default name: MTEB HALClusteringS2S revision: e06ebbbb123f8144bef1a5d18796f3dec9ae2915 split: test type: lyon-nlp/clustering-hal-s2s metrics: - type: main_score value: 28.18744772954557 - type: v_measure value: 28.18744772954557 - type: v_measure_std value: 3.239838057506439 task: type: 
Clustering - dataset: config: fr name: MTEB MLSUMClusteringP2P (fr) revision: b5d54f8f3b61ae17845046286940f03c6bc79bc7 split: test type: reciTAL/mlsum metrics: - type: main_score value: 47.75009059283003 - type: v_measure value: 47.75009059283003 - type: v_measure_std value: 2.009277732690298 task: type: Clustering - dataset: config: fr name: MTEB MLSUMClusteringS2S (fr) revision: b5d54f8f3b61ae17845046286940f03c6bc79bc7 split: test type: reciTAL/mlsum metrics: - type: main_score value: 47.46091989113078 - type: v_measure value: 47.46091989113078 - type: v_measure_std value: 2.604802270948194 task: type: Clustering - dataset: config: fr name: MTEB MTOPDomainClassification (fr) revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf split: test type: mteb/mtop_domain metrics: - type: accuracy value: 97.20325712496086 - type: f1 value: 97.05991090368462 - type: f1_weighted value: 97.20748006323807 - type: main_score value: 97.20325712496086 task: type: Classification - dataset: config: fr name: MTEB MTOPIntentClassification (fr) revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba split: test type: mteb/mtop_intent metrics: - type: accuracy value: 93.07234575634199 - type: f1 value: 76.54521288506878 - type: f1_weighted value: 93.6903586431893 - type: main_score value: 93.07234575634199 task: type: Classification - dataset: config: fra name: MTEB MasakhaNEWSClassification (fra) revision: 18193f187b92da67168c655c9973a165ed9593dd split: test type: mteb/masakhanews metrics: - type: accuracy value: 82.48815165876778 - type: f1 value: 78.71164464238117 - type: f1_weighted value: 82.38927389376973 - type: main_score value: 82.48815165876778 task: type: Classification - dataset: config: fra name: MTEB MasakhaNEWSClusteringP2P (fra) revision: 8ccc72e69e65f40c70e117d8b3c08306bb788b60 split: test type: masakhane/masakhanews metrics: - type: main_score value: 73.85712952800003 - type: v_measure value: 73.85712952800003 - type: v_measure_std value: 22.471668299794416 task: type: Clustering - dataset: config: fra name: MTEB MasakhaNEWSClusteringS2S (fra) revision: 8ccc72e69e65f40c70e117d8b3c08306bb788b60 split: test type: masakhane/masakhanews metrics: - type: main_score value: 67.23960512566751 - type: v_measure value: 67.23960512566751 - type: v_measure_std value: 24.65079601360142 task: type: Clustering - dataset: config: fr name: MTEB MassiveIntentClassification (fr) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 79.59986550100874 - type: f1 value: 76.0439154517916 - type: f1_weighted value: 79.48538292013761 - type: main_score value: 79.59986550100874 task: type: Classification - dataset: config: fr name: MTEB MassiveScenarioClassification (fr) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 82.182246133154 - type: f1 value: 81.68006668655397 - type: f1_weighted value: 81.94775072858566 - type: main_score value: 82.182246133154 task: type: Classification - dataset: config: fr name: MTEB MintakaRetrieval (fr) revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e split: test type: jinaai/mintakaqa metrics: - type: main_score value: 62.532 - type: map_at_1 value: 45.823 - type: map_at_10 value: 57.174 - type: map_at_100 value: 57.735 - type: map_at_1000 value: 57.767 - type: map_at_20 value: 57.53 - type: map_at_3 value: 54.716 - type: map_at_5 value: 56.227000000000004 - type: mrr_at_1 value: 45.82309582309582 - type: mrr_at_10 value: 
57.17958217958217 - type: mrr_at_100 value: 57.744059413627866 - type: mrr_at_1000 value: 57.776651992832605 - type: mrr_at_20 value: 57.53890924556554 - type: mrr_at_3 value: 54.716079716079676 - type: mrr_at_5 value: 56.227136227136256 - type: nauc_map_at_1000_diff1 value: 39.48401851944296 - type: nauc_map_at_1000_max value: 36.55276875160682 - type: nauc_map_at_1000_std value: 3.9173787361040913 - type: nauc_map_at_100_diff1 value: 39.45696514871956 - type: nauc_map_at_100_max value: 36.55786982498759 - type: nauc_map_at_100_std value: 3.9506714061766557 - type: nauc_map_at_10_diff1 value: 39.31548009319837 - type: nauc_map_at_10_max value: 36.75711871602276 - type: nauc_map_at_10_std value: 3.782911249250981 - type: nauc_map_at_1_diff1 value: 44.190649439568766 - type: nauc_map_at_1_max value: 31.017419446234317 - type: nauc_map_at_1_std value: 0.5544388561183956 - type: nauc_map_at_20_diff1 value: 39.443640617310585 - type: nauc_map_at_20_max value: 36.63799366674228 - type: nauc_map_at_20_std value: 3.934276303386171 - type: nauc_map_at_3_diff1 value: 40.30871768246873 - type: nauc_map_at_3_max value: 36.944169455458656 - type: nauc_map_at_3_std value: 2.9847330185694556 - type: nauc_map_at_5_diff1 value: 39.590461060438095 - type: nauc_map_at_5_max value: 36.998781454405574 - type: nauc_map_at_5_std value: 3.532693606637119 - type: nauc_mrr_at_1000_diff1 value: 39.46102363098429 - type: nauc_mrr_at_1000_max value: 36.56900606103558 - type: nauc_mrr_at_1000_std value: 3.972436075561705 - type: nauc_mrr_at_100_diff1 value: 39.43269261665982 - type: nauc_mrr_at_100_max value: 36.574081599242014 - type: nauc_mrr_at_100_std value: 4.006374171904806 - type: nauc_mrr_at_10_diff1 value: 39.29970560564493 - type: nauc_mrr_at_10_max value: 36.778388879484716 - type: nauc_mrr_at_10_std value: 3.8335456201567206 - type: nauc_mrr_at_1_diff1 value: 44.190649439568766 - type: nauc_mrr_at_1_max value: 31.017419446234317 - type: nauc_mrr_at_1_std value: 0.5544388561183956 - type: nauc_mrr_at_20_diff1 value: 39.42091158484574 - type: nauc_mrr_at_20_max value: 36.65421566061936 - type: nauc_mrr_at_20_std value: 3.988695948848555 - type: nauc_mrr_at_3_diff1 value: 40.313976315898195 - type: nauc_mrr_at_3_max value: 36.960483501441985 - type: nauc_mrr_at_3_std value: 3.0112756156560394 - type: nauc_mrr_at_5_diff1 value: 39.56386294620379 - type: nauc_mrr_at_5_max value: 37.02119815939672 - type: nauc_mrr_at_5_std value: 3.6118004205573184 - type: nauc_ndcg_at_1000_diff1 value: 38.05281585863137 - type: nauc_ndcg_at_1000_max value: 37.41178875860201 - type: nauc_ndcg_at_1000_std value: 5.525420555163393 - type: nauc_ndcg_at_100_diff1 value: 37.18408005856676 - type: nauc_ndcg_at_100_max value: 37.617851212997685 - type: nauc_ndcg_at_100_std value: 6.871461890669446 - type: nauc_ndcg_at_10_diff1 value: 36.624444841382484 - type: nauc_ndcg_at_10_max value: 38.62100324849529 - type: nauc_ndcg_at_10_std value: 6.027810657475449 - type: nauc_ndcg_at_1_diff1 value: 44.190649439568766 - type: nauc_ndcg_at_1_max value: 31.017419446234317 - type: nauc_ndcg_at_1_std value: 0.5544388561183956 - type: nauc_ndcg_at_20_diff1 value: 37.057047514121564 - type: nauc_ndcg_at_20_max value: 38.19839331454421 - type: nauc_ndcg_at_20_std value: 6.770369938343684 - type: nauc_ndcg_at_3_diff1 value: 38.95821428563954 - type: nauc_ndcg_at_3_max value: 38.87440219376017 - type: nauc_ndcg_at_3_std value: 4.097498274708613 - type: nauc_ndcg_at_5_diff1 value: 37.515589837182034 - type: nauc_ndcg_at_5_max value: 39.165561493023276 - 
type: nauc_ndcg_at_5_std value: 5.291512124344874 - type: nauc_precision_at_1000_diff1 value: -13.365474882749279 - type: nauc_precision_at_1000_max value: 50.68568417959442 - type: nauc_precision_at_1000_std value: 37.847145129019054 - type: nauc_precision_at_100_diff1 value: 12.081443207482383 - type: nauc_precision_at_100_max value: 43.67561356191485 - type: nauc_precision_at_100_std value: 44.64523987759538 - type: nauc_precision_at_10_diff1 value: 23.20358204183261 - type: nauc_precision_at_10_max value: 46.93706139285088 - type: nauc_precision_at_10_std value: 17.36243956517301 - type: nauc_precision_at_1_diff1 value: 44.190649439568766 - type: nauc_precision_at_1_max value: 31.017419446234317 - type: nauc_precision_at_1_std value: 0.5544388561183956 - type: nauc_precision_at_20_diff1 value: 22.42836999246196 - type: nauc_precision_at_20_max value: 46.29381413041759 - type: nauc_precision_at_20_std value: 26.126609401922696 - type: nauc_precision_at_3_diff1 value: 34.503018704702484 - type: nauc_precision_at_3_max value: 45.194775358016095 - type: nauc_precision_at_3_std value: 7.864444241838433 - type: nauc_precision_at_5_diff1 value: 29.494641243672138 - type: nauc_precision_at_5_max value: 47.326071718857484 - type: nauc_precision_at_5_std value: 12.273738036245172 - type: nauc_recall_at_1000_diff1 value: -13.365474882756335 - type: nauc_recall_at_1000_max value: 50.68568417959348 - type: nauc_recall_at_1000_std value: 37.8471451290128 - type: nauc_recall_at_100_diff1 value: 12.08144320748251 - type: nauc_recall_at_100_max value: 43.675613561914986 - type: nauc_recall_at_100_std value: 44.645239877595564 - type: nauc_recall_at_10_diff1 value: 23.203582041832526 - type: nauc_recall_at_10_max value: 46.9370613928509 - type: nauc_recall_at_10_std value: 17.36243956517297 - type: nauc_recall_at_1_diff1 value: 44.190649439568766 - type: nauc_recall_at_1_max value: 31.017419446234317 - type: nauc_recall_at_1_std value: 0.5544388561183956 - type: nauc_recall_at_20_diff1 value: 22.42836999246212 - type: nauc_recall_at_20_max value: 46.29381413041773 - type: nauc_recall_at_20_std value: 26.12660940192268 - type: nauc_recall_at_3_diff1 value: 34.50301870470248 - type: nauc_recall_at_3_max value: 45.19477535801611 - type: nauc_recall_at_3_std value: 7.8644442418384335 - type: nauc_recall_at_5_diff1 value: 29.494641243672216 - type: nauc_recall_at_5_max value: 47.32607171885759 - type: nauc_recall_at_5_std value: 12.273738036245142 - type: ndcg_at_1 value: 45.823 - type: ndcg_at_10 value: 62.532 - type: ndcg_at_100 value: 65.298 - type: ndcg_at_1000 value: 66.214 - type: ndcg_at_20 value: 63.82600000000001 - type: ndcg_at_3 value: 57.528999999999996 - type: ndcg_at_5 value: 60.24 - type: precision_at_1 value: 45.823 - type: precision_at_10 value: 7.928 - type: precision_at_100 value: 0.923 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 4.22 - type: precision_at_3 value: 21.881 - type: precision_at_5 value: 14.438999999999998 - type: recall_at_1 value: 45.823 - type: recall_at_10 value: 79.279 - type: recall_at_100 value: 92.301 - type: recall_at_1000 value: 99.631 - type: recall_at_20 value: 84.398 - type: recall_at_3 value: 65.643 - type: recall_at_5 value: 72.195 task: type: Retrieval - dataset: config: fr name: MTEB OpusparcusPC (fr) revision: 9e9b1f8ef51616073f47f306f7f47dd91663f86a split: test type: GEM/opusparcus metrics: - type: cosine_accuracy value: 99.90069513406156 - type: cosine_accuracy_threshold value: 54.45001207375879 - type: cosine_ap value: 100.0 - type: 
cosine_f1 value: 99.95032290114257 - type: cosine_f1_threshold value: 54.45001207375879 - type: cosine_precision value: 100.0 - type: cosine_recall value: 99.90069513406156 - type: dot_accuracy value: 99.90069513406156 - type: dot_accuracy_threshold value: 1312800.0 - type: dot_ap value: 100.0 - type: dot_f1 value: 99.95032290114257 - type: dot_f1_threshold value: 1312800.0 - type: dot_precision value: 100.0 - type: dot_recall value: 99.90069513406156 - type: euclidean_accuracy value: 99.90069513406156 - type: euclidean_accuracy_threshold value: 15150.791732002876 - type: euclidean_ap value: 100.0 - type: euclidean_f1 value: 99.95032290114257 - type: euclidean_f1_threshold value: 15150.791732002876 - type: euclidean_precision value: 100.0 - type: euclidean_recall value: 99.90069513406156 - type: main_score value: 100.0 - type: manhattan_accuracy value: 99.90069513406156 - type: manhattan_accuracy_threshold value: 717903.2791554928 - type: manhattan_ap value: 100.0 - type: manhattan_f1 value: 99.95032290114257 - type: manhattan_f1_threshold value: 717903.2791554928 - type: manhattan_precision value: 100.0 - type: manhattan_recall value: 99.90069513406156 - type: max_ap value: 100.0 - type: max_f1 value: 99.95032290114257 - type: max_precision value: 100.0 - type: max_recall value: 99.90069513406156 - type: similarity_accuracy value: 99.90069513406156 - type: similarity_accuracy_threshold value: 54.45001207375879 - type: similarity_ap value: 100.0 - type: similarity_f1 value: 99.95032290114257 - type: similarity_f1_threshold value: 54.45001207375879 - type: similarity_precision value: 100.0 - type: similarity_recall value: 99.90069513406156 task: type: PairClassification - dataset: config: fr name: MTEB PawsXPairClassification (fr) revision: 8a04d940a42cd40658986fdd8e3da561533a3646 split: test type: google-research-datasets/paws-x metrics: - type: cosine_accuracy value: 67.95 - type: cosine_accuracy_threshold value: 97.36901285947026 - type: cosine_ap value: 70.14158727060726 - type: cosine_f1 value: 65.38108356290174 - type: cosine_f1_threshold value: 94.90683744884689 - type: cosine_precision value: 55.84313725490196 - type: cosine_recall value: 78.8482834994463 - type: dot_accuracy value: 60.5 - type: dot_accuracy_threshold value: 2606400.0 - type: dot_ap value: 57.0114505567262 - type: dot_f1 value: 63.29394387001477 - type: dot_f1_threshold value: 2345600.0 - type: dot_precision value: 47.4792243767313 - type: dot_recall value: 94.90586932447398 - type: euclidean_accuracy value: 68.05 - type: euclidean_accuracy_threshold value: 3824.99743197985 - type: euclidean_ap value: 70.01158306654237 - type: euclidean_f1 value: 65.21939953810623 - type: euclidean_f1_threshold value: 5187.47968966464 - type: euclidean_precision value: 55.942947702060216 - type: euclidean_recall value: 78.18383167220377 - type: main_score value: 70.14158727060726 - type: manhattan_accuracy value: 68.05 - type: manhattan_accuracy_threshold value: 191852.34832763672 - type: manhattan_ap value: 70.01670033904287 - type: manhattan_f1 value: 65.2854511970534 - type: manhattan_f1_threshold value: 246807.1710705757 - type: manhattan_precision value: 55.87076438140268 - type: manhattan_recall value: 78.51605758582502 - type: max_ap value: 70.14158727060726 - type: max_f1 value: 65.38108356290174 - type: max_precision value: 55.942947702060216 - type: max_recall value: 94.90586932447398 - type: similarity_accuracy value: 67.95 - type: similarity_accuracy_threshold value: 97.36901285947026 - type: similarity_ap value: 
70.14158727060726 - type: similarity_f1 value: 65.38108356290174 - type: similarity_f1_threshold value: 94.90683744884689 - type: similarity_precision value: 55.84313725490196 - type: similarity_recall value: 78.8482834994463 task: type: PairClassification - dataset: config: default name: MTEB SICKFr revision: e077ab4cf4774a1e36d86d593b150422fafd8e8a split: test type: Lajavaness/SICK-fr metrics: - type: cosine_pearson value: 79.79861486027 - type: cosine_spearman value: 79.3918786992987 - type: euclidean_pearson value: 77.73226212475764 - type: euclidean_spearman value: 79.08856888397014 - type: main_score value: 79.3918786992987 - type: manhattan_pearson value: 77.8002206650809 - type: manhattan_spearman value: 79.15284532531264 - type: pearson value: 79.79861486027 - type: spearman value: 79.3918786992987 task: type: STS - dataset: config: fr name: MTEB STS22 (fr) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 83.32314025534286 - type: cosine_spearman value: 83.2806004701507 - type: euclidean_pearson value: 81.88040500817269 - type: euclidean_spearman value: 82.73179823676206 - type: main_score value: 83.2806004701507 - type: manhattan_pearson value: 82.0438174605579 - type: manhattan_spearman value: 83.0253049811576 - type: pearson value: 83.32314025534286 - type: spearman value: 83.2806004701507 task: type: STS - dataset: config: fr name: MTEB STSBenchmarkMultilingualSTS (fr) revision: 29afa2569dcedaaa2fe6a3dcfebab33d28b82e8c split: test type: mteb/stsb_multi_mt metrics: - type: cosine_pearson value: 84.56723075054445 - type: cosine_spearman value: 85.08759191551403 - type: euclidean_pearson value: 83.186096744725 - type: euclidean_spearman value: 84.36958569816491 - type: main_score value: 85.08759191551403 - type: manhattan_pearson value: 83.1405072165467 - type: manhattan_spearman value: 84.34227830781155 - type: pearson value: 84.56723075054445 - type: spearman value: 85.08759191551403 task: type: STS - dataset: config: default name: MTEB SummEvalFr revision: b385812de6a9577b6f4d0f88c6a6e35395a94054 split: test type: lyon-nlp/summarization-summeval-fr-p2p metrics: - type: cosine_pearson value: 31.921764332449115 - type: cosine_spearman value: 31.260442997631806 - type: dot_pearson value: 31.585578707631406 - type: dot_spearman value: 31.479238746310028 - type: main_score value: 31.260442997631806 - type: pearson value: 31.921764332449115 - type: spearman value: 31.260442997631806 task: type: Summarization - dataset: config: default name: MTEB SyntecReranking revision: daf0863838cd9e3ba50544cdce3ac2b338a1b0ad split: test type: lyon-nlp/mteb-fr-reranking-syntec-s2p metrics: - type: main_score value: 91.83333333333333 - type: map value: 91.83333333333333 - type: mrr value: 92.0 - type: nAUC_map_diff1 value: 53.97793263646914 - type: nAUC_map_max value: 44.264158743282195 - type: nAUC_map_std value: 14.692218350754885 - type: nAUC_mrr_diff1 value: 54.36926882239366 - type: nAUC_mrr_max value: 46.43108510296003 - type: nAUC_mrr_std value: 17.48914092664096 task: type: Reranking - dataset: config: default name: MTEB SyntecRetrieval revision: 19661ccdca4dfc2d15122d776b61685f48c68ca9 split: test type: lyon-nlp/mteb-fr-retrieval-syntec-s2p metrics: - type: main_score value: 90.36699999999999 - type: map_at_1 value: 79.0 - type: map_at_10 value: 87.18599999999999 - type: map_at_100 value: 87.18599999999999 - type: map_at_1000 value: 87.18599999999999 - type: map_at_20 value: 87.18599999999999 - type: 
map_at_3 value: 86.0 - type: map_at_5 value: 86.95 - type: mrr_at_1 value: 79.0 - type: mrr_at_10 value: 87.18611111111112 - type: mrr_at_100 value: 87.18611111111112 - type: mrr_at_1000 value: 87.18611111111112 - type: mrr_at_20 value: 87.18611111111112 - type: mrr_at_3 value: 86.0 - type: mrr_at_5 value: 86.95 - type: nauc_map_at_1000_diff1 value: 63.05539428169271 - type: nauc_map_at_1000_max value: 45.428107132447124 - type: nauc_map_at_1000_std value: 13.94507583970834 - type: nauc_map_at_100_diff1 value: 63.05539428169271 - type: nauc_map_at_100_max value: 45.428107132447124 - type: nauc_map_at_100_std value: 13.94507583970834 - type: nauc_map_at_10_diff1 value: 63.05539428169271 - type: nauc_map_at_10_max value: 45.428107132447124 - type: nauc_map_at_10_std value: 13.94507583970834 - type: nauc_map_at_1_diff1 value: 64.24122923028831 - type: nauc_map_at_1_max value: 44.34077957053877 - type: nauc_map_at_1_std value: 9.594344386466878 - type: nauc_map_at_20_diff1 value: 63.05539428169271 - type: nauc_map_at_20_max value: 45.428107132447124 - type: nauc_map_at_20_std value: 13.94507583970834 - type: nauc_map_at_3_diff1 value: 62.30831315577075 - type: nauc_map_at_3_max value: 47.33980193586779 - type: nauc_map_at_3_std value: 16.132624025733 - type: nauc_map_at_5_diff1 value: 63.079622378971834 - type: nauc_map_at_5_max value: 45.13424437707254 - type: nauc_map_at_5_std value: 13.730785051570013 - type: nauc_mrr_at_1000_diff1 value: 63.05539428169271 - type: nauc_mrr_at_1000_max value: 45.428107132447124 - type: nauc_mrr_at_1000_std value: 13.94507583970834 - type: nauc_mrr_at_100_diff1 value: 63.05539428169271 - type: nauc_mrr_at_100_max value: 45.428107132447124 - type: nauc_mrr_at_100_std value: 13.94507583970834 - type: nauc_mrr_at_10_diff1 value: 63.05539428169271 - type: nauc_mrr_at_10_max value: 45.428107132447124 - type: nauc_mrr_at_10_std value: 13.94507583970834 - type: nauc_mrr_at_1_diff1 value: 64.24122923028831 - type: nauc_mrr_at_1_max value: 44.34077957053877 - type: nauc_mrr_at_1_std value: 9.594344386466878 - type: nauc_mrr_at_20_diff1 value: 63.05539428169271 - type: nauc_mrr_at_20_max value: 45.428107132447124 - type: nauc_mrr_at_20_std value: 13.94507583970834 - type: nauc_mrr_at_3_diff1 value: 62.30831315577075 - type: nauc_mrr_at_3_max value: 47.33980193586779 - type: nauc_mrr_at_3_std value: 16.132624025733 - type: nauc_mrr_at_5_diff1 value: 63.079622378971834 - type: nauc_mrr_at_5_max value: 45.13424437707254 - type: nauc_mrr_at_5_std value: 13.730785051570013 - type: nauc_ndcg_at_1000_diff1 value: 62.97376441474187 - type: nauc_ndcg_at_1000_max value: 45.457846840130586 - type: nauc_ndcg_at_1000_std value: 14.17695491254452 - type: nauc_ndcg_at_100_diff1 value: 62.97376441474187 - type: nauc_ndcg_at_100_max value: 45.457846840130586 - type: nauc_ndcg_at_100_std value: 14.17695491254452 - type: nauc_ndcg_at_10_diff1 value: 62.97376441474187 - type: nauc_ndcg_at_10_max value: 45.457846840130586 - type: nauc_ndcg_at_10_std value: 14.17695491254452 - type: nauc_ndcg_at_1_diff1 value: 64.24122923028831 - type: nauc_ndcg_at_1_max value: 44.34077957053877 - type: nauc_ndcg_at_1_std value: 9.594344386466878 - type: nauc_ndcg_at_20_diff1 value: 62.97376441474187 - type: nauc_ndcg_at_20_max value: 45.457846840130586 - type: nauc_ndcg_at_20_std value: 14.17695491254452 - type: nauc_ndcg_at_3_diff1 value: 61.47043349797183 - type: nauc_ndcg_at_3_max value: 49.12165820225059 - type: nauc_ndcg_at_3_std value: 18.525396343409568 - type: nauc_ndcg_at_5_diff1 value: 
63.04022063936115 - type: nauc_ndcg_at_5_max value: 44.381937619091765 - type: nauc_ndcg_at_5_std value: 13.3263412698325 - type: nauc_precision_at_1000_diff1 value: .nan - type: nauc_precision_at_1000_max value: .nan - type: nauc_precision_at_1000_std value: .nan - type: nauc_precision_at_100_diff1 value: .nan - type: nauc_precision_at_100_max value: .nan - type: nauc_precision_at_100_std value: .nan - type: nauc_precision_at_10_diff1 value: 100.0 - type: nauc_precision_at_10_max value: 100.0 - type: nauc_precision_at_10_std value: 100.0 - type: nauc_precision_at_1_diff1 value: 64.24122923028831 - type: nauc_precision_at_1_max value: 44.34077957053877 - type: nauc_precision_at_1_std value: 9.594344386466878 - type: nauc_precision_at_20_diff1 value: 100.0 - type: nauc_precision_at_20_max value: 100.0 - type: nauc_precision_at_20_std value: 100.0 - type: nauc_precision_at_3_diff1 value: 56.27917833800158 - type: nauc_precision_at_3_max value: 60.51976346093969 - type: nauc_precision_at_3_std value: 33.02209772798002 - type: nauc_precision_at_5_diff1 value: 63.81886087768404 - type: nauc_precision_at_5_max value: 27.544351073763345 - type: nauc_precision_at_5_std value: -0.4668534080301362 - type: nauc_recall_at_1000_diff1 value: .nan - type: nauc_recall_at_1000_max value: .nan - type: nauc_recall_at_1000_std value: .nan - type: nauc_recall_at_100_diff1 value: .nan - type: nauc_recall_at_100_max value: .nan - type: nauc_recall_at_100_std value: .nan - type: nauc_recall_at_10_diff1 value: .nan - type: nauc_recall_at_10_max value: .nan - type: nauc_recall_at_10_std value: .nan - type: nauc_recall_at_1_diff1 value: 64.24122923028831 - type: nauc_recall_at_1_max value: 44.34077957053877 - type: nauc_recall_at_1_std value: 9.594344386466878 - type: nauc_recall_at_20_diff1 value: .nan - type: nauc_recall_at_20_max value: .nan - type: nauc_recall_at_20_std value: .nan - type: nauc_recall_at_3_diff1 value: 56.27917833800187 - type: nauc_recall_at_3_max value: 60.51976346094 - type: nauc_recall_at_3_std value: 33.022097727980125 - type: nauc_recall_at_5_diff1 value: 63.81886087768457 - type: nauc_recall_at_5_max value: 27.544351073763107 - type: nauc_recall_at_5_std value: -0.46685340803013775 - type: ndcg_at_1 value: 79.0 - type: ndcg_at_10 value: 90.36699999999999 - type: ndcg_at_100 value: 90.36699999999999 - type: ndcg_at_1000 value: 90.36699999999999 - type: ndcg_at_20 value: 90.36699999999999 - type: ndcg_at_3 value: 88.071 - type: ndcg_at_5 value: 89.75 - type: precision_at_1 value: 79.0 - type: precision_at_10 value: 10.0 - type: precision_at_100 value: 1.0 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 5.0 - type: precision_at_3 value: 31.333 - type: precision_at_5 value: 19.6 - type: recall_at_1 value: 79.0 - type: recall_at_10 value: 100.0 - type: recall_at_100 value: 100.0 - type: recall_at_1000 value: 100.0 - type: recall_at_20 value: 100.0 - type: recall_at_3 value: 94.0 - type: recall_at_5 value: 98.0 task: type: Retrieval - dataset: config: fra-fra name: MTEB XPQARetrieval (fr) revision: c99d599f0a6ab9b85b065da6f9d94f9cf731679f split: test type: jinaai/xpqa metrics: - type: main_score value: 77.425 - type: map_at_1 value: 46.749 - type: map_at_10 value: 72.108 - type: map_at_100 value: 73.32499999999999 - type: map_at_1000 value: 73.341 - type: map_at_20 value: 72.991 - type: map_at_3 value: 65.09 - type: map_at_5 value: 70.137 - type: mrr_at_1 value: 71.82910547396529 - type: mrr_at_10 value: 78.63357492529722 - type: mrr_at_100 value: 78.97374961354801 - type: 
mrr_at_1000 value: 78.97840549855806 - type: mrr_at_20 value: 78.86005025292395 - type: mrr_at_3 value: 77.28081886960389 - type: mrr_at_5 value: 78.0551846906987 - type: nauc_map_at_1000_diff1 value: 57.508397030020156 - type: nauc_map_at_1000_max value: 43.80251983780665 - type: nauc_map_at_1000_std value: -16.231491160419434 - type: nauc_map_at_100_diff1 value: 57.48614844875469 - type: nauc_map_at_100_max value: 43.797011627763055 - type: nauc_map_at_100_std value: -16.239303348969592 - type: nauc_map_at_10_diff1 value: 57.254064849553934 - type: nauc_map_at_10_max value: 42.765535577219026 - type: nauc_map_at_10_std value: -17.255606315997156 - type: nauc_map_at_1_diff1 value: 65.04324659040175 - type: nauc_map_at_1_max value: 17.852220653388855 - type: nauc_map_at_1_std value: -14.257753661018779 - type: nauc_map_at_20_diff1 value: 57.48367588324867 - type: nauc_map_at_20_max value: 43.680084254814425 - type: nauc_map_at_20_std value: -16.59381108810359 - type: nauc_map_at_3_diff1 value: 58.328817274958276 - type: nauc_map_at_3_max value: 34.603370607250675 - type: nauc_map_at_3_std value: -15.326569334165047 - type: nauc_map_at_5_diff1 value: 57.544271139796365 - type: nauc_map_at_5_max value: 41.58159814532708 - type: nauc_map_at_5_std value: -17.035562345654515 - type: nauc_mrr_at_1000_diff1 value: 67.23053035385993 - type: nauc_mrr_at_1000_max value: 53.982556981667095 - type: nauc_mrr_at_1000_std value: -12.015571062417035 - type: nauc_mrr_at_100_diff1 value: 67.23047293440347 - type: nauc_mrr_at_100_max value: 53.97931489747768 - type: nauc_mrr_at_100_std value: -12.026957248146365 - type: nauc_mrr_at_10_diff1 value: 67.25927907237941 - type: nauc_mrr_at_10_max value: 53.99647347811833 - type: nauc_mrr_at_10_std value: -12.356365137919108 - type: nauc_mrr_at_1_diff1 value: 67.80552098159194 - type: nauc_mrr_at_1_max value: 52.34740974885752 - type: nauc_mrr_at_1_std value: -9.009347371853096 - type: nauc_mrr_at_20_diff1 value: 67.22472566769486 - type: nauc_mrr_at_20_max value: 54.03480374123263 - type: nauc_mrr_at_20_std value: -12.129416933895373 - type: nauc_mrr_at_3_diff1 value: 66.86636026044627 - type: nauc_mrr_at_3_max value: 53.84675762408544 - type: nauc_mrr_at_3_std value: -12.318414220208327 - type: nauc_mrr_at_5_diff1 value: 67.16713697443882 - type: nauc_mrr_at_5_max value: 54.174275682276765 - type: nauc_mrr_at_5_std value: -12.382704200660772 - type: nauc_ndcg_at_1000_diff1 value: 60.076768803793875 - type: nauc_ndcg_at_1000_max value: 48.06880976583911 - type: nauc_ndcg_at_1000_std value: -14.8002468401513 - type: nauc_ndcg_at_100_diff1 value: 59.84195440900073 - type: nauc_ndcg_at_100_max value: 48.031759882567265 - type: nauc_ndcg_at_100_std value: -14.93671795434138 - type: nauc_ndcg_at_10_diff1 value: 59.091362656630984 - type: nauc_ndcg_at_10_max value: 45.902216798175296 - type: nauc_ndcg_at_10_std value: -18.225812204918686 - type: nauc_ndcg_at_1_diff1 value: 67.80552098159194 - type: nauc_ndcg_at_1_max value: 52.34740974885752 - type: nauc_ndcg_at_1_std value: -9.009347371853096 - type: nauc_ndcg_at_20_diff1 value: 59.80472569029982 - type: nauc_ndcg_at_20_max value: 47.92221974783734 - type: nauc_ndcg_at_20_std value: -16.589965314279805 - type: nauc_ndcg_at_3_diff1 value: 56.9195769675713 - type: nauc_ndcg_at_3_max value: 44.992740041222575 - type: nauc_ndcg_at_3_std value: -16.329730380555382 - type: nauc_ndcg_at_5_diff1 value: 59.31912266230594 - type: nauc_ndcg_at_5_max value: 44.75423089733974 - type: nauc_ndcg_at_5_std value: -17.744216780645583 - 
type: nauc_precision_at_1000_diff1 value: -30.976050318575094 - type: nauc_precision_at_1000_max value: 16.55619583017722 - type: nauc_precision_at_1000_std value: 10.549164466552044 - type: nauc_precision_at_100_diff1 value: -30.217028356940872 - type: nauc_precision_at_100_max value: 17.709049202840184 - type: nauc_precision_at_100_std value: 10.04190905252673 - type: nauc_precision_at_10_diff1 value: -19.588612396735584 - type: nauc_precision_at_10_max value: 23.97095583735318 - type: nauc_precision_at_10_std value: 1.3308819095790259 - type: nauc_precision_at_1_diff1 value: 67.80552098159194 - type: nauc_precision_at_1_max value: 52.34740974885752 - type: nauc_precision_at_1_std value: -9.009347371853096 - type: nauc_precision_at_20_diff1 value: -24.56372903999468 - type: nauc_precision_at_20_max value: 21.970766470092478 - type: nauc_precision_at_20_std value: 5.690019568793079 - type: nauc_precision_at_3_diff1 value: -5.293993834675436 - type: nauc_precision_at_3_max value: 33.48037221970611 - type: nauc_precision_at_3_std value: -0.9905029996040207 - type: nauc_precision_at_5_diff1 value: -12.477204961113433 - type: nauc_precision_at_5_max value: 28.41320824321574 - type: nauc_precision_at_5_std value: -0.25510168506666026 - type: nauc_recall_at_1000_diff1 value: 63.80720019823024 - type: nauc_recall_at_1000_max value: 100.0 - type: nauc_recall_at_1000_std value: 100.0 - type: nauc_recall_at_100_diff1 value: 45.99503772001805 - type: nauc_recall_at_100_max value: 53.62256247578381 - type: nauc_recall_at_100_std value: -2.1521605315502126 - type: nauc_recall_at_10_diff1 value: 51.49183566173087 - type: nauc_recall_at_10_max value: 39.94460610694432 - type: nauc_recall_at_10_std value: -27.417226994058534 - type: nauc_recall_at_1_diff1 value: 65.04324659040175 - type: nauc_recall_at_1_max value: 17.852220653388855 - type: nauc_recall_at_1_std value: -14.257753661018779 - type: nauc_recall_at_20_diff1 value: 53.65987970751146 - type: nauc_recall_at_20_max value: 48.20536243702891 - type: nauc_recall_at_20_std value: -24.77784527777353 - type: nauc_recall_at_3_diff1 value: 53.27794448209969 - type: nauc_recall_at_3_max value: 30.304767840963283 - type: nauc_recall_at_3_std value: -19.099603261339936 - type: nauc_recall_at_5_diff1 value: 53.77383683020561 - type: nauc_recall_at_5_max value: 39.58616026474047 - type: nauc_recall_at_5_std value: -23.255086482736036 - type: ndcg_at_1 value: 71.829 - type: ndcg_at_10 value: 77.425 - type: ndcg_at_100 value: 80.88 - type: ndcg_at_1000 value: 81.128 - type: ndcg_at_20 value: 79.403 - type: ndcg_at_3 value: 72.89 - type: ndcg_at_5 value: 74.521 - type: precision_at_1 value: 71.829 - type: precision_at_10 value: 17.596999999999998 - type: precision_at_100 value: 2.033 - type: precision_at_1000 value: 0.207 - type: precision_at_20 value: 9.513 - type: precision_at_3 value: 44.192 - type: precision_at_5 value: 31.776 - type: recall_at_1 value: 46.749 - type: recall_at_10 value: 85.49799999999999 - type: recall_at_100 value: 98.17099999999999 - type: recall_at_1000 value: 99.733 - type: recall_at_20 value: 91.70700000000001 - type: recall_at_3 value: 70.309 - type: recall_at_5 value: 78.507 task: type: Retrieval - dataset: config: default name: MTEB AllegroReviews revision: b89853e6de927b0e3bfa8ecc0e56fe4e02ceafc6 split: test type: PL-MTEB/allegro-reviews metrics: - type: accuracy value: 65.0 - type: f1 value: 58.85888258599016 - type: f1_weighted value: 65.99554726292321 - type: main_score value: 65.0 task: type: Classification - dataset: config: 
default name: MTEB ArguAna-PL revision: 63fc86750af76253e8c760fc9e534bbf24d260a2 split: test type: clarin-knext/arguana-pl metrics: - type: main_score value: 59.71300000000001 - type: map_at_1 value: 35.135 - type: map_at_10 value: 51.092000000000006 - type: map_at_100 value: 51.773 - type: map_at_1000 value: 51.776999999999994 - type: map_at_20 value: 51.665000000000006 - type: map_at_3 value: 46.574 - type: map_at_5 value: 49.032 - type: mrr_at_1 value: 36.201991465149355 - type: mrr_at_10 value: 51.546405427984475 - type: mrr_at_100 value: 52.202374673015285 - type: mrr_at_1000 value: 52.20610086068531 - type: mrr_at_20 value: 52.096805353180756 - type: mrr_at_3 value: 47.01280227596022 - type: mrr_at_5 value: 49.49146514935999 - type: nauc_map_at_1000_diff1 value: 19.758403663654388 - type: nauc_map_at_1000_max value: 1.9211716901459552 - type: nauc_map_at_1000_std value: -12.391775130617594 - type: nauc_map_at_100_diff1 value: 19.75801012476506 - type: nauc_map_at_100_max value: 1.927233271789035 - type: nauc_map_at_100_std value: -12.390686358565384 - type: nauc_map_at_10_diff1 value: 19.618023487744257 - type: nauc_map_at_10_max value: 1.948823709088292 - type: nauc_map_at_10_std value: -12.590649627823774 - type: nauc_map_at_1_diff1 value: 22.704520355653777 - type: nauc_map_at_1_max value: -0.7340073588952427 - type: nauc_map_at_1_std value: -11.685082615631233 - type: nauc_map_at_20_diff1 value: 19.710150386755245 - type: nauc_map_at_20_max value: 1.9579689185617946 - type: nauc_map_at_20_std value: -12.454848473878485 - type: nauc_map_at_3_diff1 value: 19.88571571635227 - type: nauc_map_at_3_max value: 2.2089391275055754 - type: nauc_map_at_3_std value: -12.152625563551476 - type: nauc_map_at_5_diff1 value: 19.345423817148774 - type: nauc_map_at_5_max value: 2.4471831202433783 - type: nauc_map_at_5_std value: -11.60532301686549 - type: nauc_mrr_at_1000_diff1 value: 16.90786453167799 - type: nauc_mrr_at_1000_max value: 0.65578323377857 - type: nauc_mrr_at_1000_std value: -12.395929715413015 - type: nauc_mrr_at_100_diff1 value: 16.90781127619206 - type: nauc_mrr_at_100_max value: 0.6619900297824423 - type: nauc_mrr_at_100_std value: -12.394826789608906 - type: nauc_mrr_at_10_diff1 value: 16.785894192163838 - type: nauc_mrr_at_10_max value: 0.7096666849274212 - type: nauc_mrr_at_10_std value: -12.592883550594735 - type: nauc_mrr_at_1_diff1 value: 19.59282927806732 - type: nauc_mrr_at_1_max value: -1.1271716729359413 - type: nauc_mrr_at_1_std value: -11.710668880297517 - type: nauc_mrr_at_20_diff1 value: 16.86673477981559 - type: nauc_mrr_at_20_max value: 0.6897167399764257 - type: nauc_mrr_at_20_std value: -12.464631471378414 - type: nauc_mrr_at_3_diff1 value: 17.0481261621288 - type: nauc_mrr_at_3_max value: 0.7183007174016199 - type: nauc_mrr_at_3_std value: -12.329335728574527 - type: nauc_mrr_at_5_diff1 value: 16.698916629443854 - type: nauc_mrr_at_5_max value: 1.2515514207224299 - type: nauc_mrr_at_5_std value: -11.662599392805308 - type: nauc_ndcg_at_1000_diff1 value: 19.30605856078901 - type: nauc_ndcg_at_1000_max value: 2.3402231520806835 - type: nauc_ndcg_at_1000_std value: -12.370409989770332 - type: nauc_ndcg_at_100_diff1 value: 19.31155460872256 - type: nauc_ndcg_at_100_max value: 2.510633162779702 - type: nauc_ndcg_at_100_std value: -12.313796276064673 - type: nauc_ndcg_at_10_diff1 value: 18.511651466450843 - type: nauc_ndcg_at_10_max value: 2.6756675185155263 - type: nauc_ndcg_at_10_std value: -13.573610085360095 - type: nauc_ndcg_at_1_diff1 value: 22.704520355653777 
- type: nauc_ndcg_at_1_max value: -0.7340073588952427 - type: nauc_ndcg_at_1_std value: -11.685082615631233 - type: nauc_ndcg_at_20_diff1 value: 19.01305812933961 - type: nauc_ndcg_at_20_max value: 2.777977280012548 - type: nauc_ndcg_at_20_std value: -12.959515013552128 - type: nauc_ndcg_at_3_diff1 value: 19.15053976740578 - type: nauc_ndcg_at_3_max value: 3.2587972262385496 - type: nauc_ndcg_at_3_std value: -12.105808757691328 - type: nauc_ndcg_at_5_diff1 value: 18.010082675090597 - type: nauc_ndcg_at_5_max value: 3.753876824229378 - type: nauc_ndcg_at_5_std value: -11.044202434548701 - type: nauc_precision_at_1000_diff1 value: -11.75783343822487 - type: nauc_precision_at_1000_max value: 5.7856460776313465 - type: nauc_precision_at_1000_std value: 62.79171280927037 - type: nauc_precision_at_100_diff1 value: 9.08527555500537 - type: nauc_precision_at_100_max value: 36.16754653078746 - type: nauc_precision_at_100_std value: 28.37969482833522 - type: nauc_precision_at_10_diff1 value: 10.685081888632977 - type: nauc_precision_at_10_max value: 7.185779514361452 - type: nauc_precision_at_10_std value: -22.209758078034394 - type: nauc_precision_at_1_diff1 value: 22.704520355653777 - type: nauc_precision_at_1_max value: -0.7340073588952427 - type: nauc_precision_at_1_std value: -11.685082615631233 - type: nauc_precision_at_20_diff1 value: 10.0745772945806 - type: nauc_precision_at_20_max value: 16.81469938479116 - type: nauc_precision_at_20_std value: -22.804277740935298 - type: nauc_precision_at_3_diff1 value: 16.900587067301714 - type: nauc_precision_at_3_max value: 6.595958907337978 - type: nauc_precision_at_3_std value: -11.888316132805594 - type: nauc_precision_at_5_diff1 value: 12.771428972972895 - type: nauc_precision_at_5_max value: 8.79201485711544 - type: nauc_precision_at_5_std value: -8.609881800940762 - type: nauc_recall_at_1000_diff1 value: -11.757833438225305 - type: nauc_recall_at_1000_max value: 5.785646077628613 - type: nauc_recall_at_1000_std value: 62.791712809264176 - type: nauc_recall_at_100_diff1 value: 9.085275555005722 - type: nauc_recall_at_100_max value: 36.167546530787995 - type: nauc_recall_at_100_std value: 28.37969482833511 - type: nauc_recall_at_10_diff1 value: 10.68508188863288 - type: nauc_recall_at_10_max value: 7.185779514361484 - type: nauc_recall_at_10_std value: -22.209758078034465 - type: nauc_recall_at_1_diff1 value: 22.704520355653777 - type: nauc_recall_at_1_max value: -0.7340073588952427 - type: nauc_recall_at_1_std value: -11.685082615631233 - type: nauc_recall_at_20_diff1 value: 10.074577294581067 - type: nauc_recall_at_20_max value: 16.814699384791545 - type: nauc_recall_at_20_std value: -22.80427774093497 - type: nauc_recall_at_3_diff1 value: 16.900587067301768 - type: nauc_recall_at_3_max value: 6.595958907337955 - type: nauc_recall_at_3_std value: -11.888316132805613 - type: nauc_recall_at_5_diff1 value: 12.77142897297289 - type: nauc_recall_at_5_max value: 8.792014857115413 - type: nauc_recall_at_5_std value: -8.609881800940697 - type: ndcg_at_1 value: 35.135 - type: ndcg_at_10 value: 59.71300000000001 - type: ndcg_at_100 value: 62.5 - type: ndcg_at_1000 value: 62.578 - type: ndcg_at_20 value: 61.775000000000006 - type: ndcg_at_3 value: 50.336999999999996 - type: ndcg_at_5 value: 54.748 - type: precision_at_1 value: 35.135 - type: precision_at_10 value: 8.72 - type: precision_at_100 value: 0.991 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 4.765 - type: precision_at_3 value: 20.413 - type: precision_at_5 value: 14.381 - 
type: recall_at_1 value: 35.135 - type: recall_at_10 value: 87.198 - type: recall_at_100 value: 99.075 - type: recall_at_1000 value: 99.644 - type: recall_at_20 value: 95.306 - type: recall_at_3 value: 61.23800000000001 - type: recall_at_5 value: 71.906 task: type: Retrieval - dataset: config: default name: MTEB CBD revision: 36ddb419bcffe6a5374c3891957912892916f28d split: test type: PL-MTEB/cbd metrics: - type: accuracy value: 84.13000000000001 - type: ap value: 38.21674564144456 - type: ap_weighted value: 38.21674564144456 - type: f1 value: 73.58128735002478 - type: f1_weighted value: 85.75596717538494 - type: main_score value: 84.13000000000001 task: type: Classification - dataset: config: default name: MTEB CDSC-E revision: 0a3d4aa409b22f80eb22cbf59b492637637b536d split: test type: PL-MTEB/cdsce-pairclassification metrics: - type: cosine_accuracy value: 89.0 - type: cosine_accuracy_threshold value: 95.30268088769837 - type: cosine_ap value: 78.23422403821777 - type: cosine_f1 value: 69.23076923076923 - type: cosine_f1_threshold value: 87.1877340095262 - type: cosine_precision value: 67.5 - type: cosine_recall value: 71.05263157894737 - type: dot_accuracy value: 88.3 - type: dot_accuracy_threshold value: 2472000.0 - type: dot_ap value: 74.26705897704197 - type: dot_f1 value: 66.49874055415617 - type: dot_f1_threshold value: 2316800.0 - type: dot_precision value: 63.76811594202898 - type: dot_recall value: 69.47368421052632 - type: euclidean_accuracy value: 89.2 - type: euclidean_accuracy_threshold value: 6878.705188647788 - type: euclidean_ap value: 78.51718555534579 - type: euclidean_f1 value: 69.54314720812182 - type: euclidean_f1_threshold value: 8323.035838252725 - type: euclidean_precision value: 67.15686274509804 - type: euclidean_recall value: 72.10526315789474 - type: main_score value: 78.51718555534579 - type: manhattan_accuracy value: 89.2 - type: manhattan_accuracy_threshold value: 326812.48528957367 - type: manhattan_ap value: 78.50895632545628 - type: manhattan_f1 value: 69.84924623115577 - type: manhattan_f1_threshold value: 398102.616417408 - type: manhattan_precision value: 66.82692307692307 - type: manhattan_recall value: 73.15789473684211 - type: max_ap value: 78.51718555534579 - type: max_f1 value: 69.84924623115577 - type: max_precision value: 67.5 - type: max_recall value: 73.15789473684211 - type: similarity_accuracy value: 89.0 - type: similarity_accuracy_threshold value: 95.30268088769837 - type: similarity_ap value: 78.23422403821777 - type: similarity_f1 value: 69.23076923076923 - type: similarity_f1_threshold value: 87.1877340095262 - type: similarity_precision value: 67.5 - type: similarity_recall value: 71.05263157894737 task: type: PairClassification - dataset: config: default name: MTEB CDSC-R revision: 1cd6abbb00df7d14be3dbd76a7dcc64b3a79a7cd split: test type: PL-MTEB/cdscr-sts metrics: - type: cosine_pearson value: 91.04238667979497 - type: cosine_spearman value: 90.96758456402505 - type: euclidean_pearson value: 88.88396869759062 - type: euclidean_spearman value: 90.80235709678217 - type: main_score value: 90.96758456402505 - type: manhattan_pearson value: 88.91331977492183 - type: manhattan_spearman value: 90.82823486754444 - type: pearson value: 91.04238667979497 - type: spearman value: 90.96758456402505 task: type: STS - dataset: config: default name: MTEB DBPedia-PL revision: 76afe41d9af165cc40999fcaa92312b8b012064a split: test type: clarin-knext/dbpedia-pl metrics: - type: main_score value: 43.189 - type: map_at_1 value: 8.838 - type: map_at_10 
value: 20.335 - type: map_at_100 value: 29.818 - type: map_at_1000 value: 31.672 - type: map_at_20 value: 24.037 - type: map_at_3 value: 14.144000000000002 - type: map_at_5 value: 16.674 - type: mrr_at_1 value: 66.25 - type: mrr_at_10 value: 74.51428571428573 - type: mrr_at_100 value: 74.85025528596333 - type: mrr_at_1000 value: 74.861579760375 - type: mrr_at_20 value: 74.75227906231197 - type: mrr_at_3 value: 73.25 - type: mrr_at_5 value: 73.825 - type: nauc_map_at_1000_diff1 value: 25.397956304548963 - type: nauc_map_at_1000_max value: 34.60045634629073 - type: nauc_map_at_1000_std value: 25.484338507029523 - type: nauc_map_at_100_diff1 value: 26.732402811074362 - type: nauc_map_at_100_max value: 33.16273154550298 - type: nauc_map_at_100_std value: 22.705558316419694 - type: nauc_map_at_10_diff1 value: 31.048350740517666 - type: nauc_map_at_10_max value: 20.58247280790142 - type: nauc_map_at_10_std value: -0.3057740988996755 - type: nauc_map_at_1_diff1 value: 37.44384898753489 - type: nauc_map_at_1_max value: 2.009066872007797 - type: nauc_map_at_1_std value: -18.38972044447374 - type: nauc_map_at_20_diff1 value: 29.145950023489974 - type: nauc_map_at_20_max value: 25.337239700245075 - type: nauc_map_at_20_std value: 7.680343084384305 - type: nauc_map_at_3_diff1 value: 32.41886776815376 - type: nauc_map_at_3_max value: 8.976460728750666 - type: nauc_map_at_3_std value: -14.206927116348458 - type: nauc_map_at_5_diff1 value: 31.316919153957873 - type: nauc_map_at_5_max value: 14.015365438005226 - type: nauc_map_at_5_std value: -8.909007562143335 - type: nauc_mrr_at_1000_diff1 value: 42.77521158292109 - type: nauc_mrr_at_1000_max value: 58.03733674934908 - type: nauc_mrr_at_1000_std value: 42.65118460573791 - type: nauc_mrr_at_100_diff1 value: 42.76917109803571 - type: nauc_mrr_at_100_max value: 58.04747433083853 - type: nauc_mrr_at_100_std value: 42.65151388365855 - type: nauc_mrr_at_10_diff1 value: 42.4992726119988 - type: nauc_mrr_at_10_max value: 58.157080658302974 - type: nauc_mrr_at_10_std value: 42.98778606676595 - type: nauc_mrr_at_1_diff1 value: 46.67764597969527 - type: nauc_mrr_at_1_max value: 54.52896662427813 - type: nauc_mrr_at_1_std value: 35.71181387979735 - type: nauc_mrr_at_20_diff1 value: 42.79101300218034 - type: nauc_mrr_at_20_max value: 58.05679669975563 - type: nauc_mrr_at_20_std value: 42.72288886007032 - type: nauc_mrr_at_3_diff1 value: 41.85440967628899 - type: nauc_mrr_at_3_max value: 57.975577899726126 - type: nauc_mrr_at_3_std value: 43.523432037784985 - type: nauc_mrr_at_5_diff1 value: 42.3041465494315 - type: nauc_mrr_at_5_max value: 58.54530113479029 - type: nauc_mrr_at_5_std value: 43.2944834223015 - type: nauc_ndcg_at_1000_diff1 value: 32.16216922989725 - type: nauc_ndcg_at_1000_max value: 50.03467332768009 - type: nauc_ndcg_at_1000_std value: 42.87877265207483 - type: nauc_ndcg_at_100_diff1 value: 33.55193527551313 - type: nauc_ndcg_at_100_max value: 45.12048953873363 - type: nauc_ndcg_at_100_std value: 34.788021436199024 - type: nauc_ndcg_at_10_diff1 value: 31.14168233882658 - type: nauc_ndcg_at_10_max value: 45.31079148382448 - type: nauc_ndcg_at_10_std value: 28.555214349385466 - type: nauc_ndcg_at_1_diff1 value: 45.12481069889602 - type: nauc_ndcg_at_1_max value: 45.93377570654117 - type: nauc_ndcg_at_1_std value: 26.672617000885186 - type: nauc_ndcg_at_20_diff1 value: 31.81216979830056 - type: nauc_ndcg_at_20_max value: 41.93464767693644 - type: nauc_ndcg_at_20_std value: 26.08707327004535 - type: nauc_ndcg_at_3_diff1 value: 29.90627202771331 - type: 
nauc_ndcg_at_3_max value: 46.50414958925517 - type: nauc_ndcg_at_3_std value: 29.66009841753563 - type: nauc_ndcg_at_5_diff1 value: 29.08122779713697 - type: nauc_ndcg_at_5_max value: 46.81499760516951 - type: nauc_ndcg_at_5_std value: 29.935930977468267 - type: nauc_precision_at_1000_diff1 value: -18.71150014402453 - type: nauc_precision_at_1000_max value: -0.9220395765472844 - type: nauc_precision_at_1000_std value: 7.219897945975822 - type: nauc_precision_at_100_diff1 value: -8.609528664023014 - type: nauc_precision_at_100_max value: 29.147048677242864 - type: nauc_precision_at_100_std value: 44.958041507680036 - type: nauc_precision_at_10_diff1 value: 2.8689201908213477 - type: nauc_precision_at_10_max value: 44.40893361361308 - type: nauc_precision_at_10_std value: 47.18569807586499 - type: nauc_precision_at_1_diff1 value: 46.01228536231763 - type: nauc_precision_at_1_max value: 54.30280987857099 - type: nauc_precision_at_1_std value: 36.923128493492776 - type: nauc_precision_at_20_diff1 value: -1.9783515948740122 - type: nauc_precision_at_20_max value: 38.42066921295958 - type: nauc_precision_at_20_std value: 47.41935674153161 - type: nauc_precision_at_3_diff1 value: 9.877584475384026 - type: nauc_precision_at_3_max value: 44.77006526403546 - type: nauc_precision_at_3_std value: 39.51299545977156 - type: nauc_precision_at_5_diff1 value: 5.096217475317008 - type: nauc_precision_at_5_max value: 45.66716959157208 - type: nauc_precision_at_5_std value: 42.651208343259505 - type: nauc_recall_at_1000_diff1 value: 25.395292649442965 - type: nauc_recall_at_1000_max value: 44.94193476114992 - type: nauc_recall_at_1000_std value: 53.58345238223027 - type: nauc_recall_at_100_diff1 value: 23.962022146293293 - type: nauc_recall_at_100_max value: 32.15140842028602 - type: nauc_recall_at_100_std value: 30.57126984952762 - type: nauc_recall_at_10_diff1 value: 28.120539807446004 - type: nauc_recall_at_10_max value: 18.154834280193572 - type: nauc_recall_at_10_std value: -0.6032386653260938 - type: nauc_recall_at_1_diff1 value: 37.44384898753489 - type: nauc_recall_at_1_max value: 2.009066872007797 - type: nauc_recall_at_1_std value: -18.38972044447374 - type: nauc_recall_at_20_diff1 value: 23.438945970294554 - type: nauc_recall_at_20_max value: 17.201259624644326 - type: nauc_recall_at_20_std value: 3.75587033487961 - type: nauc_recall_at_3_diff1 value: 29.867460507200587 - type: nauc_recall_at_3_max value: 8.066960542463528 - type: nauc_recall_at_3_std value: -15.13440571172203 - type: nauc_recall_at_5_diff1 value: 28.657118879661887 - type: nauc_recall_at_5_max value: 12.942552735963842 - type: nauc_recall_at_5_std value: -9.57735672972808 - type: ndcg_at_1 value: 54.50000000000001 - type: ndcg_at_10 value: 43.189 - type: ndcg_at_100 value: 48.595 - type: ndcg_at_1000 value: 55.681000000000004 - type: ndcg_at_20 value: 43.09 - type: ndcg_at_3 value: 47.599000000000004 - type: ndcg_at_5 value: 44.907000000000004 - type: precision_at_1 value: 66.5 - type: precision_at_10 value: 35.725 - type: precision_at_100 value: 11.583 - type: precision_at_1000 value: 2.302 - type: precision_at_20 value: 27.375 - type: precision_at_3 value: 52.0 - type: precision_at_5 value: 44.7 - type: recall_at_1 value: 8.838 - type: recall_at_10 value: 25.424999999999997 - type: recall_at_100 value: 55.632000000000005 - type: recall_at_1000 value: 77.857 - type: recall_at_20 value: 34.458 - type: recall_at_3 value: 15.229999999999999 - type: recall_at_5 value: 18.872 task: type: Retrieval - dataset: config: default name: MTEB 
8TagsClustering revision: None split: test type: PL-MTEB/8tags-clustering metrics: - type: main_score value: 50.28804848851286 - type: v_measure value: 50.28804848851286 - type: v_measure_std value: 2.9879120747919505 task: type: Clustering - dataset: config: default name: MTEB FiQA-PL revision: 2e535829717f8bf9dc829b7f911cc5bbd4e6608e split: test type: clarin-knext/fiqa-pl metrics: - type: main_score value: 46.121 - type: map_at_1 value: 24.027 - type: map_at_10 value: 38.14 - type: map_at_100 value: 40.092 - type: map_at_1000 value: 40.266000000000005 - type: map_at_20 value: 39.195 - type: map_at_3 value: 33.415 - type: map_at_5 value: 36.115 - type: mrr_at_1 value: 46.60493827160494 - type: mrr_at_10 value: 54.70305457573974 - type: mrr_at_100 value: 55.355642920233414 - type: mrr_at_1000 value: 55.3908291424442 - type: mrr_at_20 value: 55.00793641725012 - type: mrr_at_3 value: 52.3148148148148 - type: mrr_at_5 value: 53.54166666666664 - type: nauc_map_at_1000_diff1 value: 37.73510043188139 - type: nauc_map_at_1000_max value: 28.32920495001755 - type: nauc_map_at_1000_std value: 2.1388839190211293 - type: nauc_map_at_100_diff1 value: 37.670108404247685 - type: nauc_map_at_100_max value: 28.227406812543826 - type: nauc_map_at_100_std value: 2.120931632442644 - type: nauc_map_at_10_diff1 value: 37.465256098544174 - type: nauc_map_at_10_max value: 27.091226456549666 - type: nauc_map_at_10_std value: 1.1173775566235409 - type: nauc_map_at_1_diff1 value: 41.23855326212752 - type: nauc_map_at_1_max value: 21.290748552864557 - type: nauc_map_at_1_std value: -0.8385928448565472 - type: nauc_map_at_20_diff1 value: 37.47054494805535 - type: nauc_map_at_20_max value: 27.729045702955386 - type: nauc_map_at_20_std value: 1.7216485460777051 - type: nauc_map_at_3_diff1 value: 37.262641031829105 - type: nauc_map_at_3_max value: 23.89124216989901 - type: nauc_map_at_3_std value: -0.14736489529369678 - type: nauc_map_at_5_diff1 value: 37.054030521972926 - type: nauc_map_at_5_max value: 25.37485175729055 - type: nauc_map_at_5_std value: 0.1603899014557275 - type: nauc_mrr_at_1000_diff1 value: 45.74249029214392 - type: nauc_mrr_at_1000_max value: 36.07619933100338 - type: nauc_mrr_at_1000_std value: 4.393752835100674 - type: nauc_mrr_at_100_diff1 value: 45.72338919745602 - type: nauc_mrr_at_100_max value: 36.07500193737586 - type: nauc_mrr_at_100_std value: 4.415904610787372 - type: nauc_mrr_at_10_diff1 value: 45.712821401955814 - type: nauc_mrr_at_10_max value: 36.077633940467855 - type: nauc_mrr_at_10_std value: 4.31515612100577 - type: nauc_mrr_at_1_diff1 value: 48.95197646135339 - type: nauc_mrr_at_1_max value: 37.627960253727124 - type: nauc_mrr_at_1_std value: 4.355410396712492 - type: nauc_mrr_at_20_diff1 value: 45.657031672968316 - type: nauc_mrr_at_20_max value: 36.02034080808377 - type: nauc_mrr_at_20_std value: 4.291569107759258 - type: nauc_mrr_at_3_diff1 value: 46.14016248486381 - type: nauc_mrr_at_3_max value: 35.096997959937816 - type: nauc_mrr_at_3_std value: 3.473234729162835 - type: nauc_mrr_at_5_diff1 value: 46.044456362138746 - type: nauc_mrr_at_5_max value: 35.54259698630834 - type: nauc_mrr_at_5_std value: 3.242035621890524 - type: nauc_ndcg_at_1000_diff1 value: 39.37342092420808 - type: nauc_ndcg_at_1000_max value: 32.34854163612446 - type: nauc_ndcg_at_1000_std value: 4.9764682793258865 - type: nauc_ndcg_at_100_diff1 value: 38.396532780365966 - type: nauc_ndcg_at_100_max value: 31.427345966345072 - type: nauc_ndcg_at_100_std value: 5.436384757156155 - type: nauc_ndcg_at_10_diff1 
value: 38.33852883060773 - type: nauc_ndcg_at_10_max value: 29.405844267873825 - type: nauc_ndcg_at_10_std value: 2.9724473995284453 - type: nauc_ndcg_at_1_diff1 value: 49.360894087944914 - type: nauc_ndcg_at_1_max value: 37.10711812240423 - type: nauc_ndcg_at_1_std value: 3.8523559329866988 - type: nauc_ndcg_at_20_diff1 value: 38.050204646363945 - type: nauc_ndcg_at_20_max value: 29.935603389108866 - type: nauc_ndcg_at_20_std value: 3.779925764680313 - type: nauc_ndcg_at_3_diff1 value: 39.4668764835337 - type: nauc_ndcg_at_3_max value: 30.65976708125836 - type: nauc_ndcg_at_3_std value: 1.2337033504877237 - type: nauc_ndcg_at_5_diff1 value: 38.86503445443355 - type: nauc_ndcg_at_5_max value: 29.0023578220992 - type: nauc_ndcg_at_5_std value: 0.8206100069462643 - type: nauc_precision_at_1000_diff1 value: 5.84775168273073 - type: nauc_precision_at_1000_max value: 27.58660371315182 - type: nauc_precision_at_1000_std value: 9.028324162807364 - type: nauc_precision_at_100_diff1 value: 10.655637431827838 - type: nauc_precision_at_100_max value: 32.11889757111383 - type: nauc_precision_at_100_std value: 13.051376462007925 - type: nauc_precision_at_10_diff1 value: 20.55227291550576 - type: nauc_precision_at_10_max value: 34.48969436232284 - type: nauc_precision_at_10_std value: 7.57890876950882 - type: nauc_precision_at_1_diff1 value: 49.360894087944914 - type: nauc_precision_at_1_max value: 37.10711812240423 - type: nauc_precision_at_1_std value: 3.8523559329866988 - type: nauc_precision_at_20_diff1 value: 16.62880025315897 - type: nauc_precision_at_20_max value: 34.15703662717139 - type: nauc_precision_at_20_std value: 10.909431920732883 - type: nauc_precision_at_3_diff1 value: 28.04332082306772 - type: nauc_precision_at_3_max value: 31.009374202971753 - type: nauc_precision_at_3_std value: 2.307756409916575 - type: nauc_precision_at_5_diff1 value: 24.824270715808705 - type: nauc_precision_at_5_max value: 31.644036540931886 - type: nauc_precision_at_5_std value: 2.958068954639614 - type: nauc_recall_at_1000_diff1 value: 23.79234063489045 - type: nauc_recall_at_1000_max value: 26.76365425679858 - type: nauc_recall_at_1000_std value: 23.815318997671913 - type: nauc_recall_at_100_diff1 value: 22.399781833514737 - type: nauc_recall_at_100_max value: 23.192360958839174 - type: nauc_recall_at_100_std value: 15.984687692762742 - type: nauc_recall_at_10_diff1 value: 28.512649044683837 - type: nauc_recall_at_10_max value: 22.77819651497193 - type: nauc_recall_at_10_std value: 4.646633382718951 - type: nauc_recall_at_1_diff1 value: 41.23855326212752 - type: nauc_recall_at_1_max value: 21.290748552864557 - type: nauc_recall_at_1_std value: -0.8385928448565472 - type: nauc_recall_at_20_diff1 value: 26.797853661700632 - type: nauc_recall_at_20_max value: 21.9956231017133 - type: nauc_recall_at_20_std value: 5.664775183514371 - type: nauc_recall_at_3_diff1 value: 31.42511076281081 - type: nauc_recall_at_3_max value: 19.459398184547652 - type: nauc_recall_at_3_std value: -0.8592886454260257 - type: nauc_recall_at_5_diff1 value: 29.62950699804912 - type: nauc_recall_at_5_max value: 19.941323519486684 - type: nauc_recall_at_5_std value: -0.45387351120880465 - type: ndcg_at_1 value: 46.451 - type: ndcg_at_10 value: 46.121 - type: ndcg_at_100 value: 52.830999999999996 - type: ndcg_at_1000 value: 55.557 - type: ndcg_at_20 value: 48.535000000000004 - type: ndcg_at_3 value: 42.178 - type: ndcg_at_5 value: 43.406 - type: precision_at_1 value: 46.451 - type: precision_at_10 value: 12.562000000000001 - type: 
precision_at_100 value: 1.963 - type: precision_at_1000 value: 0.244 - type: precision_at_20 value: 7.392 - type: precision_at_3 value: 27.572000000000003 - type: precision_at_5 value: 20.031 - type: recall_at_1 value: 24.027 - type: recall_at_10 value: 52.61900000000001 - type: recall_at_100 value: 77.491 - type: recall_at_1000 value: 93.55 - type: recall_at_20 value: 59.745000000000005 - type: recall_at_3 value: 37.765 - type: recall_at_5 value: 44.304 task: type: Retrieval - dataset: config: default name: MTEB HotpotQA-PL revision: a0bd479ac97b4ccb5bd6ce320c415d0bb4beb907 split: test type: clarin-knext/hotpotqa-pl metrics: - type: main_score value: 77.02799999999999 - type: map_at_1 value: 41.249 - type: map_at_10 value: 69.512 - type: map_at_100 value: 70.291 - type: map_at_1000 value: 70.334 - type: map_at_20 value: 69.992 - type: map_at_3 value: 65.751 - type: map_at_5 value: 68.161 - type: mrr_at_1 value: 82.4983119513842 - type: mrr_at_10 value: 87.71202426502866 - type: mrr_at_100 value: 87.84265780907221 - type: mrr_at_1000 value: 87.8455843626266 - type: mrr_at_20 value: 87.80640011547308 - type: mrr_at_3 value: 86.94575737114536 - type: mrr_at_5 value: 87.46770200315063 - type: nauc_map_at_1000_diff1 value: 17.17119899625707 - type: nauc_map_at_1000_max value: 29.981569339485393 - type: nauc_map_at_1000_std value: 8.93659568948167 - type: nauc_map_at_100_diff1 value: 17.156175947340035 - type: nauc_map_at_100_max value: 29.988121004348194 - type: nauc_map_at_100_std value: 8.967947232110745 - type: nauc_map_at_10_diff1 value: 16.854416108818132 - type: nauc_map_at_10_max value: 29.784211249360194 - type: nauc_map_at_10_std value: 8.535227936720936 - type: nauc_map_at_1_diff1 value: 68.01294545515707 - type: nauc_map_at_1_max value: 47.51019900345037 - type: nauc_map_at_1_std value: -1.7951406243808212 - type: nauc_map_at_20_diff1 value: 16.993955459776572 - type: nauc_map_at_20_max value: 29.920806300647463 - type: nauc_map_at_20_std value: 8.873597327714583 - type: nauc_map_at_3_diff1 value: 16.16514623575243 - type: nauc_map_at_3_max value: 27.62371849413713 - type: nauc_map_at_3_std value: 5.131406130565191 - type: nauc_map_at_5_diff1 value: 16.507863832657364 - type: nauc_map_at_5_max value: 28.9019090072195 - type: nauc_map_at_5_std value: 7.2380930617814645 - type: nauc_mrr_at_1000_diff1 value: 66.74502991743417 - type: nauc_mrr_at_1000_max value: 50.29274140603486 - type: nauc_mrr_at_1000_std value: 1.602388931386098 - type: nauc_mrr_at_100_diff1 value: 66.7413605208101 - type: nauc_mrr_at_100_max value: 50.29720043419606 - type: nauc_mrr_at_100_std value: 1.612142495535232 - type: nauc_mrr_at_10_diff1 value: 66.71814591414376 - type: nauc_mrr_at_10_max value: 50.39851050116519 - type: nauc_mrr_at_10_std value: 1.7339878916186384 - type: nauc_mrr_at_1_diff1 value: 68.01294545515707 - type: nauc_mrr_at_1_max value: 47.627701029006225 - type: nauc_mrr_at_1_std value: -1.442043059079073 - type: nauc_mrr_at_20_diff1 value: 66.72944815863312 - type: nauc_mrr_at_20_max value: 50.325719646409716 - type: nauc_mrr_at_20_std value: 1.6584317196476688 - type: nauc_mrr_at_3_diff1 value: 66.29662294615758 - type: nauc_mrr_at_3_max value: 50.29363488669571 - type: nauc_mrr_at_3_std value: 1.1373012069481296 - type: nauc_mrr_at_5_diff1 value: 66.70959181668684 - type: nauc_mrr_at_5_max value: 50.42831108375743 - type: nauc_mrr_at_5_std value: 1.5492429855609648 - type: nauc_ndcg_at_1000_diff1 value: 24.337157353044912 - type: nauc_ndcg_at_1000_max value: 35.021784629126984 - type: 
nauc_ndcg_at_1000_std value: 11.976738067383161 - type: nauc_ndcg_at_100_diff1 value: 23.584427352691776 - type: nauc_ndcg_at_100_max value: 35.12304754035805 - type: nauc_ndcg_at_100_std value: 12.921291623167921 - type: nauc_ndcg_at_10_diff1 value: 22.057127915032765 - type: nauc_ndcg_at_10_max value: 34.09397142140321 - type: nauc_ndcg_at_10_std value: 11.21339882108658 - type: nauc_ndcg_at_1_diff1 value: 68.01294545515707 - type: nauc_ndcg_at_1_max value: 47.51019900345037 - type: nauc_ndcg_at_1_std value: -1.7951406243808212 - type: nauc_ndcg_at_20_diff1 value: 22.404347553479102 - type: nauc_ndcg_at_20_max value: 34.50508324969608 - type: nauc_ndcg_at_20_std value: 12.281993331498175 - type: nauc_ndcg_at_3_diff1 value: 21.21895220595676 - type: nauc_ndcg_at_3_max value: 30.76465236403928 - type: nauc_ndcg_at_3_std value: 5.501903724385424 - type: nauc_ndcg_at_5_diff1 value: 21.489825424548258 - type: nauc_ndcg_at_5_max value: 32.43517409935615 - type: nauc_ndcg_at_5_std value: 8.59021290966302 - type: nauc_precision_at_1000_diff1 value: 9.056916578488696 - type: nauc_precision_at_1000_max value: 47.29861770129213 - type: nauc_precision_at_1000_std value: 60.06028316961357 - type: nauc_precision_at_100_diff1 value: 6.853208191063939 - type: nauc_precision_at_100_max value: 40.23686318254916 - type: nauc_precision_at_100_std value: 44.69884156134862 - type: nauc_precision_at_10_diff1 value: 7.7572606953149315 - type: nauc_precision_at_10_max value: 33.24412509121427 - type: nauc_precision_at_10_std value: 22.894891705425753 - type: nauc_precision_at_1_diff1 value: 68.01294545515707 - type: nauc_precision_at_1_max value: 47.51019900345037 - type: nauc_precision_at_1_std value: -1.7951406243808212 - type: nauc_precision_at_20_diff1 value: 6.102789021481188 - type: nauc_precision_at_20_max value: 34.384739158981084 - type: nauc_precision_at_20_std value: 29.40165302735249 - type: nauc_precision_at_3_diff1 value: 10.004182813463276 - type: nauc_precision_at_3_max value: 27.07527926636925 - type: nauc_precision_at_3_std value: 8.034252288165805 - type: nauc_precision_at_5_diff1 value: 8.672082689816547 - type: nauc_precision_at_5_max value: 29.352582129843867 - type: nauc_precision_at_5_std value: 14.456464951944461 - type: nauc_recall_at_1000_diff1 value: 9.056916578488018 - type: nauc_recall_at_1000_max value: 47.29861770129215 - type: nauc_recall_at_1000_std value: 60.06028316961315 - type: nauc_recall_at_100_diff1 value: 6.853208191063934 - type: nauc_recall_at_100_max value: 40.23686318254888 - type: nauc_recall_at_100_std value: 44.698841561348615 - type: nauc_recall_at_10_diff1 value: 7.7572606953149394 - type: nauc_recall_at_10_max value: 33.244125091214286 - type: nauc_recall_at_10_std value: 22.894891705425863 - type: nauc_recall_at_1_diff1 value: 68.01294545515707 - type: nauc_recall_at_1_max value: 47.51019900345037 - type: nauc_recall_at_1_std value: -1.7951406243808212 - type: nauc_recall_at_20_diff1 value: 6.102789021481126 - type: nauc_recall_at_20_max value: 34.38473915898118 - type: nauc_recall_at_20_std value: 29.40165302735251 - type: nauc_recall_at_3_diff1 value: 10.004182813463203 - type: nauc_recall_at_3_max value: 27.07527926636916 - type: nauc_recall_at_3_std value: 8.034252288165728 - type: nauc_recall_at_5_diff1 value: 8.672082689816364 - type: nauc_recall_at_5_max value: 29.352582129843714 - type: nauc_recall_at_5_std value: 14.4564649519445 - type: ndcg_at_1 value: 82.498 - type: ndcg_at_10 value: 77.02799999999999 - type: ndcg_at_100 value: 79.593 - type: 
ndcg_at_1000 value: 80.372 - type: ndcg_at_20 value: 78.194 - type: ndcg_at_3 value: 71.932 - type: ndcg_at_5 value: 74.878 - type: precision_at_1 value: 82.498 - type: precision_at_10 value: 16.289 - type: precision_at_100 value: 1.8259999999999998 - type: precision_at_1000 value: 0.193 - type: precision_at_20 value: 8.519 - type: precision_at_3 value: 46.851 - type: precision_at_5 value: 30.436000000000003 - type: recall_at_1 value: 41.249 - type: recall_at_10 value: 81.44500000000001 - type: recall_at_100 value: 91.323 - type: recall_at_1000 value: 96.44200000000001 - type: recall_at_20 value: 85.18599999999999 - type: recall_at_3 value: 70.277 - type: recall_at_5 value: 76.09 task: type: Retrieval - dataset: config: default name: MTEB MSMARCO-PL revision: 8634c07806d5cce3a6138e260e59b81760a0a640 split: test type: clarin-knext/msmarco-pl metrics: - type: main_score value: 72.695 - type: map_at_1 value: 2.313 - type: map_at_10 value: 16.541 - type: map_at_100 value: 42.664 - type: map_at_1000 value: 51.048 - type: map_at_20 value: 25.691000000000003 - type: map_at_3 value: 6.8580000000000005 - type: map_at_5 value: 10.227 - type: mrr_at_1 value: 90.69767441860465 - type: mrr_at_10 value: 94.65116279069768 - type: mrr_at_100 value: 94.65116279069768 - type: mrr_at_1000 value: 94.65116279069768 - type: mrr_at_20 value: 94.65116279069768 - type: mrr_at_3 value: 94.18604651162791 - type: mrr_at_5 value: 94.65116279069768 - type: nauc_map_at_1000_diff1 value: -19.394271777832838 - type: nauc_map_at_1000_max value: 35.63073356621754 - type: nauc_map_at_1000_std value: 56.92803671553409 - type: nauc_map_at_100_diff1 value: -7.023340458676494 - type: nauc_map_at_100_max value: 22.967662469404267 - type: nauc_map_at_100_std value: 28.64423344417142 - type: nauc_map_at_10_diff1 value: 18.22452762970126 - type: nauc_map_at_10_max value: 3.235969423980127 - type: nauc_map_at_10_std value: -11.528499499305529 - type: nauc_map_at_1_diff1 value: 17.90743559505749 - type: nauc_map_at_1_max value: -14.61627654448527 - type: nauc_map_at_1_std value: -24.262430292012667 - type: nauc_map_at_20_diff1 value: 14.96422992084746 - type: nauc_map_at_20_max value: 11.128128185086132 - type: nauc_map_at_20_std value: -0.4087236026844547 - type: nauc_map_at_3_diff1 value: 16.45733174189393 - type: nauc_map_at_3_max value: -14.88196784500194 - type: nauc_map_at_3_std value: -26.096323520383446 - type: nauc_map_at_5_diff1 value: 17.572159494245003 - type: nauc_map_at_5_max value: -11.206812710229503 - type: nauc_map_at_5_std value: -22.27070819579704 - type: nauc_mrr_at_1000_diff1 value: 33.66069097978205 - type: nauc_mrr_at_1000_max value: 43.87773602456895 - type: nauc_mrr_at_1000_std value: 52.33730714398662 - type: nauc_mrr_at_100_diff1 value: 33.66069097978205 - type: nauc_mrr_at_100_max value: 43.87773602456895 - type: nauc_mrr_at_100_std value: 52.33730714398662 - type: nauc_mrr_at_10_diff1 value: 33.66069097978205 - type: nauc_mrr_at_10_max value: 43.87773602456895 - type: nauc_mrr_at_10_std value: 52.33730714398662 - type: nauc_mrr_at_1_diff1 value: 23.709794626749783 - type: nauc_mrr_at_1_max value: 35.45939642825464 - type: nauc_mrr_at_1_std value: 45.18790321558505 - type: nauc_mrr_at_20_diff1 value: 33.66069097978205 - type: nauc_mrr_at_20_max value: 43.87773602456895 - type: nauc_mrr_at_20_std value: 52.33730714398662 - type: nauc_mrr_at_3_diff1 value: 38.96783570139972 - type: nauc_mrr_at_3_max value: 48.367517142603624 - type: nauc_mrr_at_3_std value: 56.15032257246786 - type: nauc_mrr_at_5_diff1 
value: 33.66069097978205 - type: nauc_mrr_at_5_max value: 43.87773602456895 - type: nauc_mrr_at_5_std value: 52.33730714398662 - type: nauc_ndcg_at_1000_diff1 value: -8.409227649777549 - type: nauc_ndcg_at_1000_max value: 55.08579408014661 - type: nauc_ndcg_at_1000_std value: 64.71829411541155 - type: nauc_ndcg_at_100_diff1 value: -12.171382005828134 - type: nauc_ndcg_at_100_max value: 37.279599751187895 - type: nauc_ndcg_at_100_std value: 55.59571261330682 - type: nauc_ndcg_at_10_diff1 value: -4.2745893875224645 - type: nauc_ndcg_at_10_max value: 35.61094191299521 - type: nauc_ndcg_at_10_std value: 31.49122710738599 - type: nauc_ndcg_at_1_diff1 value: 34.77341575621081 - type: nauc_ndcg_at_1_max value: 18.418784098194983 - type: nauc_ndcg_at_1_std value: 3.6003144907881026 - type: nauc_ndcg_at_20_diff1 value: -16.937600290863816 - type: nauc_ndcg_at_20_max value: 28.731002593372718 - type: nauc_ndcg_at_20_std value: 40.140028262395546 - type: nauc_ndcg_at_3_diff1 value: 21.008563623057892 - type: nauc_ndcg_at_3_max value: 32.092932411602945 - type: nauc_ndcg_at_3_std value: 7.783159518591246 - type: nauc_ndcg_at_5_diff1 value: 13.35248395075747 - type: nauc_ndcg_at_5_max value: 33.48637127489678 - type: nauc_ndcg_at_5_std value: 19.883656903878986 - type: nauc_precision_at_1000_diff1 value: -34.613170483366815 - type: nauc_precision_at_1000_max value: 14.178980568050093 - type: nauc_precision_at_1000_std value: 53.45813399059421 - type: nauc_precision_at_100_diff1 value: -40.67552345859168 - type: nauc_precision_at_100_max value: 23.091965607829138 - type: nauc_precision_at_100_std value: 62.39644907525577 - type: nauc_precision_at_10_diff1 value: -29.61210257317124 - type: nauc_precision_at_10_max value: 43.992102732918255 - type: nauc_precision_at_10_std value: 67.25524849542518 - type: nauc_precision_at_1_diff1 value: 23.709794626749783 - type: nauc_precision_at_1_max value: 35.45939642825464 - type: nauc_precision_at_1_std value: 45.18790321558505 - type: nauc_precision_at_20_diff1 value: -38.29110052486433 - type: nauc_precision_at_20_max value: 28.73705296191401 - type: nauc_precision_at_20_std value: 62.12026159344505 - type: nauc_precision_at_3_diff1 value: -4.950069185044093 - type: nauc_precision_at_3_max value: 35.30311413187648 - type: nauc_precision_at_3_std value: 37.24789627772557 - type: nauc_precision_at_5_diff1 value: -8.259725731846123 - type: nauc_precision_at_5_max value: 33.985287538899314 - type: nauc_precision_at_5_std value: 53.59550306044433 - type: nauc_recall_at_1000_diff1 value: -5.996961409631926 - type: nauc_recall_at_1000_max value: 63.118266233402764 - type: nauc_recall_at_1000_std value: 69.5649709802058 - type: nauc_recall_at_100_diff1 value: 6.920650261229799 - type: nauc_recall_at_100_max value: 26.76777278523633 - type: nauc_recall_at_100_std value: 24.81349844560708 - type: nauc_recall_at_10_diff1 value: 18.636579796911292 - type: nauc_recall_at_10_max value: 2.214374250576099 - type: nauc_recall_at_10_std value: -12.939953791707651 - type: nauc_recall_at_1_diff1 value: 17.90743559505749 - type: nauc_recall_at_1_max value: -14.61627654448527 - type: nauc_recall_at_1_std value: -24.262430292012667 - type: nauc_recall_at_20_diff1 value: 17.612041689452855 - type: nauc_recall_at_20_max value: 11.182632726686007 - type: nauc_recall_at_20_std value: -2.4835954401161864 - type: nauc_recall_at_3_diff1 value: 16.773341381117 - type: nauc_recall_at_3_max value: -15.051242807277163 - type: nauc_recall_at_3_std value: -26.410274593618038 - type: 
nauc_recall_at_5_diff1 value: 17.091861029537423 - type: nauc_recall_at_5_max value: -13.243464985211395 - type: nauc_recall_at_5_std value: -23.92982354951768 - type: ndcg_at_1 value: 78.295 - type: ndcg_at_10 value: 72.695 - type: ndcg_at_100 value: 65.69500000000001 - type: ndcg_at_1000 value: 73.359 - type: ndcg_at_20 value: 69.16499999999999 - type: ndcg_at_3 value: 76.632 - type: ndcg_at_5 value: 74.024 - type: precision_at_1 value: 90.69800000000001 - type: precision_at_10 value: 81.628 - type: precision_at_100 value: 38.116 - type: precision_at_1000 value: 7.199999999999999 - type: precision_at_20 value: 72.209 - type: precision_at_3 value: 89.922 - type: precision_at_5 value: 86.047 - type: recall_at_1 value: 2.313 - type: recall_at_10 value: 17.48 - type: recall_at_100 value: 53.937000000000005 - type: recall_at_1000 value: 80.018 - type: recall_at_20 value: 28.081 - type: recall_at_3 value: 6.927 - type: recall_at_5 value: 10.575 task: type: Retrieval - dataset: config: pl name: MTEB MassiveIntentClassification (pl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 79.41492938802959 - type: f1 value: 75.75917683785259 - type: f1_weighted value: 79.4156392656699 - type: main_score value: 79.41492938802959 task: type: Classification - dataset: config: pl name: MTEB MassiveScenarioClassification (pl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 81.9334229993275 - type: f1 value: 81.40628785444537 - type: f1_weighted value: 81.79807477693303 - type: main_score value: 81.9334229993275 task: type: Classification - dataset: config: default name: MTEB NFCorpus-PL revision: 9a6f9567fda928260afed2de480d79c98bf0bec0 split: test type: clarin-knext/nfcorpus-pl metrics: - type: main_score value: 36.723 - type: map_at_1 value: 5.8069999999999995 - type: map_at_10 value: 13.602 - type: map_at_100 value: 17.196 - type: map_at_1000 value: 18.609 - type: map_at_20 value: 15.146999999999998 - type: map_at_3 value: 9.594999999999999 - type: map_at_5 value: 11.453000000000001 - type: mrr_at_1 value: 47.368421052631575 - type: mrr_at_10 value: 55.60703228659884 - type: mrr_at_100 value: 56.1552975760445 - type: mrr_at_1000 value: 56.19164342988321 - type: mrr_at_20 value: 55.922507068281476 - type: mrr_at_3 value: 53.147574819401456 - type: mrr_at_5 value: 54.680082559339525 - type: nauc_map_at_1000_diff1 value: 34.05763404594125 - type: nauc_map_at_1000_max value: 29.5226776533209 - type: nauc_map_at_1000_std value: 15.427632324819914 - type: nauc_map_at_100_diff1 value: 34.80313586539057 - type: nauc_map_at_100_max value: 27.999543781245972 - type: nauc_map_at_100_std value: 11.502430185601197 - type: nauc_map_at_10_diff1 value: 39.10493763818235 - type: nauc_map_at_10_max value: 20.299110129894572 - type: nauc_map_at_10_std value: -1.8131312981171384 - type: nauc_map_at_1_diff1 value: 54.952292547558436 - type: nauc_map_at_1_max value: 13.172173380536137 - type: nauc_map_at_1_std value: -11.135859432447047 - type: nauc_map_at_20_diff1 value: 36.56338939350608 - type: nauc_map_at_20_max value: 24.057778180377355 - type: nauc_map_at_20_std value: 4.030543599731532 - type: nauc_map_at_3_diff1 value: 46.798195082350766 - type: nauc_map_at_3_max value: 14.899395608553915 - type: nauc_map_at_3_std value: -10.505614189182307 - type: nauc_map_at_5_diff1 value: 42.83953515294862 - type: nauc_map_at_5_max value: 17.04727497975375 - type: 
nauc_map_at_5_std value: -7.6517071380275885 - type: nauc_mrr_at_1000_diff1 value: 41.44193432540061 - type: nauc_mrr_at_1000_max value: 39.88086824180341 - type: nauc_mrr_at_1000_std value: 27.351885880283966 - type: nauc_mrr_at_100_diff1 value: 41.43357468563369 - type: nauc_mrr_at_100_max value: 39.91394628214467 - type: nauc_mrr_at_100_std value: 27.37166382203234 - type: nauc_mrr_at_10_diff1 value: 41.46082695650948 - type: nauc_mrr_at_10_max value: 39.858957188572944 - type: nauc_mrr_at_10_std value: 27.18216001182641 - type: nauc_mrr_at_1_diff1 value: 41.485448798176904 - type: nauc_mrr_at_1_max value: 33.6944538535235 - type: nauc_mrr_at_1_std value: 22.826701578387503 - type: nauc_mrr_at_20_diff1 value: 41.374365310091925 - type: nauc_mrr_at_20_max value: 39.923859616197035 - type: nauc_mrr_at_20_std value: 27.27268109687068 - type: nauc_mrr_at_3_diff1 value: 42.1244757279239 - type: nauc_mrr_at_3_max value: 38.380669877043864 - type: nauc_mrr_at_3_std value: 25.734391560690224 - type: nauc_mrr_at_5_diff1 value: 41.26497822292423 - type: nauc_mrr_at_5_max value: 39.17164048501762 - type: nauc_mrr_at_5_std value: 26.304110615701987 - type: nauc_ndcg_at_1000_diff1 value: 31.76845316166595 - type: nauc_ndcg_at_1000_max value: 44.0530198648453 - type: nauc_ndcg_at_1000_std value: 33.37050209530549 - type: nauc_ndcg_at_100_diff1 value: 31.70167104254346 - type: nauc_ndcg_at_100_max value: 38.98577219865644 - type: nauc_ndcg_at_100_std value: 28.46948949404448 - type: nauc_ndcg_at_10_diff1 value: 31.41371490994258 - type: nauc_ndcg_at_10_max value: 36.46974014607837 - type: nauc_ndcg_at_10_std value: 28.214061102873274 - type: nauc_ndcg_at_1_diff1 value: 45.195218239572185 - type: nauc_ndcg_at_1_max value: 32.47174554115089 - type: nauc_ndcg_at_1_std value: 22.252970640869655 - type: nauc_ndcg_at_20_diff1 value: 30.22073304733139 - type: nauc_ndcg_at_20_max value: 36.85722580956459 - type: nauc_ndcg_at_20_std value: 28.82508960932221 - type: nauc_ndcg_at_3_diff1 value: 34.85087007597385 - type: nauc_ndcg_at_3_max value: 35.08880030166066 - type: nauc_ndcg_at_3_std value: 24.477164602350427 - type: nauc_ndcg_at_5_diff1 value: 32.15269255562139 - type: nauc_ndcg_at_5_max value: 36.26512978748847 - type: nauc_ndcg_at_5_std value: 26.121143638336193 - type: nauc_precision_at_1000_diff1 value: -5.016344866521763 - type: nauc_precision_at_1000_max value: 13.76155613533569 - type: nauc_precision_at_1000_std value: 42.87650310943072 - type: nauc_precision_at_100_diff1 value: -2.4765231121724867 - type: nauc_precision_at_100_max value: 26.413714147361173 - type: nauc_precision_at_100_std value: 52.07869389693284 - type: nauc_precision_at_10_diff1 value: 9.381859834804454 - type: nauc_precision_at_10_max value: 36.79686689654208 - type: nauc_precision_at_10_std value: 41.450385008923874 - type: nauc_precision_at_1_diff1 value: 43.14276503972391 - type: nauc_precision_at_1_max value: 33.23669937901841 - type: nauc_precision_at_1_std value: 23.574191783291614 - type: nauc_precision_at_20_diff1 value: 3.3554639781732143 - type: nauc_precision_at_20_max value: 35.07048369650734 - type: nauc_precision_at_20_std value: 46.90757933302204 - type: nauc_precision_at_3_diff1 value: 22.3364560733951 - type: nauc_precision_at_3_max value: 34.49198383469041 - type: nauc_precision_at_3_std value: 28.30886758592867 - type: nauc_precision_at_5_diff1 value: 14.242157915266043 - type: nauc_precision_at_5_max value: 36.78665790141447 - type: nauc_precision_at_5_std value: 34.22226904133568 - type: 
nauc_recall_at_1000_diff1 value: 6.177080203711223 - type: nauc_recall_at_1000_max value: 20.36718691855502 - type: nauc_recall_at_1000_std value: 21.44974953318914 - type: nauc_recall_at_100_diff1 value: 16.98521396327983 - type: nauc_recall_at_100_max value: 25.739641139625473 - type: nauc_recall_at_100_std value: 16.08045361596745 - type: nauc_recall_at_10_diff1 value: 28.066091446759465 - type: nauc_recall_at_10_max value: 15.875422037194987 - type: nauc_recall_at_10_std value: -2.7729209404094712 - type: nauc_recall_at_1_diff1 value: 54.952292547558436 - type: nauc_recall_at_1_max value: 13.172173380536137 - type: nauc_recall_at_1_std value: -11.135859432447047 - type: nauc_recall_at_20_diff1 value: 22.454203317605455 - type: nauc_recall_at_20_max value: 19.38991609441149 - type: nauc_recall_at_20_std value: 3.3669889925713683 - type: nauc_recall_at_3_diff1 value: 42.41050348142469 - type: nauc_recall_at_3_max value: 14.345477767632861 - type: nauc_recall_at_3_std value: -11.275161125178107 - type: nauc_recall_at_5_diff1 value: 34.851159133502286 - type: nauc_recall_at_5_max value: 15.03263812713638 - type: nauc_recall_at_5_std value: -9.042538295018138 - type: ndcg_at_1 value: 44.891999999999996 - type: ndcg_at_10 value: 36.723 - type: ndcg_at_100 value: 33.101 - type: ndcg_at_1000 value: 41.493 - type: ndcg_at_20 value: 34.14 - type: ndcg_at_3 value: 41.131 - type: ndcg_at_5 value: 39.446999999999996 - type: precision_at_1 value: 46.749 - type: precision_at_10 value: 27.616000000000003 - type: precision_at_100 value: 8.372 - type: precision_at_1000 value: 2.095 - type: precision_at_20 value: 20.294 - type: precision_at_3 value: 38.493 - type: precision_at_5 value: 34.427 - type: recall_at_1 value: 5.8069999999999995 - type: recall_at_10 value: 18.444 - type: recall_at_100 value: 33.655 - type: recall_at_1000 value: 63.839999999999996 - type: recall_at_20 value: 22.205 - type: recall_at_3 value: 10.61 - type: recall_at_5 value: 13.938999999999998 task: type: Retrieval - dataset: config: default name: MTEB NQ-PL revision: f171245712cf85dd4700b06bef18001578d0ca8d split: test type: clarin-knext/nq-pl metrics: - type: main_score value: 56.854000000000006 - type: map_at_1 value: 34.514 - type: map_at_10 value: 49.644 - type: map_at_100 value: 50.608 - type: map_at_1000 value: 50.635 - type: map_at_20 value: 50.305 - type: map_at_3 value: 45.672000000000004 - type: map_at_5 value: 48.089 - type: mrr_at_1 value: 38.78910776361529 - type: mrr_at_10 value: 52.148397984145234 - type: mrr_at_100 value: 52.852966946095215 - type: mrr_at_1000 value: 52.87105017860762 - type: mrr_at_20 value: 52.64188894631607 - type: mrr_at_3 value: 48.97643877945134 - type: mrr_at_5 value: 50.92168791039002 - type: nauc_map_at_1000_diff1 value: 37.02156712167867 - type: nauc_map_at_1000_max value: 30.9541229199217 - type: nauc_map_at_1000_std value: 7.320033004454671 - type: nauc_map_at_100_diff1 value: 37.02236703226826 - type: nauc_map_at_100_max value: 30.9697676745961 - type: nauc_map_at_100_std value: 7.33984133867723 - type: nauc_map_at_10_diff1 value: 36.90102700826612 - type: nauc_map_at_10_max value: 30.785723842405183 - type: nauc_map_at_10_std value: 6.779448226242215 - type: nauc_map_at_1_diff1 value: 39.909029450982274 - type: nauc_map_at_1_max value: 25.241631663639062 - type: nauc_map_at_1_std value: 3.9346798436914625 - type: nauc_map_at_20_diff1 value: 37.01885833177735 - type: nauc_map_at_20_max value: 30.93864719019393 - type: nauc_map_at_20_std value: 7.157784404582363 - type: 
nauc_map_at_3_diff1 value: 36.66395294442894 - type: nauc_map_at_3_max value: 28.73917625955397 - type: nauc_map_at_3_std value: 4.974442294121807 - type: nauc_map_at_5_diff1 value: 36.50200331851477 - type: nauc_map_at_5_max value: 30.19694653814823 - type: nauc_map_at_5_std value: 6.080701892676308 - type: nauc_mrr_at_1000_diff1 value: 37.13771503608112 - type: nauc_mrr_at_1000_max value: 31.751547147247507 - type: nauc_mrr_at_1000_std value: 9.508614158791604 - type: nauc_mrr_at_100_diff1 value: 37.13715249048103 - type: nauc_mrr_at_100_max value: 31.76453363846907 - type: nauc_mrr_at_100_std value: 9.527333431366577 - type: nauc_mrr_at_10_diff1 value: 37.04617391414406 - type: nauc_mrr_at_10_max value: 31.835558691659767 - type: nauc_mrr_at_10_std value: 9.403478249864207 - type: nauc_mrr_at_1_diff1 value: 40.24340603514061 - type: nauc_mrr_at_1_max value: 27.892025295592664 - type: nauc_mrr_at_1_std value: 6.948060152377137 - type: nauc_mrr_at_20_diff1 value: 37.13679664662962 - type: nauc_mrr_at_20_max value: 31.80571193908972 - type: nauc_mrr_at_20_std value: 9.463516427443066 - type: nauc_mrr_at_3_diff1 value: 36.59947958587673 - type: nauc_mrr_at_3_max value: 30.56905612034133 - type: nauc_mrr_at_3_std value: 8.213473085446296 - type: nauc_mrr_at_5_diff1 value: 36.66740305041658 - type: nauc_mrr_at_5_max value: 31.470226490982878 - type: nauc_mrr_at_5_std value: 9.02109643375307 - type: nauc_ndcg_at_1000_diff1 value: 36.60296185088649 - type: nauc_ndcg_at_1000_max value: 33.40562074993109 - type: nauc_ndcg_at_1000_std value: 10.60845451213325 - type: nauc_ndcg_at_100_diff1 value: 36.59946610918652 - type: nauc_ndcg_at_100_max value: 33.9570260243297 - type: nauc_ndcg_at_100_std value: 11.340469448481196 - type: nauc_ndcg_at_10_diff1 value: 36.14418247401987 - type: nauc_ndcg_at_10_max value: 33.451039871075345 - type: nauc_ndcg_at_10_std value: 9.272972801419813 - type: nauc_ndcg_at_1_diff1 value: 40.07169143996099 - type: nauc_ndcg_at_1_max value: 27.943354680588055 - type: nauc_ndcg_at_1_std value: 7.036639009967827 - type: nauc_ndcg_at_20_diff1 value: 36.51152244027151 - type: nauc_ndcg_at_20_max value: 33.89378482325653 - type: nauc_ndcg_at_20_std value: 10.342721315866635 - type: nauc_ndcg_at_3_diff1 value: 35.4822845318483 - type: nauc_ndcg_at_3_max value: 29.912345910181415 - type: nauc_ndcg_at_3_std value: 5.9694134283330715 - type: nauc_ndcg_at_5_diff1 value: 35.221776161219466 - type: nauc_ndcg_at_5_max value: 32.1072171248216 - type: nauc_ndcg_at_5_std value: 7.670174771541694 - type: nauc_precision_at_1000_diff1 value: -4.285000172509594 - type: nauc_precision_at_1000_max value: 14.600633321561062 - type: nauc_precision_at_1000_std value: 21.991435704986305 - type: nauc_precision_at_100_diff1 value: 1.7266493932509126 - type: nauc_precision_at_100_max value: 22.9932202096611 - type: nauc_precision_at_100_std value: 27.464183639561075 - type: nauc_precision_at_10_diff1 value: 16.16723142044687 - type: nauc_precision_at_10_max value: 32.61177863055963 - type: nauc_precision_at_10_std value: 19.30609156634069 - type: nauc_precision_at_1_diff1 value: 40.07169143996099 - type: nauc_precision_at_1_max value: 27.943354680588055 - type: nauc_precision_at_1_std value: 7.036639009967827 - type: nauc_precision_at_20_diff1 value: 10.986359452355082 - type: nauc_precision_at_20_max value: 30.001608294285408 - type: nauc_precision_at_20_std value: 23.470161266132752 - type: nauc_precision_at_3_diff1 value: 25.021299827765368 - type: nauc_precision_at_3_max value: 31.112435175145354 
- type: nauc_precision_at_3_std value: 9.97933575854508 - type: nauc_precision_at_5_diff1 value: 19.85258852538675 - type: nauc_precision_at_5_max value: 33.017057636553346 - type: nauc_precision_at_5_std value: 14.226398540277224 - type: nauc_recall_at_1000_diff1 value: 32.956809555733294 - type: nauc_recall_at_1000_max value: 81.17616645437344 - type: nauc_recall_at_1000_std value: 80.81894015338722 - type: nauc_recall_at_100_diff1 value: 34.21543518933059 - type: nauc_recall_at_100_max value: 64.60424388566007 - type: nauc_recall_at_100_std value: 55.36262550526809 - type: nauc_recall_at_10_diff1 value: 31.854572843060865 - type: nauc_recall_at_10_max value: 41.47697651985406 - type: nauc_recall_at_10_std value: 15.449819317346778 - type: nauc_recall_at_1_diff1 value: 39.909029450982274 - type: nauc_recall_at_1_max value: 25.241631663639062 - type: nauc_recall_at_1_std value: 3.9346798436914625 - type: nauc_recall_at_20_diff1 value: 33.155424988870266 - type: nauc_recall_at_20_max value: 47.41147314334969 - type: nauc_recall_at_20_std value: 24.122822585459915 - type: nauc_recall_at_3_diff1 value: 31.030069463711484 - type: nauc_recall_at_3_max value: 30.349471998175105 - type: nauc_recall_at_3_std value: 5.3792560913820635 - type: nauc_recall_at_5_diff1 value: 29.662449422215627 - type: nauc_recall_at_5_max value: 35.59583981361554 - type: nauc_recall_at_5_std value: 9.138475426366536 - type: ndcg_at_1 value: 38.847 - type: ndcg_at_10 value: 56.854000000000006 - type: ndcg_at_100 value: 60.767 - type: ndcg_at_1000 value: 61.399 - type: ndcg_at_20 value: 58.941 - type: ndcg_at_3 value: 49.576 - type: ndcg_at_5 value: 53.502 - type: precision_at_1 value: 38.847 - type: precision_at_10 value: 9.064 - type: precision_at_100 value: 1.127 - type: precision_at_1000 value: 0.11900000000000001 - type: precision_at_20 value: 5.038 - type: precision_at_3 value: 22.335 - type: precision_at_5 value: 15.689 - type: recall_at_1 value: 34.514 - type: recall_at_10 value: 76.152 - type: recall_at_100 value: 92.837 - type: recall_at_1000 value: 97.596 - type: recall_at_20 value: 83.77799999999999 - type: recall_at_3 value: 57.484 - type: recall_at_5 value: 66.476 task: type: Retrieval - dataset: config: default name: MTEB PAC revision: None split: test type: laugustyniak/abusive-clauses-pl metrics: - type: accuracy value: 67.24297712134376 - type: accuracy_stderr value: 4.77558207347837 - type: ap value: 77.38171975466854 - type: ap_stderr value: 2.5801970175320394 - type: f1 value: 65.21823897814332 - type: f1_stderr value: 4.317111734308895 - type: main_score value: 67.24297712134376 task: type: Classification - dataset: config: default name: MTEB PSC revision: d05a294af9e1d3ff2bfb6b714e08a24a6cabc669 split: test type: PL-MTEB/psc-pairclassification metrics: - type: cosine_accuracy value: 97.95918367346938 - type: cosine_accuracy_threshold value: 59.87724328133361 - type: cosine_ap value: 99.24498625606927 - type: cosine_f1 value: 96.6867469879518 - type: cosine_f1_threshold value: 59.87724328133361 - type: cosine_precision value: 95.53571428571429 - type: cosine_recall value: 97.86585365853658 - type: dot_accuracy value: 98.51576994434137 - type: dot_accuracy_threshold value: 1574400.0 - type: dot_ap value: 99.28566232682996 - type: dot_f1 value: 97.57575757575758 - type: dot_f1_threshold value: 1564800.0 - type: dot_precision value: 96.98795180722891 - type: dot_recall value: 98.17073170731707 - type: euclidean_accuracy value: 97.6808905380334 - type: euclidean_accuracy_threshold value: 
14418.957939643331 - type: euclidean_ap value: 99.0876340868033 - type: euclidean_f1 value: 96.24060150375941 - type: euclidean_f1_threshold value: 14442.183182634264 - type: euclidean_precision value: 94.95548961424333 - type: euclidean_recall value: 97.5609756097561 - type: main_score value: 99.28566232682996 - type: manhattan_accuracy value: 97.86641929499072 - type: manhattan_accuracy_threshold value: 681802.1857857704 - type: manhattan_ap value: 99.08465290287205 - type: manhattan_f1 value: 96.52042360060513 - type: manhattan_f1_threshold value: 681802.1857857704 - type: manhattan_precision value: 95.7957957957958 - type: manhattan_recall value: 97.2560975609756 - type: max_ap value: 99.28566232682996 - type: max_f1 value: 97.57575757575758 - type: max_precision value: 96.98795180722891 - type: max_recall value: 98.17073170731707 - type: similarity_accuracy value: 97.95918367346938 - type: similarity_accuracy_threshold value: 59.87724328133361 - type: similarity_ap value: 99.24498625606927 - type: similarity_f1 value: 96.6867469879518 - type: similarity_f1_threshold value: 59.87724328133361 - type: similarity_precision value: 95.53571428571429 - type: similarity_recall value: 97.86585365853658 task: type: PairClassification - dataset: config: default name: MTEB PolEmo2.0-IN revision: d90724373c70959f17d2331ad51fb60c71176b03 split: test type: PL-MTEB/polemo2_in metrics: - type: accuracy value: 90.41551246537396 - type: f1 value: 89.15361039614409 - type: f1_weighted value: 90.69893050097603 - type: main_score value: 90.41551246537396 task: type: Classification - dataset: config: default name: MTEB PolEmo2.0-OUT revision: 6a21ab8716e255ab1867265f8b396105e8aa63d4 split: test type: PL-MTEB/polemo2_out metrics: - type: accuracy value: 77.77327935222672 - type: f1 value: 61.238079022455636 - type: f1_weighted value: 80.58753601509183 - type: main_score value: 77.77327935222672 task: type: Classification - dataset: config: default name: MTEB PPC revision: None split: test type: PL-MTEB/ppc-pairclassification metrics: - type: cos_sim_accuracy value: 87.2 - type: cos_sim_accuracy_threshold value: 83.69773167092553 - type: cos_sim_ap value: 95.43345251568122 - type: cos_sim_f1 value: 89.82785602503913 - type: cos_sim_f1_threshold value: 81.2116503074739 - type: cos_sim_precision value: 85.16320474777447 - type: cos_sim_recall value: 95.03311258278146 - type: dot_accuracy value: 85.9 - type: dot_accuracy_threshold value: 2177600.0 - type: dot_ap value: 92.4192102018206 - type: dot_f1 value: 88.9238020424195 - type: dot_f1_threshold value: 2163200.0 - type: dot_precision value: 84.60388639760838 - type: dot_recall value: 93.70860927152319 - type: euclidean_accuracy value: 87.5 - type: euclidean_accuracy_threshold value: 9325.450203438862 - type: euclidean_ap value: 95.42730698295347 - type: euclidean_f1 value: 89.92747784045125 - type: euclidean_f1_threshold value: 9325.450203438862 - type: euclidean_precision value: 87.59811616954474 - type: euclidean_recall value: 92.3841059602649 - type: manhattan_accuracy value: 87.5 - type: manhattan_accuracy_threshold value: 441412.88244724274 - type: manhattan_ap value: 95.4277447451651 - type: manhattan_f1 value: 89.92747784045125 - type: manhattan_f1_threshold value: 441412.88244724274 - type: manhattan_precision value: 87.59811616954474 - type: manhattan_recall value: 92.3841059602649 - type: max_accuracy value: 87.5 - type: max_ap value: 95.43345251568122 - type: max_f1 value: 89.92747784045125 task: type: PairClassification - dataset: config: default 
name: MTEB Quora-PL revision: 0be27e93455051e531182b85e85e425aba12e9d4 split: test type: clarin-knext/quora-pl metrics: - type: main_score value: 84.47099999999999 - type: map_at_1 value: 65.892 - type: map_at_10 value: 80.11500000000001 - type: map_at_100 value: 80.861 - type: map_at_1000 value: 80.879 - type: map_at_20 value: 80.604 - type: map_at_3 value: 76.97 - type: map_at_5 value: 78.926 - type: mrr_at_1 value: 75.83 - type: mrr_at_10 value: 83.2125238095233 - type: mrr_at_100 value: 83.38714262504709 - type: mrr_at_1000 value: 83.38942088013238 - type: mrr_at_20 value: 83.34284466299037 - type: mrr_at_3 value: 81.95333333333281 - type: mrr_at_5 value: 82.78533333333272 - type: nauc_map_at_1000_diff1 value: 73.95721764018812 - type: nauc_map_at_1000_max value: 9.653675847999432 - type: nauc_map_at_1000_std value: -42.35408133902171 - type: nauc_map_at_100_diff1 value: 73.96621756991526 - type: nauc_map_at_100_max value: 9.618124708373092 - type: nauc_map_at_100_std value: -42.41429680546156 - type: nauc_map_at_10_diff1 value: 74.20643666348498 - type: nauc_map_at_10_max value: 9.056688996919677 - type: nauc_map_at_10_std value: -44.13396437616006 - type: nauc_map_at_1_diff1 value: 77.18196114257519 - type: nauc_map_at_1_max value: 7.840648640771136 - type: nauc_map_at_1_std value: -39.84395715001256 - type: nauc_map_at_20_diff1 value: 74.03475632514551 - type: nauc_map_at_20_max value: 9.385795565805118 - type: nauc_map_at_20_std value: -43.160299598965466 - type: nauc_map_at_3_diff1 value: 74.43855921599284 - type: nauc_map_at_3_max value: 7.574218825911361 - type: nauc_map_at_3_std value: -46.1476276122436 - type: nauc_map_at_5_diff1 value: 74.38688915461512 - type: nauc_map_at_5_max value: 8.557764506539128 - type: nauc_map_at_5_std value: -45.53897898458085 - type: nauc_mrr_at_1000_diff1 value: 74.0311045258841 - type: nauc_mrr_at_1000_max value: 11.885448379701055 - type: nauc_mrr_at_1000_std value: -38.16008409213179 - type: nauc_mrr_at_100_diff1 value: 74.03074603058893 - type: nauc_mrr_at_100_max value: 11.886356221882725 - type: nauc_mrr_at_100_std value: -38.159139191997795 - type: nauc_mrr_at_10_diff1 value: 73.99521522874129 - type: nauc_mrr_at_10_max value: 11.77749620520773 - type: nauc_mrr_at_10_std value: -38.266295250166635 - type: nauc_mrr_at_1_diff1 value: 75.53192564838908 - type: nauc_mrr_at_1_max value: 12.979267595721275 - type: nauc_mrr_at_1_std value: -36.634066084632785 - type: nauc_mrr_at_20_diff1 value: 74.01273934757484 - type: nauc_mrr_at_20_max value: 11.887566738728225 - type: nauc_mrr_at_20_std value: -38.169250252410485 - type: nauc_mrr_at_3_diff1 value: 73.6073534511043 - type: nauc_mrr_at_3_max value: 11.450856365709727 - type: nauc_mrr_at_3_std value: -38.767141663073964 - type: nauc_mrr_at_5_diff1 value: 73.84950218235583 - type: nauc_mrr_at_5_max value: 11.787394554048813 - type: nauc_mrr_at_5_std value: -38.57240589862417 - type: nauc_ndcg_at_1000_diff1 value: 73.51677487598074 - type: nauc_ndcg_at_1000_max value: 10.72929244202152 - type: nauc_ndcg_at_1000_std value: -39.92813917654933 - type: nauc_ndcg_at_100_diff1 value: 73.53904136553481 - type: nauc_ndcg_at_100_max value: 10.569310211635521 - type: nauc_ndcg_at_100_std value: -40.12206261908318 - type: nauc_ndcg_at_10_diff1 value: 73.55958917204208 - type: nauc_ndcg_at_10_max value: 9.255791947077263 - type: nauc_ndcg_at_10_std value: -42.7856138240991 - type: nauc_ndcg_at_1_diff1 value: 75.34289960079188 - type: nauc_ndcg_at_1_max value: 13.499789436258705 - type: nauc_ndcg_at_1_std 
value: -35.91483904818284 - type: nauc_ndcg_at_20_diff1 value: 73.48070745481307 - type: nauc_ndcg_at_20_max value: 9.92427572953505 - type: nauc_ndcg_at_20_std value: -41.55653404596579 - type: nauc_ndcg_at_3_diff1 value: 72.72072901275445 - type: nauc_ndcg_at_3_max value: 8.303708237302729 - type: nauc_ndcg_at_3_std value: -43.618531107389344 - type: nauc_ndcg_at_5_diff1 value: 73.30060059269601 - type: nauc_ndcg_at_5_max value: 8.915386932153249 - type: nauc_ndcg_at_5_std value: -44.088053429661 - type: nauc_precision_at_1000_diff1 value: -41.540517884119524 - type: nauc_precision_at_1000_max value: 6.9361565712971265 - type: nauc_precision_at_1000_std value: 42.39482890919027 - type: nauc_precision_at_100_diff1 value: -40.609576663184896 - type: nauc_precision_at_100_max value: 6.302451339507686 - type: nauc_precision_at_100_std value: 41.30693233869549 - type: nauc_precision_at_10_diff1 value: -30.91653155031006 - type: nauc_precision_at_10_max value: 4.84981614338782 - type: nauc_precision_at_10_std value: 24.47022404030676 - type: nauc_precision_at_1_diff1 value: 75.34289960079188 - type: nauc_precision_at_1_max value: 13.499789436258705 - type: nauc_precision_at_1_std value: -35.91483904818284 - type: nauc_precision_at_20_diff1 value: -36.75164419452007 - type: nauc_precision_at_20_max value: 5.440757182282365 - type: nauc_precision_at_20_std value: 33.08928025809355 - type: nauc_precision_at_3_diff1 value: -5.3240699725635565 - type: nauc_precision_at_3_max value: 5.156636102003736 - type: nauc_precision_at_3_std value: -0.9779263105110453 - type: nauc_precision_at_5_diff1 value: -19.92133198420086 - type: nauc_precision_at_5_max value: 5.432766335564369 - type: nauc_precision_at_5_std value: 11.417736295996392 - type: nauc_recall_at_1000_diff1 value: 56.57663068186203 - type: nauc_recall_at_1000_max value: 25.80329039728696 - type: nauc_recall_at_1000_std value: 57.82937604195464 - type: nauc_recall_at_100_diff1 value: 67.25188672746224 - type: nauc_recall_at_100_max value: 6.879939694351325 - type: nauc_recall_at_100_std value: -30.098258041087096 - type: nauc_recall_at_10_diff1 value: 68.00694154421653 - type: nauc_recall_at_10_max value: 0.7226814903576098 - type: nauc_recall_at_10_std value: -52.980002751088215 - type: nauc_recall_at_1_diff1 value: 77.18196114257519 - type: nauc_recall_at_1_max value: 7.840648640771136 - type: nauc_recall_at_1_std value: -39.84395715001256 - type: nauc_recall_at_20_diff1 value: 66.56016564739411 - type: nauc_recall_at_20_max value: 1.919044428493598 - type: nauc_recall_at_20_std value: -49.5380686276396 - type: nauc_recall_at_3_diff1 value: 69.83247207081557 - type: nauc_recall_at_3_max value: 2.395588418833963 - type: nauc_recall_at_3_std value: -52.11119790224493 - type: nauc_recall_at_5_diff1 value: 69.25881483845956 - type: nauc_recall_at_5_max value: 2.9185552604991716 - type: nauc_recall_at_5_std value: -54.376346690212095 - type: ndcg_at_1 value: 75.92 - type: ndcg_at_10 value: 84.47099999999999 - type: ndcg_at_100 value: 86.11999999999999 - type: ndcg_at_1000 value: 86.276 - type: ndcg_at_20 value: 85.37599999999999 - type: ndcg_at_3 value: 81.0 - type: ndcg_at_5 value: 82.88799999999999 - type: precision_at_1 value: 75.92 - type: precision_at_10 value: 12.987000000000002 - type: precision_at_100 value: 1.5190000000000001 - type: precision_at_1000 value: 0.156 - type: precision_at_20 value: 6.977 - type: precision_at_3 value: 35.573 - type: precision_at_5 value: 23.566000000000003 - type: recall_at_1 value: 65.892 - type: recall_at_10 
value: 93.318 - type: recall_at_100 value: 99.124 - type: recall_at_1000 value: 99.92699999999999 - type: recall_at_20 value: 96.256 - type: recall_at_3 value: 83.69 - type: recall_at_5 value: 88.783 task: type: Retrieval - dataset: config: default name: MTEB SCIDOCS-PL revision: 45452b03f05560207ef19149545f168e596c9337 split: test type: clarin-knext/scidocs-pl metrics: - type: main_score value: 19.528000000000002 - type: map_at_1 value: 4.5280000000000005 - type: map_at_10 value: 11.649 - type: map_at_100 value: 14.019 - type: map_at_1000 value: 14.35 - type: map_at_20 value: 12.866 - type: map_at_3 value: 8.35 - type: map_at_5 value: 9.84 - type: mrr_at_1 value: 22.3 - type: mrr_at_10 value: 32.690039682539656 - type: mrr_at_100 value: 33.91097016542133 - type: mrr_at_1000 value: 33.96940693754695 - type: mrr_at_20 value: 33.418312740750785 - type: mrr_at_3 value: 29.4 - type: mrr_at_5 value: 31.21999999999997 - type: nauc_map_at_1000_diff1 value: 20.52578935318615 - type: nauc_map_at_1000_max value: 28.28553814852898 - type: nauc_map_at_1000_std value: 18.74384140790138 - type: nauc_map_at_100_diff1 value: 20.508083204903077 - type: nauc_map_at_100_max value: 28.281447260273346 - type: nauc_map_at_100_std value: 18.51851601604162 - type: nauc_map_at_10_diff1 value: 21.028884157759624 - type: nauc_map_at_10_max value: 26.98935951161403 - type: nauc_map_at_10_std value: 14.434790357547536 - type: nauc_map_at_1_diff1 value: 23.406427416653127 - type: nauc_map_at_1_max value: 21.759624726647303 - type: nauc_map_at_1_std value: 8.335925909478444 - type: nauc_map_at_20_diff1 value: 20.370301978337785 - type: nauc_map_at_20_max value: 27.30787972231405 - type: nauc_map_at_20_std value: 16.166505401287353 - type: nauc_map_at_3_diff1 value: 23.920717676009453 - type: nauc_map_at_3_max value: 26.061264285994124 - type: nauc_map_at_3_std value: 10.707123907182902 - type: nauc_map_at_5_diff1 value: 22.180679453453557 - type: nauc_map_at_5_max value: 26.85332935641574 - type: nauc_map_at_5_std value: 12.316377808191762 - type: nauc_mrr_at_1000_diff1 value: 21.49186339320302 - type: nauc_mrr_at_1000_max value: 24.329921012356493 - type: nauc_mrr_at_1000_std value: 13.6080824939291 - type: nauc_mrr_at_100_diff1 value: 21.47653180378912 - type: nauc_mrr_at_100_max value: 24.34218235410752 - type: nauc_mrr_at_100_std value: 13.646711743513668 - type: nauc_mrr_at_10_diff1 value: 21.487198850706935 - type: nauc_mrr_at_10_max value: 24.32385099521571 - type: nauc_mrr_at_10_std value: 13.26596223383694 - type: nauc_mrr_at_1_diff1 value: 23.19221955587559 - type: nauc_mrr_at_1_max value: 21.963004569187575 - type: nauc_mrr_at_1_std value: 8.799819519408619 - type: nauc_mrr_at_20_diff1 value: 21.51014357510076 - type: nauc_mrr_at_20_max value: 24.376067405199347 - type: nauc_mrr_at_20_std value: 13.643597889716563 - type: nauc_mrr_at_3_diff1 value: 22.60437837853161 - type: nauc_mrr_at_3_max value: 23.58608363876532 - type: nauc_mrr_at_3_std value: 11.887163540535768 - type: nauc_mrr_at_5_diff1 value: 21.919324914716633 - type: nauc_mrr_at_5_max value: 23.71458680225389 - type: nauc_mrr_at_5_std value: 12.507643886191785 - type: nauc_ndcg_at_1000_diff1 value: 18.546848864440005 - type: nauc_ndcg_at_1000_max value: 30.031984469206325 - type: nauc_ndcg_at_1000_std value: 26.561149084437485 - type: nauc_ndcg_at_100_diff1 value: 18.76271748622068 - type: nauc_ndcg_at_100_max value: 30.180887663861306 - type: nauc_ndcg_at_100_std value: 25.50551358758007 - type: nauc_ndcg_at_10_diff1 value: 19.861367738304697 - 
type: nauc_ndcg_at_10_max value: 27.360442235691522 - type: nauc_ndcg_at_10_std value: 16.476546243351976 - type: nauc_ndcg_at_1_diff1 value: 23.56715803292495 - type: nauc_ndcg_at_1_max value: 22.29229945166374 - type: nauc_ndcg_at_1_std value: 8.43434671818737 - type: nauc_ndcg_at_20_diff1 value: 18.885059883708053 - type: nauc_ndcg_at_20_max value: 27.78854464221595 - type: nauc_ndcg_at_20_std value: 19.404353378015255 - type: nauc_ndcg_at_3_diff1 value: 23.34227259398943 - type: nauc_ndcg_at_3_max value: 25.75899010582446 - type: nauc_ndcg_at_3_std value: 12.097012181915954 - type: nauc_ndcg_at_5_diff1 value: 21.599246331396863 - type: nauc_ndcg_at_5_max value: 26.6575824351444 - type: nauc_ndcg_at_5_std value: 14.029006846982394 - type: nauc_precision_at_1000_diff1 value: 4.880571159099271 - type: nauc_precision_at_1000_max value: 24.693741787360725 - type: nauc_precision_at_1000_std value: 41.00756555344345 - type: nauc_precision_at_100_diff1 value: 10.440170876298648 - type: nauc_precision_at_100_max value: 28.942738351320408 - type: nauc_precision_at_100_std value: 36.921704945977446 - type: nauc_precision_at_10_diff1 value: 15.55680558043308 - type: nauc_precision_at_10_max value: 27.31414489241847 - type: nauc_precision_at_10_std value: 19.76275914256793 - type: nauc_precision_at_1_diff1 value: 23.56715803292495 - type: nauc_precision_at_1_max value: 22.29229945166374 - type: nauc_precision_at_1_std value: 8.43434671818737 - type: nauc_precision_at_20_diff1 value: 12.57247210423589 - type: nauc_precision_at_20_max value: 25.978951783180946 - type: nauc_precision_at_20_std value: 23.89998191646426 - type: nauc_precision_at_3_diff1 value: 22.61273732758558 - type: nauc_precision_at_3_max value: 26.51246898792034 - type: nauc_precision_at_3_std value: 13.618855663226162 - type: nauc_precision_at_5_diff1 value: 19.216237125486472 - type: nauc_precision_at_5_max value: 27.491221626577868 - type: nauc_precision_at_5_std value: 16.448119031617793 - type: nauc_recall_at_1000_diff1 value: 5.787043341957982 - type: nauc_recall_at_1000_max value: 25.922109246772763 - type: nauc_recall_at_1000_std value: 43.03768522656805 - type: nauc_recall_at_100_diff1 value: 10.696362559629796 - type: nauc_recall_at_100_max value: 29.335080453227146 - type: nauc_recall_at_100_std value: 37.271217586452124 - type: nauc_recall_at_10_diff1 value: 15.458092305569215 - type: nauc_recall_at_10_max value: 27.24445210740807 - type: nauc_recall_at_10_std value: 19.71157635644842 - type: nauc_recall_at_1_diff1 value: 23.406427416653127 - type: nauc_recall_at_1_max value: 21.759624726647303 - type: nauc_recall_at_1_std value: 8.335925909478444 - type: nauc_recall_at_20_diff1 value: 12.666354755313089 - type: nauc_recall_at_20_max value: 26.089770792562327 - type: nauc_recall_at_20_std value: 24.153776619741254 - type: nauc_recall_at_3_diff1 value: 22.545408113368953 - type: nauc_recall_at_3_max value: 26.18564049945919 - type: nauc_recall_at_3_std value: 13.308772571657293 - type: nauc_recall_at_5_diff1 value: 19.063078320434958 - type: nauc_recall_at_5_max value: 27.15038597116091 - type: nauc_recall_at_5_std value: 16.202694888143302 - type: ndcg_at_1 value: 22.2 - type: ndcg_at_10 value: 19.528000000000002 - type: ndcg_at_100 value: 28.444000000000003 - type: ndcg_at_1000 value: 33.826 - type: ndcg_at_20 value: 22.746 - type: ndcg_at_3 value: 18.413 - type: ndcg_at_5 value: 15.927 - type: precision_at_1 value: 22.2 - type: precision_at_10 value: 10.24 - type: precision_at_100 value: 2.3040000000000003 - type: 
precision_at_1000 value: 0.358 - type: precision_at_20 value: 6.97 - type: precision_at_3 value: 17.299999999999997 - type: precision_at_5 value: 13.919999999999998 - type: recall_at_1 value: 4.5280000000000005 - type: recall_at_10 value: 20.757 - type: recall_at_100 value: 46.75 - type: recall_at_1000 value: 72.738 - type: recall_at_20 value: 28.28 - type: recall_at_3 value: 10.558 - type: recall_at_5 value: 14.148 task: type: Retrieval - dataset: config: default name: MTEB SICK-E-PL revision: 71bba34b0ece6c56dfcf46d9758a27f7a90f17e9 split: test type: PL-MTEB/sicke-pl-pairclassification metrics: - type: cosine_accuracy value: 87.50509580105992 - type: cosine_accuracy_threshold value: 89.01510631979949 - type: cosine_ap value: 85.58291779193907 - type: cosine_f1 value: 77.58919293384136 - type: cosine_f1_threshold value: 87.10908804245841 - type: cosine_precision value: 75.52258934592044 - type: cosine_recall value: 79.77207977207978 - type: dot_accuracy value: 83.9380350591113 - type: dot_accuracy_threshold value: 2292800.0 - type: dot_ap value: 77.56937485120034 - type: dot_f1 value: 73.32065906210391 - type: dot_f1_threshold value: 2190400.0 - type: dot_precision value: 66.03881278538812 - type: dot_recall value: 82.4074074074074 - type: euclidean_accuracy value: 87.89237668161435 - type: euclidean_accuracy_threshold value: 7497.701400069587 - type: euclidean_ap value: 85.97216152106346 - type: euclidean_f1 value: 77.97228300510578 - type: euclidean_f1_threshold value: 7799.027816670506 - type: euclidean_precision value: 79.89536621823618 - type: euclidean_recall value: 76.13960113960114 - type: main_score value: 85.97216152106346 - type: manhattan_accuracy value: 87.85161027313494 - type: manhattan_accuracy_threshold value: 357242.9743885994 - type: manhattan_ap value: 85.96709490495458 - type: manhattan_f1 value: 77.9874213836478 - type: manhattan_f1_threshold value: 383558.8531732559 - type: manhattan_precision value: 76.5432098765432 - type: manhattan_recall value: 79.48717948717949 - type: max_ap value: 85.97216152106346 - type: max_f1 value: 77.9874213836478 - type: max_precision value: 79.89536621823618 - type: max_recall value: 82.4074074074074 - type: similarity_accuracy value: 87.50509580105992 - type: similarity_accuracy_threshold value: 89.01510631979949 - type: similarity_ap value: 85.58291779193907 - type: similarity_f1 value: 77.58919293384136 - type: similarity_f1_threshold value: 87.10908804245841 - type: similarity_precision value: 75.52258934592044 - type: similarity_recall value: 79.77207977207978 task: type: PairClassification - dataset: config: default name: MTEB SICK-R-PL revision: fd5c2441b7eeff8676768036142af4cfa42c1339 split: test type: PL-MTEB/sickr-pl-sts metrics: - type: cosine_pearson value: 79.68602301743276 - type: cosine_spearman value: 78.15913085997471 - type: euclidean_pearson value: 77.19541180768627 - type: euclidean_spearman value: 77.9122894221527 - type: main_score value: 78.15913085997471 - type: manhattan_pearson value: 77.24713453824641 - type: manhattan_spearman value: 77.95971728547582 - type: pearson value: 79.68602301743276 - type: spearman value: 78.15913085997471 task: type: STS - dataset: config: pl name: MTEB STS22 (pl) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 42.01062393061261 - type: cosine_spearman value: 42.79076406559122 - type: euclidean_pearson value: 28.57786522106708 - type: euclidean_spearman value: 42.51040813516686 - type: 
main_score value: 42.79076406559122 - type: manhattan_pearson value: 28.855884350706653 - type: manhattan_spearman value: 42.77481125184737 - type: pearson value: 42.01062393061261 - type: spearman value: 42.79076406559122 task: type: STS - dataset: config: default name: MTEB SciFact-PL revision: 47932a35f045ef8ed01ba82bf9ff67f6e109207e split: test type: clarin-knext/scifact-pl metrics: - type: main_score value: 74.434 - type: map_at_1 value: 59.494 - type: map_at_10 value: 69.893 - type: map_at_100 value: 70.45 - type: map_at_1000 value: 70.466 - type: map_at_20 value: 70.259 - type: map_at_3 value: 67.037 - type: map_at_5 value: 68.777 - type: mrr_at_1 value: 62.66666666666667 - type: mrr_at_10 value: 71.04457671957671 - type: mrr_at_100 value: 71.52299909263925 - type: mrr_at_1000 value: 71.53881086964122 - type: mrr_at_20 value: 71.33636271136271 - type: mrr_at_3 value: 69.16666666666667 - type: mrr_at_5 value: 70.26666666666667 - type: nauc_map_at_1000_diff1 value: 68.97113084189034 - type: nauc_map_at_1000_max value: 51.00665747497857 - type: nauc_map_at_1000_std value: 8.970270487093412 - type: nauc_map_at_100_diff1 value: 68.97281660521169 - type: nauc_map_at_100_max value: 51.01659549614879 - type: nauc_map_at_100_std value: 8.986483862053491 - type: nauc_map_at_10_diff1 value: 69.07605123979184 - type: nauc_map_at_10_max value: 51.229841935772804 - type: nauc_map_at_10_std value: 9.050901052243548 - type: nauc_map_at_1_diff1 value: 71.46187295357046 - type: nauc_map_at_1_max value: 46.82038076857106 - type: nauc_map_at_1_std value: 6.931602615510153 - type: nauc_map_at_20_diff1 value: 68.93823362705625 - type: nauc_map_at_20_max value: 51.15218544845727 - type: nauc_map_at_20_std value: 8.993550237629675 - type: nauc_map_at_3_diff1 value: 69.19558420072627 - type: nauc_map_at_3_max value: 47.345905341053886 - type: nauc_map_at_3_std value: 4.833936436252541 - type: nauc_map_at_5_diff1 value: 69.05067049349557 - type: nauc_map_at_5_max value: 49.62866209452668 - type: nauc_map_at_5_std value: 7.455937282103214 - type: nauc_mrr_at_1000_diff1 value: 69.2896395759106 - type: nauc_mrr_at_1000_max value: 54.20478659857226 - type: nauc_mrr_at_1000_std value: 12.534151525016302 - type: nauc_mrr_at_100_diff1 value: 69.29115865311857 - type: nauc_mrr_at_100_max value: 54.212882919608475 - type: nauc_mrr_at_100_std value: 12.548435473868432 - type: nauc_mrr_at_10_diff1 value: 69.29596234146305 - type: nauc_mrr_at_10_max value: 54.391683731646935 - type: nauc_mrr_at_10_std value: 12.74312540729047 - type: nauc_mrr_at_1_diff1 value: 71.19661136604304 - type: nauc_mrr_at_1_max value: 53.50646788895577 - type: nauc_mrr_at_1_std value: 14.68408048005645 - type: nauc_mrr_at_20_diff1 value: 69.24714813412893 - type: nauc_mrr_at_20_max value: 54.32239828421196 - type: nauc_mrr_at_20_std value: 12.623980761665866 - type: nauc_mrr_at_3_diff1 value: 69.22708724496187 - type: nauc_mrr_at_3_max value: 53.18873450995116 - type: nauc_mrr_at_3_std value: 11.336687945925586 - type: nauc_mrr_at_5_diff1 value: 69.10748983236182 - type: nauc_mrr_at_5_max value: 53.878090193979034 - type: nauc_mrr_at_5_std value: 12.079036178698662 - type: nauc_ndcg_at_1000_diff1 value: 68.66705448374432 - type: nauc_ndcg_at_1000_max value: 52.74699991296371 - type: nauc_ndcg_at_1000_std value: 10.535824386304968 - type: nauc_ndcg_at_100_diff1 value: 68.66862462407086 - type: nauc_ndcg_at_100_max value: 52.979821543362874 - type: nauc_ndcg_at_100_std value: 10.856284103500371 - type: nauc_ndcg_at_10_diff1 value: 
68.66965948376267 - type: nauc_ndcg_at_10_max value: 53.978681919984474 - type: nauc_ndcg_at_10_std value: 11.10472732803466 - type: nauc_ndcg_at_1_diff1 value: 71.19661136604304 - type: nauc_ndcg_at_1_max value: 53.50646788895577 - type: nauc_ndcg_at_1_std value: 14.68408048005645 - type: nauc_ndcg_at_20_diff1 value: 68.20754850499976 - type: nauc_ndcg_at_20_max value: 53.590485842045595 - type: nauc_ndcg_at_20_std value: 10.719753086433334 - type: nauc_ndcg_at_3_diff1 value: 68.23406959629385 - type: nauc_ndcg_at_3_max value: 48.8837450762613 - type: nauc_ndcg_at_3_std value: 6.287949648205997 - type: nauc_ndcg_at_5_diff1 value: 68.52532849588677 - type: nauc_ndcg_at_5_max value: 51.29845300513165 - type: nauc_ndcg_at_5_std value: 8.15488455762137 - type: nauc_precision_at_1000_diff1 value: -29.56388929021074 - type: nauc_precision_at_1000_max value: 18.61674681637121 - type: nauc_precision_at_1000_std value: 41.68541412973936 - type: nauc_precision_at_100_diff1 value: -17.020740767390375 - type: nauc_precision_at_100_max value: 24.321682766394957 - type: nauc_precision_at_100_std value: 39.36188711602 - type: nauc_precision_at_10_diff1 value: 7.735819461600302 - type: nauc_precision_at_10_max value: 39.59963139423176 - type: nauc_precision_at_10_std value: 33.923494696390385 - type: nauc_precision_at_1_diff1 value: 71.19661136604304 - type: nauc_precision_at_1_max value: 53.50646788895577 - type: nauc_precision_at_1_std value: 14.68408048005645 - type: nauc_precision_at_20_diff1 value: -3.587900694179661 - type: nauc_precision_at_20_max value: 33.36606615861144 - type: nauc_precision_at_20_std value: 34.51624192343654 - type: nauc_precision_at_3_diff1 value: 41.996620318298625 - type: nauc_precision_at_3_max value: 43.08007454860597 - type: nauc_precision_at_3_std value: 14.398965447916495 - type: nauc_precision_at_5_diff1 value: 25.054180107661132 - type: nauc_precision_at_5_max value: 40.94617942853718 - type: nauc_precision_at_5_std value: 23.69992709404865 - type: nauc_recall_at_1000_diff1 value: .nan - type: nauc_recall_at_1000_max value: .nan - type: nauc_recall_at_1000_std value: .nan - type: nauc_recall_at_100_diff1 value: 68.09523809523836 - type: nauc_recall_at_100_max value: 63.034547152194406 - type: nauc_recall_at_100_std value: 23.594771241830657 - type: nauc_recall_at_10_diff1 value: 66.43213426149696 - type: nauc_recall_at_10_max value: 63.07509853849101 - type: nauc_recall_at_10_std value: 15.44924084252273 - type: nauc_recall_at_1_diff1 value: 71.46187295357046 - type: nauc_recall_at_1_max value: 46.82038076857106 - type: nauc_recall_at_1_std value: 6.931602615510153 - type: nauc_recall_at_20_diff1 value: 61.64354198229226 - type: nauc_recall_at_20_max value: 63.09950698826864 - type: nauc_recall_at_20_std value: 12.823209698925014 - type: nauc_recall_at_3_diff1 value: 65.63352507252078 - type: nauc_recall_at_3_max value: 45.10210171735505 - type: nauc_recall_at_3_std value: -0.08017546941514365 - type: nauc_recall_at_5_diff1 value: 65.93453179242769 - type: nauc_recall_at_5_max value: 51.97740656606473 - type: nauc_recall_at_5_std value: 4.929967882548962 - type: ndcg_at_1 value: 62.666999999999994 - type: ndcg_at_10 value: 74.434 - type: ndcg_at_100 value: 76.655 - type: ndcg_at_1000 value: 77.08 - type: ndcg_at_20 value: 75.588 - type: ndcg_at_3 value: 69.75099999999999 - type: ndcg_at_5 value: 72.09100000000001 - type: precision_at_1 value: 62.666999999999994 - type: precision_at_10 value: 9.9 - type: precision_at_100 value: 1.097 - type: precision_at_1000 value: 
0.11299999999999999 - type: precision_at_20 value: 5.2 - type: precision_at_3 value: 27.0 - type: precision_at_5 value: 17.933 - type: recall_at_1 value: 59.494 - type: recall_at_10 value: 87.13300000000001 - type: recall_at_100 value: 96.667 - type: recall_at_1000 value: 100.0 - type: recall_at_20 value: 91.43299999999999 - type: recall_at_3 value: 74.461 - type: recall_at_5 value: 80.34400000000001 task: type: Retrieval - dataset: config: default name: MTEB TRECCOVID-PL revision: 81bcb408f33366c2a20ac54adafad1ae7e877fdd split: test type: clarin-knext/trec-covid-pl metrics: - type: main_score value: 82.749 - type: map_at_1 value: 0.20400000000000001 - type: map_at_10 value: 2.099 - type: map_at_100 value: 12.948 - type: map_at_1000 value: 32.007000000000005 - type: map_at_20 value: 3.746 - type: map_at_3 value: 0.651 - type: map_at_5 value: 1.061 - type: mrr_at_1 value: 84.0 - type: mrr_at_10 value: 91.66666666666666 - type: mrr_at_100 value: 91.66666666666666 - type: mrr_at_1000 value: 91.66666666666666 - type: mrr_at_20 value: 91.66666666666666 - type: mrr_at_3 value: 91.66666666666666 - type: mrr_at_5 value: 91.66666666666666 - type: nauc_map_at_1000_diff1 value: 1.0291414165448085 - type: nauc_map_at_1000_max value: 57.33479540784058 - type: nauc_map_at_1000_std value: 76.70364036170582 - type: nauc_map_at_100_diff1 value: 6.949672309533349 - type: nauc_map_at_100_max value: 43.99861611069154 - type: nauc_map_at_100_std value: 64.12473626966596 - type: nauc_map_at_10_diff1 value: 4.208568177173666 - type: nauc_map_at_10_max value: 18.875910045226423 - type: nauc_map_at_10_std value: 34.58171216714189 - type: nauc_map_at_1_diff1 value: 8.433450768728983 - type: nauc_map_at_1_max value: 24.08001091473891 - type: nauc_map_at_1_std value: 35.21473053133869 - type: nauc_map_at_20_diff1 value: 6.041054220619057 - type: nauc_map_at_20_max value: 22.57475437061051 - type: nauc_map_at_20_std value: 35.254808865756964 - type: nauc_map_at_3_diff1 value: 11.166815378728485 - type: nauc_map_at_3_max value: 18.995433996118248 - type: nauc_map_at_3_std value: 34.29696290521795 - type: nauc_map_at_5_diff1 value: 7.1134812647567855 - type: nauc_map_at_5_max value: 20.03877039266845 - type: nauc_map_at_5_std value: 36.21644151312843 - type: nauc_mrr_at_1000_diff1 value: -7.262394669801826 - type: nauc_mrr_at_1000_max value: 66.22378992749366 - type: nauc_mrr_at_1000_std value: 68.18146188516563 - type: nauc_mrr_at_100_diff1 value: -7.262394669801826 - type: nauc_mrr_at_100_max value: 66.22378992749366 - type: nauc_mrr_at_100_std value: 68.18146188516563 - type: nauc_mrr_at_10_diff1 value: -7.262394669801826 - type: nauc_mrr_at_10_max value: 66.22378992749366 - type: nauc_mrr_at_10_std value: 68.18146188516563 - type: nauc_mrr_at_1_diff1 value: -11.38929798723619 - type: nauc_mrr_at_1_max value: 68.58738340697101 - type: nauc_mrr_at_1_std value: 68.00441826215022 - type: nauc_mrr_at_20_diff1 value: -7.262394669801826 - type: nauc_mrr_at_20_max value: 66.22378992749366 - type: nauc_mrr_at_20_std value: 68.18146188516563 - type: nauc_mrr_at_3_diff1 value: -7.262394669801826 - type: nauc_mrr_at_3_max value: 66.22378992749366 - type: nauc_mrr_at_3_std value: 68.18146188516563 - type: nauc_mrr_at_5_diff1 value: -7.262394669801826 - type: nauc_mrr_at_5_max value: 66.22378992749366 - type: nauc_mrr_at_5_std value: 68.18146188516563 - type: nauc_ndcg_at_1000_diff1 value: 2.5628376286433334 - type: nauc_ndcg_at_1000_max value: 57.605148480655025 - type: nauc_ndcg_at_1000_std value: 76.62891677430625 - type: 
nauc_ndcg_at_100_diff1 value: -13.313083767893671 - type: nauc_ndcg_at_100_max value: 52.932453336031905 - type: nauc_ndcg_at_100_std value: 73.5050466104544 - type: nauc_ndcg_at_10_diff1 value: -6.837803344621873 - type: nauc_ndcg_at_10_max value: 59.29833159945462 - type: nauc_ndcg_at_10_std value: 63.719268128346705 - type: nauc_ndcg_at_1_diff1 value: 4.834338452523335 - type: nauc_ndcg_at_1_max value: 53.58546768562144 - type: nauc_ndcg_at_1_std value: 59.07659252386643 - type: nauc_ndcg_at_20_diff1 value: -9.617683189610558 - type: nauc_ndcg_at_20_max value: 54.57354685878183 - type: nauc_ndcg_at_20_std value: 63.15198506529425 - type: nauc_ndcg_at_3_diff1 value: 15.216236580270994 - type: nauc_ndcg_at_3_max value: 58.345749967766416 - type: nauc_ndcg_at_3_std value: 61.78177922399883 - type: nauc_ndcg_at_5_diff1 value: 1.3882436296634026 - type: nauc_ndcg_at_5_max value: 62.44013008368074 - type: nauc_ndcg_at_5_std value: 65.64455986653293 - type: nauc_precision_at_1000_diff1 value: -18.516822124710856 - type: nauc_precision_at_1000_max value: 33.10336267989325 - type: nauc_precision_at_1000_std value: 29.49816019882571 - type: nauc_precision_at_100_diff1 value: -14.113619184538592 - type: nauc_precision_at_100_max value: 55.55228172103563 - type: nauc_precision_at_100_std value: 69.64355056246397 - type: nauc_precision_at_10_diff1 value: -27.271286464111455 - type: nauc_precision_at_10_max value: 61.885272647604594 - type: nauc_precision_at_10_std value: 60.73389705676694 - type: nauc_precision_at_1_diff1 value: -11.38929798723619 - type: nauc_precision_at_1_max value: 68.58738340697101 - type: nauc_precision_at_1_std value: 68.00441826215022 - type: nauc_precision_at_20_diff1 value: -21.53639909310826 - type: nauc_precision_at_20_max value: 53.361537614358376 - type: nauc_precision_at_20_std value: 55.58737187496432 - type: nauc_precision_at_3_diff1 value: 3.785071466384217 - type: nauc_precision_at_3_max value: 61.66906148377818 - type: nauc_precision_at_3_std value: 62.81857369734561 - type: nauc_precision_at_5_diff1 value: -16.00339477131436 - type: nauc_precision_at_5_max value: 61.5246951163262 - type: nauc_precision_at_5_std value: 63.615062452722135 - type: nauc_recall_at_1000_diff1 value: 5.871263115826736 - type: nauc_recall_at_1000_max value: 50.48397949000848 - type: nauc_recall_at_1000_std value: 67.37950715297474 - type: nauc_recall_at_100_diff1 value: 8.310215006893952 - type: nauc_recall_at_100_max value: 28.687726825722386 - type: nauc_recall_at_100_std value: 50.34038560928654 - type: nauc_recall_at_10_diff1 value: 3.3408195168322075 - type: nauc_recall_at_10_max value: 6.89511828305496 - type: nauc_recall_at_10_std value: 22.929267555360028 - type: nauc_recall_at_1_diff1 value: 8.433450768728983 - type: nauc_recall_at_1_max value: 24.08001091473891 - type: nauc_recall_at_1_std value: 35.21473053133869 - type: nauc_recall_at_20_diff1 value: 5.307683260432045 - type: nauc_recall_at_20_max value: 10.025532087519974 - type: nauc_recall_at_20_std value: 24.110512570368947 - type: nauc_recall_at_3_diff1 value: 13.355136074654078 - type: nauc_recall_at_3_max value: 8.568079109800236 - type: nauc_recall_at_3_std value: 23.691593767005745 - type: nauc_recall_at_5_diff1 value: 6.535580157651383 - type: nauc_recall_at_5_max value: 9.1442468749571 - type: nauc_recall_at_5_std value: 27.00111567203191 - type: ndcg_at_1 value: 79.0 - type: ndcg_at_10 value: 82.749 - type: ndcg_at_100 value: 63.846000000000004 - type: ndcg_at_1000 value: 57.691 - type: ndcg_at_20 value: 77.076 - 
type: ndcg_at_3 value: 84.83800000000001 - type: ndcg_at_5 value: 83.016 - type: precision_at_1 value: 84.0 - type: precision_at_10 value: 87.8 - type: precision_at_100 value: 66.10000000000001 - type: precision_at_1000 value: 25.764 - type: precision_at_20 value: 81.10000000000001 - type: precision_at_3 value: 91.333 - type: precision_at_5 value: 88.8 - type: recall_at_1 value: 0.20400000000000001 - type: recall_at_10 value: 2.294 - type: recall_at_100 value: 16.134999999999998 - type: recall_at_1000 value: 54.981 - type: recall_at_20 value: 4.201 - type: recall_at_3 value: 0.699 - type: recall_at_5 value: 1.141 task: type: Retrieval ---

FlagEmbedding

For more details please refer to our GitHub: FlagEmbedding. **BGE-Multilingual-Gemma2** is an LLM-based multilingual embedding model. It is trained on a diverse range of languages and tasks, building on google/gemma-2-9b. BGE-Multilingual-Gemma2 primarily demonstrates the following advancements: - Diverse training data: The model's training data spans a broad range of languages, including English, Chinese, Japanese, Korean, French, and more. Additionally, the data covers a variety of task types, such as retrieval, classification, and clustering. - Outstanding performance: The model exhibits state-of-the-art (SOTA) results on multilingual benchmarks like MIRACL, MTEB-pl, and MTEB-fr. It also achieves excellent performance on other major evaluations, including MTEB, C-MTEB, and AIR-Bench. ## 📑 Open-source Plan - [x] Checkpoint - [ ] Training Data We will release the training data of **BGE-Multilingual-Gemma2** in the future. ## Usage ### Using FlagEmbedding By default, FlagLLMModel will use all available GPUs when encoding. Please set os.environ['CUDA_VISIBLE_DEVICES'] to select specific GPUs, or set it to an empty string to make all GPUs unavailable. ### Using Sentence Transformers ### Using HuggingFace Transformers ## Evaluation BGE-Multilingual-Gemma2 exhibits **state-of-the-art (SOTA) results on benchmarks like MIRACL, MTEB-pl, and MTEB-fr**. It also achieves excellent performance on other major evaluations, including MTEB, C-MTEB, and AIR-Bench. - **MIRACL** nDCG@10: \"MIRACL-nDCG@10\" Recall@100: \"MIRACL-Recall@100\" - **MTEB-fr/pl** \"MTEB-fr/pl\" - **MTEB** \"MTEB\" - **BEIR** \"BEIR\" - **C-MTEB** \"C-MTEB\" - **AIR-Bench** Long-Doc (en, Recall@10): \"AIR-Bench_Long-Doc\" QA (en&zh, nDCG@10): \"AIR-Bench_QA\" ## Model List bge is short for BAAI general embedding. | Model | Language | | Description | query instruction for retrieval [1] | | :----------------------------------------------------------- | :-----------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | | BAAI/bge-multilingual-gemma2 | Multilingual | - | An LLM-based multilingual embedding model, trained on a diverse range of languages and tasks. | | BAAI/bge-en-icl | English | - | An LLM-based dense retriever with in-context learning capabilities that can fully leverage the model's potential based on a few-shot examples (4096 tokens) | Provide instructions and few-shot examples freely based on the given task. 
| | BAAI/bge-m3 | Multilingual | Inference Fine-tune | Multi-Functionality(dense retrieval, sparse retrieval, multi-vector(colbert)), Multi-Linguality, and Multi-Granularity(8192 tokens) | | | BAAI/llm-embedder | English | Inference Fine-tune | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See README | | BAAI/bge-reranker-large | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-reranker-base | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-large-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-en | English | Inference Fine-tune | :trophy: rank **1st** in MTEB leaderboard | | | BAAI/bge-base-en | English | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-en | English | Inference Fine-tune | a small-scale model but with competitive performance | | | BAAI/bge-large-zh | Chinese | Inference Fine-tune | :trophy: rank **1st** in C-MTEB benchmark | | | BAAI/bge-base-zh | Chinese | Inference Fine-tune | a base-scale model but with similar ability to | | | BAAI/bge-small-zh | Chinese | Inference Fine-tune | a small-scale model but with competitive performance | | ## Citation If you find this repository useful, please consider giving a star :star: and citation", + "model_explanation_gemini": "A multilingual model for feature extraction and sentence similarity tasks, optimized for retrieval performance across various datasets.\n\nFeatures: \n- Multilingual capability \n- Feature extraction \n- Sentence similarity measurement \n- Retrieval task optimization \n\nComparison: \nThe model shows competitive retrieval performance (measured by NDCG, MAP, precision, and recall scores) across multiple datasets (NFCorpus, MSMARCO, FiQA2018) compared to other models in similar tasks." +} \ No newline at end of file diff --git a/model_data_json/BAAI_llm-embedder.json b/model_data_json/BAAI_llm-embedder.json new file mode 100644 index 0000000000000000000000000000000000000000..3c8cb601a495efe412ba55804d31edd775b61a9a --- /dev/null +++ b/model_data_json/BAAI_llm-embedder.json @@ -0,0 +1,19 @@ +{ + "model_id": "BAAI/llm-embedder", + "downloads": 82703, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "bert", + "feature-extraction", + "arxiv:2310.07554", + "arxiv:2309.07597", + "license:mit", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: mit ---

FlagEmbedding

Model List | FAQ | Usage | Evaluation | Train | Contact | Citation | License

For more details, please refer to our GitHub: FlagEmbedding. English | 中文 **Hiring:** We're seeking experienced NLP researchers and intern students focusing on dense retrieval and retrieval-augmented LLMs. If you're interested, please feel free to reach out to us via email at zhengliu1026@gmail.com. FlagEmbedding can map any text to a low-dimensional dense vector, which can be used for tasks like retrieval, classification, clustering, and semantic search (see the sketch after the updates list below). It can also be used in vector databases for LLMs. ************* 🌟**Updates**🌟 ************* - 10/12/2023: Release LLM-Embedder, a unified embedding model to support diverse retrieval augmentation needs for LLMs. Paper :fire: - 09/15/2023: The technical report of BGE has been released - 09/15/2023: The massive training data of BGE has been released - 09/12/2023: New models: - **New reranker model**: release cross-encoder models bge-reranker-base and bge-reranker-large, which are more powerful than the embedding model. We recommend using/fine-tuning them to re-rank top-k documents returned by embedding models. - **Updated embedding model**: release the bge-*-v1.5 embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
- 09/07/2023: Update fine-tune code: add a script to mine hard negatives and support adding instruction during fine-tuning. - 08/09/2023: BGE models are integrated into **Langchain**; the C-MTEB **leaderboard** is available. - 08/05/2023: Release base-scale and small-scale models, **best performance among the models of the same size 🤗** - 08/02/2023: Release BGE (short for BAAI General Embedding) models, **rank 1st on MTEB and C-MTEB benchmark!** :tada: :tada: - 08/01/2023: We release the Chinese Massive Text Embedding Benchmark (**C-MTEB**), consisting of 31 test datasets.
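As a minimal sketch of the dense-vector usage described above (the model name and instruction string are illustrative choices from the model list, not the only options):

```python
from FlagEmbedding import FlagModel

sentences = ["FlagEmbedding maps text to dense vectors",
             "Embeddings can power retrieval and semantic search"]

# use_fp16 speeds up encoding at a slight cost in accuracy
model = FlagModel("BAAI/bge-base-en-v1.5",
                  query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
                  use_fp16=True)

embeddings = model.encode(sentences)      # one vector per sentence
similarity = embeddings @ embeddings.T    # inner product of normalized vectors
print(similarity)
```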
## Model List bge is short for BAAI general embedding. | Model | Language | | Description | query instruction for retrieval [1] | |:-------------------------------|:--------:| :--------:| :--------:|:--------:| | BAAI/llm-embedder | English | Inference Fine-tune | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See README | | BAAI/bge-reranker-large | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-reranker-base | Chinese and English | Inference Fine-tune | a cross-encoder model which is more accurate but less efficient [2] | | | BAAI/bge-large-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-en-v1.5 | English | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-base-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-small-zh-v1.5 | Chinese | Inference Fine-tune | version 1.5 with more reasonable similarity distribution | | | BAAI/bge-large-en | English | Inference Fine-tune | :trophy: rank **1st** in MTEB leaderboard | | | BAAI/bge-base-en | English | Inference Fine-tune | a base-scale model but with similar ability to bge-large-en | | | BAAI/bge-small-en | English | Inference Fine-tune | a small-scale model but with competitive performance | | | BAAI/bge-large-zh | Chinese | Inference Fine-tune | :trophy: rank **1st** in C-MTEB benchmark | | | BAAI/bge-base-zh | Chinese | Inference Fine-tune | a base-scale model but with similar ability to bge-large-zh | | | BAAI/bge-small-zh | Chinese | Inference Fine-tune | a small-scale model but with competitive performance | | [1\]: If you need to search for relevant passages given a query, we suggest adding the instruction to the query; in other cases, no instruction is needed, just use the original query directly. In all cases, **no instruction** needs to be added to passages. [2\]: Different from the embedding model, the reranker uses question and document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, the cross-encoder is widely used to re-rank top-k documents retrieved by other simple models. For example, use the bge embedding model to retrieve the top 100 relevant documents, and then use the bge reranker to re-rank those 100 documents to get the final top-3 results (a sketch of this pipeline appears after FAQ 1 below). All models have been uploaded to the Huggingface Hub, and you can see them at . If you cannot open the Huggingface Hub, you can also download the models at . ## Frequently asked questions **1. How to fine-tune bge embedding model?** Follow this example to prepare data and fine-tune your model. Some suggestions: - Mine hard negatives following this example, which can improve the retrieval performance. - In general, a larger training batch size brings better performance. You can increase it by enabling fp16, DeepSpeed (df_config.json can refer to ds_config.json), gradient checkpointing, etc. - If you pre-train bge on your data, the pre-trained model cannot be used directly to calculate similarity; it must be fine-tuned with contrastive learning before computing similarity.
- If the accuracy of the fine-tuned model is still not high, it is recommended to use/fine-tune the cross-encoder model (bge-reranker) to re-rank top-k results. Hard negatives are also needed to fine-tune the reranker.
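A hedged sketch of the retrieve-then-rerank pipeline from footnote [2] above, shrunk to a toy in-memory corpus (model names are examples):

```python
import numpy as np
from FlagEmbedding import FlagModel, FlagReranker

corpus = ["BGE embeddings support dense retrieval.",
          "Cross-encoders score query-passage pairs directly.",
          "The Pile is a large English text corpus."]
query = "How do cross-encoders rerank retrieved passages?"

embedder = FlagModel("BAAI/bge-base-en-v1.5",
                     query_instruction_for_retrieval="Represent this sentence for searching relevant passages:")
doc_emb = embedder.encode(corpus)
q_emb = embedder.encode_queries([query])

# Stage 1: retrieve top-k candidates with the bi-encoder
scores = (q_emb @ doc_emb.T)[0]
top_k = np.argsort(-scores)[:2]

# Stage 2: re-rank the candidates with the more accurate cross-encoder
reranker = FlagReranker("BAAI/bge-reranker-base")
rerank_scores = reranker.compute_score([[query, corpus[i]] for i in top_k])
print(corpus[top_k[int(np.argmax(rerank_scores))]])
```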
**2. The similarity score between two dissimilar sentences is higher than 0.5.** **We suggest using bge v1.5, which alleviates the issue of the similarity distribution.** Since we fine-tune the models by contrastive learning with a temperature of 0.01, the similarity distribution of the current BGE model is roughly in the interval \[0.6, 1\]. So a similarity score greater than 0.5 does not indicate that the two sentences are similar. For downstream tasks, such as passage retrieval or semantic similarity, **what matters is the relative order of the scores, not the absolute value.** If you need to filter similar sentences based on a similarity threshold, please select an appropriate similarity threshold based on the similarity distribution on your data (such as 0.8, 0.85, or even 0.9).
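To make the threshold advice concrete, a small sketch using sentence-transformers (the 0.85 cutoff is an arbitrary example, not a recommendation):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
emb = model.encode(["The cat sits on the mat.",
                    "Stock markets fell sharply today."],
                   normalize_embeddings=True)
score = float(emb[0] @ emb[1])  # cosine similarity of normalized vectors

# BGE scores concentrate roughly in [0.6, 1]; pick the cutoff from your own data
THRESHOLD = 0.85  # example value only
print("similar" if score > THRESHOLD else "dissimilar", round(score, 3))
```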
**3. When does the query instruction need to be used?** For bge-*-v1.5, we improved its retrieval ability when not using an instruction; omitting the instruction causes only a slight degradation in retrieval performance compared with using it, so you can generate embeddings without instruction in all cases for convenience. For a retrieval task that uses short queries to find long related documents, it is recommended to add instructions to these short queries. **The best way to decide whether to add instructions to queries is to choose the setting that achieves better performance on your task.** In all cases, the documents/passages do not need the instruction.
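A sketch of the instruction convention for short-query retrieval, assuming the English bge instruction string from the model list (passages are encoded without it):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
instruction = "Represent this sentence for searching relevant passages: "

queries = ["what is dense retrieval"]
passages = ["Dense retrieval encodes queries and documents into vectors "
            "and matches them by vector similarity."]

# Prepend the instruction to short queries only; passages stay unchanged
q_emb = model.encode([instruction + q for q in queries], normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)
print(q_emb @ p_emb.T)
```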
## Usage ### Usage for Embedding Model Here are some examples of using models with FlagEmbedding, Sentence-Transformers, Langchain, or Huggingface Transformers. #### Using FlagEmbedding Install the FlagEmbedding package first (pip install -U FlagEmbedding); if it doesn't work for you, see the FlagEmbedding repository for more installation methods. For the value of the query-instruction argument, see Model List. By default, FlagModel will use all available GPUs when encoding. Please set os.environ['CUDA_VISIBLE_DEVICES'] to select specific GPUs, or set it to an empty string to make all GPUs unavailable. #### Using Sentence-Transformers You can also use the models with sentence-transformers. For an s2p (short query to long passage) retrieval task, each short query should start with an instruction (for instructions, see Model List); the instruction is not needed for passages. #### Using Langchain You can also use the models in Langchain. #### Using HuggingFace Transformers With the transformers package, you pass your input through the transformer model, then select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding (a sketch of this appears after the evaluation tables below). ### Usage for Reranker Different from the embedding model, the reranker uses question and document as input and directly outputs a similarity score instead of an embedding. You can get a relevance score by inputting a query and a passage to the reranker. The reranker is optimized with cross-entropy loss, so the relevance score is not bounded to a specific range. #### Using FlagEmbedding Get relevance scores (higher scores indicate more relevance). #### Using Huggingface transformers ## Evaluation bge models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!** For more details and evaluation tools see our scripts. - **MTEB**: | Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) | Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) | |:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | BAAI/bge-large-en-v1.5 | 1024 | 512 | **64.23** | **54.29** | 46.08 | 87.12 | 60.03 | 83.11 | 31.61 | 75.97 | | BAAI/bge-base-en-v1.5 | 768 | 512 | 63.55 | 53.25 | 45.77 | 86.55 | 58.86 | 82.4 | 31.07 | 75.53 | | BAAI/bge-small-en-v1.5 | 384 | 512 | 62.17 | 51.68 | 43.82 | 84.92 | 58.36 | 81.59 | 30.12 | 74.14 | | bge-large-en | 1024 | 512 | 63.98 | 53.9 | 46.98 | 85.8 | 59.48 | 81.56 | 32.06 | 76.21 | | bge-base-en | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 | | gte-large | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 | | gte-base | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 | | e5-large-v2 | 1024 | 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 | | bge-small-en | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 | | instructor-xl | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 | | e5-base-v2 | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 | | gte-small | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 | | text-embedding-ada-002 | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 | | e5-small-v2 | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 | | sentence-t5-xxl | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 | | all-mpnet-base-v2 | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 | | sgpt-bloom-7b1-msmarco | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 |
77.74 | 33.6 | 66.19 | - **C-MTEB**: We create the benchmark C-MTEB for Chinese text embedding which consists of 31 datasets from 6 tasks. Please refer to C_MTEB for a detailed introduction. | Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:| | **BAAI/bge-large-zh-v1.5** | 1024 | **64.53** | 70.46 | 56.25 | 81.6 | 69.13 | 65.84 | 48.99 | | BAAI/bge-base-zh-v1.5 | 768 | 63.13 | 69.49 | 53.72 | 79.75 | 68.07 | 65.39 | 47.53 | | BAAI/bge-small-zh-v1.5 | 512 | 57.82 | 61.77 | 49.11 | 70.41 | 63.96 | 60.92 | 44.18 | | BAAI/bge-large-zh | 1024 | 64.20 | 71.53 | 54.98 | 78.94 | 68.32 | 65.11 | 48.39 | | bge-large-zh-noinstruct | 1024 | 63.53 | 70.55 | 53 | 76.77 | 68.58 | 64.91 | 50.01 | | BAAI/bge-base-zh | 768 | 62.96 | 69.53 | 54.12 | 77.5 | 67.07 | 64.91 | 47.63 | | multilingual-e5-large | 1024 | 58.79 | 63.66 | 48.44 | 69.89 | 67.34 | 56.00 | 48.23 | | BAAI/bge-small-zh | 512 | 58.27 | 63.07 | 49.45 | 70.35 | 63.64 | 61.48 | 45.09 | | m3e-base | 768 | 57.10 | 56.91 | 50.47 | 63.99 | 67.52 | 59.34 | 47.68 | | m3e-large | 1024 | 57.05 | 54.75 | 50.42 | 64.3 | 68.2 | 59.66 | 48.88 | | multilingual-e5-base | 768 | 55.48 | 61.63 | 46.49 | 67.07 | 65.35 | 54.35 | 40.68 | | multilingual-e5-small | 384 | 55.38 | 59.95 | 45.27 | 66.45 | 65.85 | 53.86 | 45.26 | | text-embedding-ada-002(OpenAI) | 1536 | 53.02 | 52.0 | 43.35 | 69.56 | 64.31 | 54.28 | 45.68 | | luotuo | 1024 | 49.37 | 44.4 | 42.78 | 66.62 | 61 | 49.25 | 44.39 | | text2vec-base | 768 | 47.63 | 38.79 | 43.41 | 67.41 | 62.19 | 49.45 | 37.66 | | text2vec-large | 1024 | 47.36 | 41.94 | 44.97 | 70.86 | 60.66 | 49.16 | 30.02 | - **Reranking**: See C_MTEB for evaluation script. | Model | T2Reranking | T2RerankingZh2En\\* | T2RerankingEn2Zh\\* | MMarcoReranking | CMedQAv1 | CMedQAv2 | Avg | |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:| | text2vec-base-multilingual | 64.66 | 62.94 | 62.51 | 14.37 | 48.46 | 48.6 | 50.26 | | multilingual-e5-small | 65.62 | 60.94 | 56.41 | 29.91 | 67.26 | 66.54 | 57.78 | | multilingual-e5-large | 64.55 | 61.61 | 54.28 | 28.6 | 67.42 | 67.92 | 57.4 | | multilingual-e5-base | 64.21 | 62.13 | 54.68 | 29.5 | 66.23 | 66.98 | 57.29 | | m3e-base | 66.03 | 62.74 | 56.07 | 17.51 | 77.05 | 76.76 | 59.36 | | m3e-large | 66.13 | 62.72 | 56.1 | 16.46 | 77.76 | 78.27 | 59.57 | | bge-base-zh-v1.5 | 66.49 | 63.25 | 57.02 | 29.74 | 80.47 | 84.88 | 63.64 | | bge-large-zh-v1.5 | 65.74 | 63.39 | 57.03 | 28.74 | 83.45 | 85.44 | 63.97 | | BAAI/bge-reranker-base | 67.28 | 63.95 | 60.45 | 35.46 | 81.26 | 84.1 | 65.42 | | BAAI/bge-reranker-large | 67.6 | 64.03 | 61.44 | 37.16 | 82.15 | 84.18 | 66.09 | \\* : T2RerankingZh2En and T2RerankingEn2Zh are cross-language retrieval tasks ## Train ### BAAI Embedding We pre-train the models using retromae and train them on large-scale pair data using contrastive learning. **You can fine-tune the embedding model on your data following our examples.** We also provide a pre-train example. Note that the goal of pre-training is to reconstruct the text, and the pre-trained model cannot be used for similarity calculation directly, it needs to be fine-tuned. For more training details for bge see baai_general_embedding. 
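A minimal sketch of the CLS-pooling procedure described under "Using HuggingFace Transformers" in the Usage section above (the model name is illustrative):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-base-en-v1.5")
model.eval()

inputs = tokenizer(["BGE maps sentences to dense vectors."],
                   padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # CLS pooling: take the last hidden state of the first token
    embeddings = outputs.last_hidden_state[:, 0]
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)

print(embeddings.shape)  # e.g. (1, 768) for a base-scale model
```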
### BGE Reranker The cross-encoder performs full attention over the input pair, which is more accurate than the embedding model (i.e., bi-encoder) but more time-consuming. Therefore, it can be used to re-rank the top-k documents returned by the embedding model. We train the cross-encoder on multilingual pair data; the data format is the same as for the embedding model, so you can fine-tune it easily following our example. For more details please refer to ./FlagEmbedding/reranker/README.md ### Our Contributors: ## Contact If you have any questions or suggestions related to this project, feel free to open an issue or pull request. You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn). ## Citation If you find this repository useful, please consider giving a star :star: and citation ## License FlagEmbedding is licensed under the MIT License. The released models can be used for commercial purposes free of charge.", + "model_explanation_gemini": "\"BAAI_llm-embedder is a unified embedding model that maps text to dense vectors for retrieval-augmented tasks in LLMs, supporting retrieval, classification, clustering, and semantic search.\"\n\nFeatures: \n- Generates low-dimensional dense vectors from text \n- Designed for retrieval augmentation in LLMs \n- Supports tasks like retrieval, classification, clustering, and semantic search \n- Compatible with vector databases for LLMs \n\nComparison: \nThe model is part of the FlagEmbedding" +} \ No newline at end of file diff --git a/model_data_json/EleutherAI_pythia-1b.json b/model_data_json/EleutherAI_pythia-1b.json new file mode 100644 index 0000000000000000000000000000000000000000..c148d7955969b873bd0546d465e4e6df2992f65b --- /dev/null +++ b/model_data_json/EleutherAI_pythia-1b.json @@ -0,0 +1,25 @@ +{ + "model_id": "EleutherAI/pythia-1b", + "downloads": 83131, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "gpt_neox", + "text-generation", + "causal-lm", + "pythia", + "en", + "dataset:the_pile", + "arxiv:2304.01373", + "arxiv:2101.00027", + "arxiv:2201.07311", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en tags: - pytorch - causal-lm - pythia license: apache-2.0 datasets: - the_pile --- The *Pythia Scaling Suite* is a collection of models developed to facilitate interpretability research (see paper). It contains two sets of eight models of sizes 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, and 12B. For each size, there are two models: one trained on the Pile, and one trained on the Pile after the dataset has been globally deduplicated. All 8 model sizes are trained on the exact same data, in the exact same order. We also provide 154 intermediate checkpoints per model, hosted on Hugging Face as branches. The Pythia model suite was deliberately designed to promote scientific research on large language models, especially interpretability research. Despite not centering downstream performance as a design goal, we find the models match or exceed the performance of similar and same-sized models, such as those in the OPT and GPT-Neo suites.
Details on previous early release and naming convention. Previously, we released an early version of the Pythia suite to the public. However, we decided to retrain the model suite to address a few hyperparameter discrepancies. This model card lists the changes; see appendix B in the Pythia paper for further discussion. We found no difference in benchmark performance between the two Pythia versions. The old models are still available, but we suggest the retrained suite if you are just starting to use Pythia.
**This is the current release.** Please note that all models in the *Pythia* suite were renamed in January 2023. For clarity, a table comparing the old and new names is provided in this model card, together with exact parameter counts.

# Pythia-1B ## Model Details - Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for training procedure, config files, and details on how to use. See paper for more evals and implementation details. - Library: GPT-NeoX - License: Apache 2.0 - Contact: to ask questions about this model, join the EleutherAI Discord, and post them in . Please read the existing *Pythia* documentation before asking about it in the EleutherAI Discord. For general correspondence: contact@eleuther.ai.
| Pythia model | Non-Embedding Params | Layers | Model Dim | Heads | Batch Size | Learning Rate | Equivalent Models | | -----------: | -------------------: | :----: | :-------: | :---: | :--------: | :-------------------: | :--------------------: | | 70M | 18,915,328 | 6 | 512 | 8 | 2M | 1.0 x 10^-3 | — | | 160M | 85,056,000 | 12 | 768 | 12 | 2M | 6.0 x 10^-4 | GPT-Neo 125M, OPT-125M | | 410M | 302,311,424 | 24 | 1024 | 16 | 2M | 3.0 x 10^-4 | OPT-350M | | 1.0B | 805,736,448 | 16 | 2048 | 8 | 2M | 3.0 x 10^-4 | — | | 1.4B | 1,208,602,624 | 24 | 2048 | 16 | 2M | 2.0 x 10^-4 | GPT-Neo 1.3B, OPT-1.3B | | 2.8B | 2,517,652,480 | 32 | 2560 | 32 | 2M | 1.6 x 10^-4 | GPT-Neo 2.7B, OPT-2.7B | | 6.9B | 6,444,163,072 | 32 | 4096 | 32 | 2M | 1.2 x 10^-4 | OPT-6.7B | | 12B | 11,327,027,200 | 36 | 5120 | 40 | 2M | 1.2 x 10^-4 | — |
Engineering details for the Pythia Suite. Deduped and non-deduped models of a given size have the same hyperparameters. “Equivalent” models have exactly the same architecture, and the same number of non-embedding parameters.
## Uses and Limitations ### Intended Use The primary intended use of Pythia is research on the behavior, functionality, and limitations of large language models. This suite is intended to provide a controlled setting for performing scientific experiments. We also provide 154 checkpoints per model: initial , 10 log-spaced checkpoints , and 143 evenly-spaced checkpoints from to . These checkpoints are hosted on Hugging Face as branches. Note that branch corresponds exactly to the model checkpoint on the branch of each model. You may also further fine-tune and adapt Pythia-1B for deployment, as long as your use is in accordance with the Apache 2.0 license. Pythia models work with the Hugging Face Transformers Library. If you decide to use pre-trained Pythia-1B as a basis for your fine-tuned model, please conduct your own risk and bias assessment. ### Out-of-scope use The Pythia Suite is **not** intended for deployment. It is not in itself a product and cannot be used for human-facing interactions. For example, the model may generate harmful or offensive text. Please evaluate the risks associated with your particular use case. Pythia models are English-language only, and are not suitable for translation or generating text in other languages. Pythia-1B has not been fine-tuned for downstream contexts in which language models are commonly deployed, such as writing genre prose or commercial chatbots. This means Pythia-1B will **not** respond to a given prompt the way a product like ChatGPT does. This is because, unlike this model, ChatGPT was fine-tuned using methods such as Reinforcement Learning from Human Feedback (RLHF) to better “follow” human instructions. ### Limitations and biases The core functionality of a large language model is to take a string of text and predict the next token. The token used by the model need not produce the most “accurate” text. Never rely on Pythia-1B to produce factually accurate output. This model was trained on the Pile, a dataset known to contain profanity and texts that are lewd or otherwise offensive. See Section 6 of the Pile paper for a discussion of documented biases with regards to gender, religion, and race. Pythia-1B may produce socially unacceptable or undesirable text, *even if* the prompt itself does not include anything explicitly offensive. If you plan on using text generated through, for example, the Hosted Inference API, we recommend having a human curate the outputs of this language model before presenting it to other people. Please inform your audience that the text was generated by Pythia-1B. ### Quickstart Pythia models can be loaded and used via the following code, demonstrated here for the third checkpoint: Revision/branch corresponds exactly to the model checkpoint on the branch of each model.
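A minimal sketch of that quickstart, assuming the standard GPT-NeoX classes in transformers and the step-numbered revision branches described above:

```python
from transformers import GPTNeoXForCausalLM, AutoTokenizer

# "step3000" is one of the step-numbered checkpoint branches described above.
model = GPTNeoXForCausalLM.from_pretrained(
    "EleutherAI/pythia-1b",
    revision="step3000",
)
tokenizer = AutoTokenizer.from_pretrained(
    "EleutherAI/pythia-1b",
    revision="step3000",
)

inputs = tokenizer("Hello, I am", return_tensors="pt")
tokens = model.generate(**inputs)
print(tokenizer.decode(tokens[0]))
```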
For more information on how to use all Pythia models, see documentation on GitHub. ## Training ### Training data The Pile is a 825GiB general-purpose dataset in English. It was created by EleutherAI specifically for training large language models. It contains texts from 22 diverse sources, roughly broken down into five categories: academic writing (e.g. arXiv), internet (e.g. CommonCrawl), prose (e.g. Project Gutenberg), dialogue (e.g. YouTube subtitles), and miscellaneous (e.g. GitHub, Enron Emails). See the Pile paper for a breakdown of all data sources, methodology, and a discussion of ethical implications. Consult the datasheet for more detailed documentation about the Pile and its component datasets. The Pile can be downloaded from the official website, or from a community mirror.
The Pile was **not** deduplicated before being used to train Pythia-1B. ### Training procedure All models were trained on the exact same data, in the exact same order. Each model saw 299,892,736,000 tokens during training, and 143 checkpoints for each model are saved every 2,097,152,000 tokens, spaced evenly throughout training, from to (which is the same as ). In addition, we also provide frequent early checkpoints: and . This corresponds to training for just under 1 epoch on the Pile for non-deduplicated models, and about 1.5 epochs on the deduplicated Pile. All *Pythia* models trained for 143000 steps at a batch size of 2M (2,097,152 tokens).
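A quick arithmetic check shows these figures are consistent:

```python
# 143,000 steps at 2,097,152 tokens per step:
print(143_000 * 2_097_152)   # 299892736000 tokens seen during training
# 143 evenly spaced checkpoints, one every 2,097,152,000 tokens:
print(143 * 2_097_152_000)   # 299892736000 -- the same total
```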
See GitHub for more details on training procedure, including how to reproduce it.
Pythia uses the same tokenizer as GPT-NeoX-20B. ## Evaluations All 16 *Pythia* models were evaluated using the LM Evaluation Harness. You can access the results by model and step in the GitHub repository.
Plots of evaluation results for all Pythia and Pythia-deduped models, compared with OPT and BLOOM, cover the following benchmarks: LAMBADA (OpenAI), Physical Interaction: Question Answering (PIQA), WinoGrande, AI2 Reasoning Challenge (Easy Set), and SciQ.
## Changelog This section compares differences between the previously released Pythia v0 and the current models. See Appendix B of the Pythia paper for further discussion of these changes and the motivation behind them. We found that retraining Pythia had no impact on benchmark performance. - All model sizes are now trained with a uniform batch size of 2M tokens. Previously, the models of size 160M, 410M, and 1.4B parameters were trained with batch sizes of 4M tokens. - We added checkpoints at initialization (step 0) and steps {1, 2, 4, 8, 16, 32, 64, 128, 256, 512} in addition to every 1000 training steps. - Flash Attention was used in the new retrained suite. - We remedied a minor inconsistency that existed in the original suite: all models of size 2.8B parameters or smaller had a learning rate (LR) schedule which decayed to a minimum LR of 10% of the starting LR, but the 6.9B and 12B models used an LR schedule which decayed to a minimum LR of 0. In the redone training runs, we rectified this inconsistency: all models were trained with the LR decaying to a minimum of 0.1× their maximum LR. ### Naming convention and parameter count *Pythia* models were renamed in January 2023. It is possible that the old naming convention still persists in some documentation by accident. The current naming convention (70M, 160M, etc.) is based on total parameter count.
| current Pythia suffix | old suffix | total params | non-embedding params | | --------------------: | ---------: | -------------: | -------------------: | | 70M | 19M | 70,426,624 | 18,915,328 | | 160M | 125M | 162,322,944 | 85,056,000 | | 410M | 350M | 405,334,016 | 302,311,424 | | 1B | 800M | 1,011,781,632 | 805,736,448 | | 1.4B | 1.3B | 1,414,647,808 | 1,208,602,624 | | 2.8B | 2.7B | 2,775,208,960 | 2,517,652,480 | | 6.9B | 6.7B | 6,857,302,016 | 6,444,163,072 | | 12B | 13B | 11,846,072,320 | 11,327,027,200 |
", + "model_explanation_gemini": "A 1-billion-parameter English language model designed for interpretability research, trained on the Pile dataset with reproducible checkpoints, matching or exceeding similar-sized models like OPT and GPT-Neo.\n\n**Features:** \n- Transformer-based causal language model \n- Part of the Pythia Scaling Suite (1B parameters) \n- Trained on the Pile dataset (with deduplicated variant available) \n- 154 intermediate checkpoints provided \n- Matches/exceeds performance of comparable models" +} \ No newline at end of file diff --git a/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json b/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json new file mode 100644 index 0000000000000000000000000000000000000000..d8cf3390ca286ae01bc2ac977564147fbb209e16 --- /dev/null +++ b/model_data_json/FacebookAI_xlm-roberta-large-finetuned-conll03-english.json @@ -0,0 +1,115 @@ +{ + "model_id": "FacebookAI/xlm-roberta-large-finetuned-conll03-english", + "downloads": 79650, + "tags": [ + "transformers", + "pytorch", + "rust", + "onnx", + "safetensors", + "xlm-roberta", + "token-classification", + "multilingual", + "af", + "am", + "ar", + "as", + "az", + "be", + "bg", + "bn", + "br", + "bs", + "ca", + "cs", + "cy", + "da", + "de", + "el", + "en", + "eo", + "es", + "et", + "eu", + "fa", + "fi", + "fr", + "fy", + "ga", + "gd", + "gl", + "gu", + "ha", + "he", + "hi", + "hr", + "hu", + "hy", + "id", + "is", + "it", + "ja", + "jv", + "ka", + "kk", + "km", + "kn", + "ko", + "ku", + "ky", + "la", + "lo", + "lt", + "lv", + "mg", + "mk", + "ml", + "mn", + "mr", + "ms", + "my", + "ne", + "nl", + "no", + "om", + "or", + "pa", + "pl", + "ps", + "pt", + "ro", + "ru", + "sa", + "sd", + "si", + "sk", + "sl", + "so", + "sq", + "sr", + "su", + "sv", + "sw", + "ta", + "te", + "th", + "tl", + "tr", + "ug", + "uk", + "ur", + "uz", + "vi", + "xh", + "yi", + "zh", + "arxiv:1911.02116", + "arxiv:2008.03415", + "arxiv:1910.09700", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - multilingual - af - am - ar - as - az - be - bg - bn - br - bs - ca - cs - cy - da - de - el - en - eo - es - et - eu - fa - fi - fr - fy - ga - gd - gl - gu - ha - he - hi - hr - hu - hy - id - is - it - ja - jv - ka - kk - km - kn - ko - ku - ky - la - lo - lt - lv - mg - mk - ml - mn - mr - ms - my - ne - nl - no - om - or - pa - pl - ps - pt - ro - ru - sa - sd - si - sk - sl - so - sq - sr - su - sv - sw - ta - te - th - tl - tr - ug - uk - ur - uz - vi - xh - yi - zh --- # xlm-roberta-large-finetuned-conll03-english # Table of Contents 1. Model Details 2. Uses 3. Bias, Risks, and Limitations 4. Training 5. Evaluation 6. Environmental Impact 7. Technical Specifications 8. Citation 9. Model Card Authors 10. How To Get Started With the Model # Model Details ## Model Description The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data. This model is XLM-RoBERTa-large fine-tuned with the conll2003 dataset in English. 
- **Developed by:** See associated paper - **Model type:** Multi-lingual language model - **Language(s) (NLP) or Countries (images):** XLM-RoBERTa is a multilingual model trained on 100 different languages; see GitHub Repo for full list; model is fine-tuned on a dataset in English - **License:** More information needed - **Related Models:** RoBERTa, XLM - **Parent Model:** XLM-RoBERTa-large - **Resources for more information:** -GitHub Repo -Associated Paper # Uses ## Direct Use The model is a language model. The model can be used for token classification, a natural language understanding task in which a label is assigned to some tokens in a text. ## Downstream Use Potential downstream use cases include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. To learn more about token classification and other potential downstream use cases, see the Hugging Face token classification docs. ## Out-of-Scope Use The model should not be used to intentionally create hostile or alienating environments for people. # Bias, Risks, and Limitations **CONTENT WARNING: Readers should be made aware that language generated by this model may be disturbing or offensive to some and may propagate historical and current stereotypes.** Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). In the context of tasks relevant to this model, Mishra et al. (2020) explore social biases in NER systems for English and find that there is systematic bias in existing NER systems in that they fail to identify named entities from different demographic groups (though this paper did not look at BERT). For example, using a sample sentence from Mishra et al. (2020): ## Recommendations Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. # Training See the following resources for training data and training procedure details: - XLM-RoBERTa-large model card - CoNLL-2003 data card - Associated paper # Evaluation See the associated paper for evaluation details. # Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - **Hardware Type:** 500 32GB Nvidia V100 GPUs (from the associated paper) - **Hours used:** More information needed - **Cloud Provider:** More information needed - **Compute Region:** More information needed - **Carbon Emitted:** More information needed # Technical Specifications See the associated paper for further details. # Citation **BibTeX:** **APA:** - Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., ... & Stoyanov, V. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116. # Model Card Authors This model card was written by the team at Hugging Face. # How to Get Started with the Model Use the code below to get started with the model. You can use this model directly within a pipeline for NER.
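A minimal sketch of such a pipeline (the example sentence is illustrative):

```python
from transformers import pipeline

ner = pipeline(
    "ner",
    model="FacebookAI/xlm-roberta-large-finetuned-conll03-english",
    aggregation_strategy="simple",  # merge sub-word tokens into whole entities
)
print(ner("My name is Wolfgang and I live in Berlin."))
```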
", + "model_explanation_gemini": "\"Finetuned for English token classification tasks like Named Entity Recognition (NER) using the CoNLL-2003 dataset, based on a multilingual XLM-RoBERTa-large model.\"\n\nFeatures: \n- Multilingual base (XLM-RoBERTa-large) fine-tuned specifically for English \n- Task: Token classification (e.g., NER, POS tagging) \n- Trained on CoNLL-2003 dataset \n\nComparison: \nUnlike generic multilingual models, this variant" +} \ No newline at end of file diff --git a/model_data_json/Helsinki-NLP_opus-mt-en-da.json b/model_data_json/Helsinki-NLP_opus-mt-en-da.json new file mode 100644 index 0000000000000000000000000000000000000000..2314a4de47cb5f827dc946d20974e3bcdceebf5e --- /dev/null +++ b/model_data_json/Helsinki-NLP_opus-mt-en-da.json @@ -0,0 +1,20 @@ +{ + "model_id": "Helsinki-NLP/opus-mt-en-da", + "downloads": 81331, + "tags": [ + "transformers", + "pytorch", + "tf", + "marian", + "text2text-generation", + "translation", + "en", + "da", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - translation license: apache-2.0 --- ### opus-mt-en-da * source languages: en * target languages: da * OPUS readme: en-da * dataset: opus * model: transformer-align * pre-processing: normalization + SentencePiece * download original weights: opus-2019-12-18.zip * test set translations: opus-2019-12-18.test.txt * test set scores: opus-2019-12-18.eval.txt ## Benchmarks | testset | BLEU | chr-F | |-----------------------|-------|-------| | Tatoeba.en.da | 60.4 | 0.745 |", + "model_explanation_gemini": "Translates English (en) to Danish (da) using a transformer-align model with SentencePiece pre-processing. \n\nFeatures: \n- Source language: English (en) \n- Target language: Danish (da) \n- Architecture: transformer-align \n- Pre-processing: normalization + SentencePiece \n- Training dataset: OPUS \n- Benchmarked on Tatoeba with BLEU (60.4) and chr-F (0.745) scores \n\nComparison: \nNo explicit comparison with other" +} \ No newline at end of file diff --git a/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json b/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..4b550ee9de73802a8de98661d11393989293c603 --- /dev/null +++ b/model_data_json/HuggingFaceTB_SmolLM2-1.7B-Instruct.json @@ -0,0 +1,25 @@ +{ + "model_id": "HuggingFaceTB/SmolLM2-1.7B-Instruct", + "downloads": 80704, + "tags": [ + "transformers", + "tensorboard", + "onnx", + "safetensors", + "llama", + "text-generation", + "transformers.js", + "conversational", + "en", + "arxiv:2502.02737", + "base_model:HuggingFaceTB/SmolLM2-1.7B", + "base_model:quantized:HuggingFaceTB/SmolLM2-1.7B", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 language: - en pipeline_tag: text-generation tags: - safetensors - onnx - transformers.js base_model: - HuggingFaceTB/SmolLM2-1.7B --- # SmolLM2 !image/png ## Table of Contents 1. Model Summary 2. Evaluation 3. Examples 4. Limitations 5. Training 6. License 7. Citation ## Model Summary SmolLM2 is a family of compact language models available in three size: 135M, 360M, and 1.7B parameters. They are capable of solving a wide range of tasks while being lightweight enough to run on-device. 
More details in our paper: The 1.7B variant demonstrates significant advances over its predecessor SmolLM1-1.7B, particularly in instruction following, knowledge, reasoning, and mathematics. It was trained on 11 trillion tokens using a diverse dataset combination: FineWeb-Edu, DCLM, The Stack, along with new mathematics and coding datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) using a combination of public datasets and our own curated datasets. We then applied Direct Preference Optimization (DPO) using UltraFeedback. The instruct model additionally supports tasks such as text rewriting, summarization and function calling thanks to datasets developed by Argilla such as Synth-APIGen-v0.1. You can find the SFT dataset here: For more details refer to: You will find pre-training, post-training, evaluation and local inference code. ### How to use #### Transformers #### Chat in TRL You can also use the TRL CLI to chat with the model from the terminal: #### Transformers.js ## Evaluation In this section, we report the evaluation results of SmolLM2. All evaluations are zero-shot unless stated otherwise, and we use lighteval to run them. ## Base Pre-Trained Model | Metric | SmolLM2-1.7B | Llama-1B | Qwen2.5-1.5B | SmolLM1-1.7B | |------------------|--------------|-------------|---------------|--------------| | HellaSwag | **68.7** | 61.2 | 66.4 | 62.9 | | ARC (Average) | **60.5** | 49.2 | 58.5 | 59.9 | | PIQA | **77.6** | 74.8 | 76.1 | 76.0 | | MMLU-Pro (MCF) | **19.4** | 11.7 | 13.7 | 10.8 | | CommonsenseQA | **43.6** | 41.2 | 34.1 | 38.0 | | TriviaQA | **36.7** | 28.1 | 20.9 | 22.5 | | Winogrande | **59.4** | 57.8 | 59.3 | 54.7 | | OpenBookQA | 42.2 | 38.4 | 40.0 | **42.4** | | GSM8K (5-shot) | 31.0 | 7.2 | **61.3** | 5.5 | ## Instruction Model | Metric | SmolLM2-1.7B-Instruct | Llama-1B-Instruct | Qwen2.5-1.5B-Instruct | SmolLM1-1.7B-Instruct | |:-----------------------------|:---------------------:|:-----------------:|:----------------------:|:----------------------:| | IFEval (Average prompt/inst) | **56.7** | 53.5 | 47.4 | 23.1 | | MT-Bench | 6.13 | 5.48 | **6.52** | 4.33 | | OpenRewrite-Eval (micro_avg RougeL) | 44.9 | 39.2 | **46.9** | NaN | | HellaSwag | **66.1** | 56.1 | 60.9 | 55.5 | | ARC (Average) | **51.7** | 41.6 | 46.2 | 43.7 | | PIQA | **74.4** | 72.3 | 73.2 | 71.6 | | MMLU-Pro (MCF) | 19.3 | 12.7 | **24.2** | 11.7 | | BBH (3-shot) | 32.2 | 27.6 | **35.3** | 25.7 | | GSM8K (5-shot) | **48.2** | 26.8 | 42.8 | 4.62 | ## Examples Below are some system and instruct prompts that work well for special tasks ### Text rewriting ### Summarization ### Function calling SmolLM2-1.7B-Instruct can handle function calling, it scores 27% on the BFCL Leaderboard. Here's how you can leverage it: More details such as parallel function calls and tools not available can be found here ## Limitations SmolLM2 models primarily understand and generate content in English. They can produce text on a variety of topics, but the generated content may not always be factually accurate, logically consistent, or free from biases present in the training data. These models should be used as assistive tools rather than definitive sources of information. Users should always verify important information and critically evaluate any generated content. 
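As a minimal sketch of the Transformers usage referenced in the "How to use" section above (assuming the standard chat-template API):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "What is the capital of France?"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=50, temperature=0.2, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0]))
```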
## Training ### Model - **Architecture:** Transformer decoder - **Pretraining tokens:** 11T - **Precision:** bfloat16 ### Hardware - **GPUs:** 256 H100 ### Software - **Training Framework:** nanotron - **Alignment Handbook** alignment-handbook ## License Apache 2.0 ## Citation", + "model_explanation_gemini": "A compact 1.7B-parameter instruction-tuned language model optimized for on-device use, excelling in tasks like text rewriting, summarization, function calling, and reasoning while outperforming similar-sized models like Llama-1B and SmolLM1-1.7B in instruction following and knowledge tasks. \n\n**Features:** \n- Lightweight design for on-device deployment \n- Instruction fine-tuning (SFT + DPO) for improved task performance \n- Supports text" +} \ No newline at end of file diff --git a/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json b/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..6a0a437bac0147e5794c2ad1a78a0f611f11fe21 --- /dev/null +++ b/model_data_json/HuggingFaceTB_SmolVLM-Instruct.json @@ -0,0 +1,23 @@ +{ + "model_id": "HuggingFaceTB/SmolVLM-Instruct", + "downloads": 72887, + "tags": [ + "transformers", + "onnx", + "safetensors", + "idefics3", + "image-text-to-text", + "conversational", + "en", + "dataset:HuggingFaceM4/the_cauldron", + "dataset:HuggingFaceM4/Docmatix", + "arxiv:2504.05299", + "base_model:HuggingFaceTB/SmolLM2-1.7B-Instruct", + "base_model:quantized:HuggingFaceTB/SmolLM2-1.7B-Instruct", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 datasets: - HuggingFaceM4/the_cauldron - HuggingFaceM4/Docmatix pipeline_tag: image-text-to-text language: - en base_model: - HuggingFaceTB/SmolLM2-1.7B-Instruct - google/siglip-so400m-patch14-384 --- \"Image # SmolVLM SmolVLM is a compact open multimodal model that accepts arbitrary sequences of image and text inputs to produce text outputs. Designed for efficiency, SmolVLM can answer questions about images, describe visual content, create stories grounded on multiple images, or function as a pure language model without visual inputs. Its lightweight architecture makes it suitable for on-device applications while maintaining strong performance on multimodal tasks. ## Model Summary - **Developed by:** Hugging Face 🤗 - **Model type:** Multi-modal model (image+text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) ## Resources - **Demo:** SmolVLM Demo - **Blog:** Blog post ## Uses SmolVLM can be used for inference on multimodal (image + text) tasks where the input comprises text queries along with one or more images. Text and images can be interleaved arbitrarily, enabling tasks like image captioning, visual question answering, and storytelling based on visual content. The model does not support image generation. To fine-tune SmolVLM on a specific task, you can follow the fine-tuning tutorial. ### Technical Summary SmolVLM leverages the lightweight SmolLM2 language model to provide a compact yet powerful multimodal experience. It introduces several changes compared to previous Idefics models: - **Image compression:** We introduce a more radical image compression compared to Idefics3 to enable the model to infer faster and use less RAM. - **Visual Token Encoding:** SmolVLM uses 81 visual tokens to encode image patches of size 384×384. 
Larger images are divided into patches, each encoded separately, enhancing efficiency without compromising performance. More details about the training and architecture are available in our technical report. ### How to get started You can use transformers to load, infer and fine-tune SmolVLM. ### Model optimizations **Precision**: For better performance, load and run the model in half-precision ( or ) if your hardware supports it. You can also load SmolVLM with 4/8-bit quantization using bitsandbytes, torchao or Quanto. Refer to this page for other options. **Vision Encoder Efficiency**: Adjust the image resolution by setting when initializing the processor, where N is your desired value. The default works well, which results in input images of size 1536×1536. For documents, might be beneficial. Decreasing N can save GPU memory and is appropriate for lower-resolution images. This is also useful if you want to fine-tune on videos. ## Misuse and Out-of-scope Use SmolVLM is not intended for high-stakes scenarios or critical decision-making processes that affect an individual's well-being or livelihood. The model may produce content that appears factual but may not be accurate. Misuse includes, but is not limited to: - Prohibited Uses: - Evaluating or scoring individuals (e.g., in employment, education, credit) - Critical automated decision-making - Generating unreliable factual content - Malicious Activities: - Spam generation - Disinformation campaigns - Harassment or abuse - Unauthorized surveillance ### License SmolVLM is built upon the shape-optimized SigLIP as image encoder and SmolLM2 for text decoder part. We release the SmolVLM checkpoints under the Apache 2.0 license. ## Training Details ### Training Data The training data comes from The Cauldron and Docmatix datasets, with emphasis on document understanding (25%) and image captioning (18%), while maintaining balanced coverage across other crucial capabilities like visual reasoning, chart comprehension, and general instruction following. \"Example ## Evaluation | Model | MMMU (val) | MathVista (testmini) | MMStar (val) | DocVQA (test) | TextVQA (val) | Min GPU RAM required (GB) | |-------------------|------------|----------------------|--------------|---------------|---------------|---------------------------| | SmolVLM | 38.8 | 44.6 | 42.1 | 81.6 | 72.7 | 5.02 | | Qwen-VL 2B | 41.1 | 47.8 | 47.5 | 90.1 | 79.7 | 13.70 | | InternVL2 2B | 34.3 | 46.3 | 49.8 | 86.9 | 73.4 | 10.52 | | PaliGemma 3B 448px| 34.9 | 28.7 | 48.3 | 32.2 | 56.0 | 6.72 | | moondream2 | 32.4 | 24.3 | 40.3 | 70.5 | 65.2 | 3.87 | | MiniCPM-V-2 | 38.2 | 39.8 | 39.1 | 71.9 | 74.1 | 7.88 | | MM1.5 1B | 35.8 | 37.2 | 0.0 | 81.0 | 72.5 | NaN | # Citation information You can cite us in the following way:", + "model_explanation_gemini": "\"SmolVLM is a compact multimodal model that processes interleaved image and text inputs to generate text outputs, excelling in tasks like visual question answering, image captioning, and storytelling while optimized for efficiency and on-device use.\"\n\n**Model Features:** \n1. Multimodal (image + text) input with arbitrary interleaving \n2. Text-only capability (functions as language model) \n3. Lightweight architecture for on-device applications \n4. 
81 visual tokens" +} \ No newline at end of file diff --git a/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json b/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..bdcea01e30a4fc3d80f485a481b1c2c8120ae146 --- /dev/null +++ b/model_data_json/HuggingFaceTB_SmolVLM2-2.2B-Instruct.json @@ -0,0 +1,33 @@ +{ + "model_id": "HuggingFaceTB/SmolVLM2-2.2B-Instruct", + "downloads": 74571, + "tags": [ + "transformers", + "safetensors", + "smolvlm", + "image-text-to-text", + "video-text-to-text", + "conversational", + "en", + "dataset:HuggingFaceM4/the_cauldron", + "dataset:HuggingFaceM4/Docmatix", + "dataset:lmms-lab/LLaVA-OneVision-Data", + "dataset:lmms-lab/M4-Instruct-Data", + "dataset:HuggingFaceFV/finevideo", + "dataset:MAmmoTH-VL/MAmmoTH-VL-Instruct-12M", + "dataset:lmms-lab/LLaVA-Video-178K", + "dataset:orrzohar/Video-STaR", + "dataset:Mutonix/Vript", + "dataset:TIGER-Lab/VISTA-400K", + "dataset:Enxin/MovieChat-1K_train", + "dataset:ShareGPT4Video/ShareGPT4Video", + "arxiv:2504.05299", + "base_model:HuggingFaceTB/SmolVLM-Instruct", + "base_model:finetune:HuggingFaceTB/SmolVLM-Instruct", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 datasets: - HuggingFaceM4/the_cauldron - HuggingFaceM4/Docmatix - lmms-lab/LLaVA-OneVision-Data - lmms-lab/M4-Instruct-Data - HuggingFaceFV/finevideo - MAmmoTH-VL/MAmmoTH-VL-Instruct-12M - lmms-lab/LLaVA-Video-178K - orrzohar/Video-STaR - Mutonix/Vript - TIGER-Lab/VISTA-400K - Enxin/MovieChat-1K_train - ShareGPT4Video/ShareGPT4Video pipeline_tag: image-text-to-text tags: - video-text-to-text language: - en base_model: - HuggingFaceTB/SmolVLM-Instruct --- \"Image # SmolVLM2 2.2B SmolVLM2-2.2B is a lightweight multimodal model designed to analyze video content. The model processes videos, images, and text inputs to generate text outputs - whether answering questions about media files, comparing visual content, or transcribing text from images. Despite its compact size, requiring only 5.2GB of GPU RAM for video inference, it delivers robust performance on complex multimodal tasks. This efficiency makes it particularly well-suited for on-device applications where computational resources may be limited. ## Model Summary - **Developed by:** Hugging Face 🤗 - **Model type:** Multi-modal model (image/multi-image/video/text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) ## Resources - **Demo:** Video Highlight Generator - **Blog:** Blog post ## Uses SmolVLM2 can be used for inference on multimodal (video / image / text) tasks where the input consists of text queries along with video or one or more images. Text and media files can be interleaved arbitrarily, enabling tasks like captioning, visual question answering, and storytelling based on visual content. The model does not support image or video generation. To fine-tune SmolVLM2 on a specific task, you can follow the fine-tuning tutorial. 
## Evaluation ### Vision Evaluation | Model | Mathvista | MMMU | OCRBench | MMStar | AI2D | ChartQA_Test | Science_QA | TextVQA Val | DocVQA Val | |-------------------|-----------|-------|----------|--------|------|--------------|------------|-------------|------------| | **SmolVLM2 2.2B** | 51.5 | 42 | 72.9 | 46 | 70 | 68.84 | 90 | 73.21 | 79.98 | | SmolVLM 2.2B | 43.9 | 38.3 | 65.5 | 41.8 | 84.5 | 71.6 | 84.5 | 72.1 | 79.7 | ### Video Evaluation We evaluated the performance of the SmolVLM2 family on the following scientific benchmarks: | Size | Video-MME | MLVU | MVBench | |----------|-----------------|----------|---------------| | 2.2B | 52.1 | 55.2 | 46.27 | | 500M | 42.2 | 47.3 | 39.73 | | 256M | 33.7 | 40.6 | 32.7 | ### How to get started You can use transformers to load, infer and fine-tune SmolVLM. Make sure you have num2words, flash-attn and latest transformers installed. You can load the model as follows. #### Simple Inference You preprocess your inputs directly using chat templates and directly passing them #### Video Inference To use SmolVLM2 for video inference, make sure you have decord installed. #### Multi-image Interleaved Inference You can interleave multiple media with text using chat templates. ### Model optimizations ## Misuse and Out-of-scope Use SmolVLM is not intended for high-stakes scenarios or critical decision-making processes that affect an individual's well-being or livelihood. The model may produce content that appears factual but may not be accurate. Misuse includes, but is not limited to: - Prohibited Uses: - Evaluating or scoring individuals (e.g., in employment, education, credit) - Critical automated decision-making - Generating unreliable factual content - Malicious Activities: - Spam generation - Disinformation campaigns - Harassment or abuse - Unauthorized surveillance ### License SmolVLM2 is built upon the shape-optimized SigLIP as image encoder and SmolLM2 for text decoder part. We release the SmolVLM2 checkpoints under the Apache 2.0 license. ## Citation information You can cite us in the following way: ## Training Data SmolVLM2 used 3.3M samples for training originally from ten different datasets: LlaVa Onevision, M4-Instruct, Mammoth, LlaVa Video 178K, FineVideo, VideoStar, VRipt, Vista-400K, MovieChat and ShareGPT4Video. In the following plots we give a general overview of the samples across modalities and the source of those samples. 
## Data Split per modality | Data Type | Percentage | |--------------|------------| | Image | 34.4% | | Text | 20.2% | | Video | 33.0% | | Multi-image | 12.3% | ## Granular dataset slices per modality ### Text Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-onevision/magpie_pro_ft3_80b_mt | 6.8% | | llava-onevision/magpie_pro_ft3_80b_tt | 6.8% | | llava-onevision/magpie_pro_qwen2_72b_tt | 5.8% | | llava-onevision/mathqa | 0.9% | ### Multi-image Datasets | Dataset | Percentage | |--------------------------------------------|------------| | m4-instruct-data/m4_instruct_multiimage | 10.4% | | mammoth/multiimage-cap6 | 1.9% | ### Image Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-onevision/other | 17.4% | | llava-onevision/vision_flan | 3.9% | | llava-onevision/mavis_math_metagen | 2.6% | | llava-onevision/mavis_math_rule_geo | 2.5% | | llava-onevision/sharegpt4o | 1.7% | | llava-onevision/sharegpt4v_coco | 1.5% | | llava-onevision/image_textualization | 1.3% | | llava-onevision/sharegpt4v_llava | 0.9% | | llava-onevision/mapqa | 0.9% | | llava-onevision/qa | 0.8% | | llava-onevision/textocr | 0.8% | ### Video Datasets | Dataset | Percentage | |--------------------------------------------|------------| | llava-video-178k/1-2m | 7.3% | | llava-video-178k/2-3m | 7.0% | | other-video/combined | 5.7% | | llava-video-178k/hound | 4.4% | | llava-video-178k/0-30s | 2.4% | | video-star/starb | 2.2% | | vista-400k/combined | 2.2% | | vript/long | 1.0% | | ShareGPT4Video/all | 0.8% |", + "model_explanation_gemini": "A lightweight multimodal model analyzing video, image, and text inputs to generate text outputs for tasks like visual question answering, captioning, and transcription, optimized for efficiency in resource-limited environments. \n\n**Features:** \n- Multimodal (video/image/text input, text output) \n- Tasks: captioning, visual QA, storytelling, text transcription \n- Compact size (5.2GB GPU RAM for video inference) \n- No image/video generation capability \n- Apache 2.0 license" +} \ No newline at end of file diff --git a/model_data_json/KBLab_kb-whisper-small.json b/model_data_json/KBLab_kb-whisper-small.json new file mode 100644 index 0000000000000000000000000000000000000000..c47043b3851c7ee8dd3fcf6ce826ffc228b57959 --- /dev/null +++ b/model_data_json/KBLab_kb-whisper-small.json @@ -0,0 +1,21 @@ +{ + "model_id": "KBLab/kb-whisper-small", + "downloads": 28474, + "tags": [ + "transformers", + "onnx", + "safetensors", + "whisper", + "automatic-speech-recognition", + "ctranslate2", + "sv", + "dataset:KBLab/rixvox-v2", + "base_model:openai/whisper-small", + "base_model:quantized:openai/whisper-small", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers base_model: openai/whisper-small language: - sv pipeline_tag: automatic-speech-recognition license: apache-2.0 datasets: - KBLab/rixvox-v2 tags: - ctranslate2 --- ## KB-Whisper Small The National Library of Sweden releases a new suite of Whisper models trained on over 50,000 hours of Swedish speech. In evaluations across FLEURS, CommonVoice and NST, our best performing model reduces the Word Error Rate (WER) by an average of 47% compared to OpenAI's . The performance of smaller Whisper model sizes on Swedish speech has also substantially improved, with outperforming (a model six times its size). 
| Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | **KBLab** | **13.2** | **12.9** | **11.2** | | | OpenAI | 59.2 | 67.8 | 85.2 | | base | **KBLab** | **9.1** | **8.7** | **7.8** | | | OpenAI | 39.6 | 52.1 | 53.4 | | small | **KBLab** | **7.3** | **6.4** | **6.6** | | | OpenAI | 20.6 | 26.4 | 26.4 | | medium | **KBLab** | **6.6** | **5.4** | **5.8** | | | OpenAI | 12.1 | 15.8 | 17.1 | | large-v3 | **KBLab** | **5.4** | **4.1** | **5.2** | | | OpenAI | 7.8 | 9.5 | 11.3 | Table: **Word Error Rate (WER)** comparison between KBLab's Whisper models and the corresponding OpenAI versions. ### Usage We provide checkpoints in different formats: , (GGML), , and (used in and ). #### Hugging Face Inference example for using with Hugging Face: #### Faster-whisper Faster-whisper provides fast and efficient inference via a reimplementation of Whisper using . #### WhisperX WhisperX provides a convenient method of getting accurate word level timestamps. The library combines (force aligns) the text output of Whisper with the accurate timestamps of Wav2vec2. We provide an example below of how to use together with KBLab/wav2vec2-large-voxrex-swedish. #### Whisper.cpp / GGML We provide GGML checkpoints used in the apps and . To use our model with first clone the repository and build the library: To use the model you need to download one of the GGML checkpoints we have uploaded. You can either press the download buttons here, or download using : Run inference by specifying the model path after the argument , along with the path to the audio file as the last positional argument. #### onnx (optimum) and transformers.js usage You can use the checkpoints via Hugging Face's library in the following manner: An example of an app that runs inference locally in the browser with and can be found at (created by Pierre Mesure). A template for setting up such an app with javascript can be found at ### Training data Our models have been trained on over 50,000 hours of Swedish audio with text transcriptions. The models were trained in 2 stages, each characterized by the application of different quality filters and thresholds for said filters. Stage 1 employed low threshold values (0 to 0.30 BLEU depending on dataset), whereas Stage 2 used stricter thresholds (, weighted ROUGE-N , CER of first and last 10 characters ). | Dataset | Continued pretraining (h) -- Stage 1 | Finetuning (h) -- Stage 2 | |-------------|--------------------------|--------------| | Subtitles | 34,261 | 3,110 | | Riksdag | 21,949 | 5,119 | | ISOF | 54 | 54 | | NST | 250 | 250 | | **Total** | **56,514** | **8,533** | The default when loading our models through Hugging Face is **Stage 2**. We have however also uploaded continued pretraining checkpoints and tagged them. You can load these other checkpoints by specifying the in . The pretrained checkpoints tag can for example be found here: []( The Stage 2 default model tag is named . We supply a different stage 2 checkpoint -- with a more condensed style of transcribing -- under the name . 
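As a minimal sketch of the Hugging Face inference mentioned in the usage section above (the audio path is a placeholder):

```python
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
asr = pipeline(
    "automatic-speech-recognition",
    model="KBLab/kb-whisper-small",
    device=device,
)

# "audio.wav" stands in for a local Swedish speech recording.
result = asr(
    "audio.wav",
    chunk_length_s=30,
    generate_kwargs={"task": "transcribe", "language": "sv"},
)
print(result["text"])
```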
### Evaluation #### WER | Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | **KBLab** | **13.2** | **12.9** | **11.2** | | | OpenAI | 59.2 | 67.8 | 85.2 | | base | **KBLab** | **9.1** | **8.7** | **7.8** | | | OpenAI | 39.6 | 52.1 | 53.4 | | small | **KBLab** | **7.3** | **6.4** | **6.6** | | | OpenAI | 20.6 | 26.4 | 26.4 | | medium | **KBLab** | **6.6** | **5.4** | **5.8** | | | OpenAI | 12.1 | 15.8 | 17.1 | | large-v3 | **KBLab** | **5.4** | **4.1** | **5.2** | | | OpenAI | 7.8 | 9.5 | 11.3 | #### BLEU Score | Model size | | FLEURS | CommonVoice | NST | |------------|---------|--------|-------------|------| | tiny | KBLab | **76.6** | **73.7** | **74.3** | | | OpenAI | 26.9 | 21.1 | 24.0 | | base | KBLab | **83.2** | **79.9** | **78.3** | | | OpenAI | 41.1 | 32.5 | 36.9 | | small | KBLab | **86.6** | **83.5** | **79.6** | | | OpenAI | 64.0 | 56.5 | 58.2 | | medium | KBLab | **87.6** | **85.0** | **80.2** | | | OpenAI | 77.1 | 70.1 | 68.9 | | large-v3 | KBLab | **89.8** | **87.2** | **81.1** | | | OpenAI | 84.9 | 79.1 | 75.1 | ### Acknowledgements We acknowledge the EuroHPC Joint Undertaking for awarding this project access to the EuroHPC supercomputer LEONARDO, hosted by CINECA (Italy) and the LEONARDO consortium through an EuroHPC AI and Data-Intensive Applications Access call. ### Citation Paper reference coming soon.", + "model_explanation_gemini": "A Swedish-optimized automatic speech recognition model trained on 50,000+ hours of Swedish speech, significantly reducing word error rates compared to OpenAI's Whisper models. \n\n**Features**: \n- Optimized for Swedish speech recognition \n- Trained on 50,000+ hours of Swedish audio \n- Lower Word Error Rate (WER) than OpenAI's Whisper models (47% average improvement) \n- Available in multiple formats (CT2, GGML, ONNX, Transformers" +} \ No newline at end of file diff --git a/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json b/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..14777a0cdebe90155ecb18fd1a0b61a38919b2c3 --- /dev/null +++ b/model_data_json/LGAI-EXAONE_EXAONE-3.0-7.8B-Instruct.json @@ -0,0 +1,21 @@ +{ + "model_id": "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct", + "downloads": 69297, + "tags": [ + "transformers", + "safetensors", + "exaone", + "text-generation", + "lg-ai", + "conversational", + "custom_code", + "en", + "ko", + "arxiv:2408.03541", + "license:other", + "autotrain_compatible", + "region:us" + ], + "description": "--- license: other license_name: exaone license_link: LICENSE language: - en - ko tags: - lg-ai - exaone ---


# EXAONE-3.0-7.8B-Instruct **👋👋 We have revised our license for revitalizing the research ecosystem.👋👋** ## Introduction We introduce EXAONE-3.0-7.8B-Instruct, a pre-trained and instruction-tuned bilingual (English and Korean) generative model with 7.8 billion parameters. The model was pre-trained with 8T curated tokens and post-trained with supervised fine-tuning and direct preference optimization. It demonstrates highly competitive benchmark performance against other state-of-the-art open models of similar size. For more details, please refer to our technical report, blog and GitHub. ## Quickstart We recommend using transformers v4.41 or later. > ### Note > The EXAONE 3.0 instruction-tuned language model was trained to utilize the system prompt, > so we highly recommend using the system prompts provided in the code snippet above. ## Evaluation We compared EXAONE-3.0-7.8B-Instruct with similar-sized instruction-tuned LLMs. To verify the performance of real-world use cases, we measured benchmarks that have a high correlation with LMSYS Chatbot Arena. Some experimental results are shown below. The full evaluation results can be found in the technical report. | Language | Benchmark | EXAONE 3.0
7.8B Inst. | Llama 3.1 8B Inst. | Gemma 2 9B Inst. | QWEN 2 7B Inst. | Phi 3 7B Inst. | Mistral 7B
Inst. | | :-----: | :----- | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | | English | MT-Bench | **9.01** | 7.95 | 8.52 | 8.41 | 8.52 | 7.72 | | | Arena-Hard-v0.1 | **46.8** | 28.0 | 42.1 | 21.7 | 29.1 | 16.2 | | | WildBench | **48.2** | 34.5 | 41.5 | 34.9 | 32.8 | 29.0 | | | AlpacaEval 2.0 LC | 45.0 | 31.5 | **47.5** | 24.5 | 37.1 | 31.0 | | Korean | KoMT-Bench[1] | **8.92** | 6.06 | 7.92 | 7.69 | 4.87 | 5.20 | | | LogicKor | **8.62** | 5.40 | 8.07 | 6.12 | 3.76 | 3.42 | - [1] KoMT-Bench is a dataset created by translating MT-Bench into Korean; see README for more details. ## Limitation The EXAONE language model has certain limitations and may occasionally generate inappropriate responses. The language model generates responses based on the output probability of tokens, and it is determined during learning from training data. While we have made every effort to exclude personal, harmful, and biased information from the training data, some problematic content may still be included, potentially leading to undesirable responses. Please note that the text generated by EXAONE language model does not reflects the views of LG AI Research. - Inappropriate answers may be generated, which contain personal, harmful or other inappropriate information. - Biased responses may be generated, which are associated with age, gender, race, and so on. - The generated responses rely heavily on statistics from the training data, which can result in the generation of semantically or syntactically incorrect sentences. - Since the model does not reflect the latest information, the responses may be false or contradictory. LG AI Research strives to reduce potential risks that may arise from EXAONE language model. Users are not allowed to engage in any malicious activities (e.g., keying in illegal information) that may induce the creation of inappropriate outputs violating LG AI’s ethical principles when using EXAONE language model. ## License The model is licensed under EXAONE AI Model License Agreement 1.1 - NC ## Citation ## Contact LG AI Research Technical Support: contact_us@lgresearch.ai", + "model_explanation_gemini": "A 7.8B-parameter bilingual (English and Korean) instruction-tuned generative model optimized for competitive performance in benchmarks, trained with supervised fine-tuning and direct preference optimization. \n\n**Features:** \n- Bilingual (English and Korean) \n- 7.8 billion parameters \n- Pre-trained on 8T tokens \n- Instruction-tuned with supervised fine-tuning and DPO \n- Competitive benchmark performance against similar-sized models \n- Requires system prompts for optimal use \n\n**Comparison:**" +} \ No newline at end of file diff --git a/model_data_json/LorenzoDeMattei_GePpeTto.json b/model_data_json/LorenzoDeMattei_GePpeTto.json new file mode 100644 index 0000000000000000000000000000000000000000..9ff336fc1baa575dbe48eae3ad009228a9cdb409 --- /dev/null +++ b/model_data_json/LorenzoDeMattei_GePpeTto.json @@ -0,0 +1,20 @@ +{ + "model_id": "LorenzoDeMattei/GePpeTto", + "downloads": 78170, + "tags": [ + "transformers", + "pytorch", + "jax", + "safetensors", + "gpt2", + "text-generation", + "it", + "arxiv:2004.14253", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: it --- # GePpeTto GPT2 Model 🇮🇹 Pretrained GPT2 117M model for Italian. 
You can find further details in the paper: Lorenzo De Mattei, Michele Cafagna, Felice Dell’Orletta, Malvina Nissim, Marco Guerini "GePpeTto Carves Italian into a Language Model", arXiv preprint. Pdf available at: ## Pretraining Corpus The pretraining set comprises two main sources. The first one is a dump of Italian Wikipedia (November 2019), consisting of 2.8GB of text. The second one is the ItWac corpus (Baroni et al., 2009), which amounts to 11GB of web texts. This collection provides a mix of standard and less standard Italian, on a rather wide chronological span, with older texts than the Wikipedia dump (the latter stretches only to the late 2000s). ## Pretraining details This model was trained using GPT-2's Hugging Face implementation on 4 NVIDIA Tesla T4 GPUs for 620k steps. Training parameters: - GPT-2 small configuration - vocabulary size: 30k - Batch size: 32 - Block size: 100 - Adam Optimizer - Initial learning rate: 5e-5 - Warm up steps: 10k ## Perplexity scores | Domain | Perplexity | |---|---| | Wikipedia | 26.1052 | | ItWac | 30.3965 | | Legal | 37.2197 | | News | 45.3859 | | Social Media | 84.6408 | For further details, qualitative analysis and human evaluation check out: ## Load Pretrained Model You can use this model by installing the Hugging Face transformers library, and you can use it directly by initializing it like this: ## Example using GPT2LMHeadModel Output is, ## Citation Please use the following bibtex entry: ## References Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The WaCky wide web: a collection of very large linguistically processed webcrawled corpora. Language resources and evaluation, 43(3):209–226.
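A minimal sketch of the loading and generation example referenced above (the prompt and sampling settings are illustrative):

```python
from transformers import GPT2LMHeadModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LorenzoDeMattei/GePpeTto")
model = GPT2LMHeadModel.from_pretrained("LorenzoDeMattei/GePpeTto")

# Italian prompt; sampling settings are illustrative.
inputs = tokenizer("Roma è la città", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```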
The corresponding image model can be retrieved via instructions found in the open_clip repository on GitHub. We provide a usage example below. ## Requirements To use both the multilingual text encoder and corresponding image encoder, we need to install the packages []( and []( ## Usage Extracting embeddings from the text encoder can be done in the following way: Extracting embeddings from the corresponding image encoder: ## Evaluation results None of the M-CLIP models have been extensively evaluated, but testing them on Txt2Img retrieval on the human-translated MS-COCO dataset, we see the following **R@10** results: | Name | En | De | Es | Fr | Zh | It | Pl | Ko | Ru | Tr | Jp | | ----------------------------------|:-----: |:-----: |:-----: |:-----: | :-----: |:-----: |:-----: |:-----: |:-----: |:-----: |:-----: | | OpenAI CLIP Vit-B/32| 90.3 | - | - | - | - | - | - | - | - | - | - | | OpenAI CLIP Vit-L/14| 91.8 | - | - | - | - | - | - | - | - | - | - | | OpenCLIP ViT-B-16+ | 94.3 | - | - | - | - | - | - | - | - | - | - | | LABSE Vit-L/14| 91.6 | 89.6 | 89.5 | 89.9 | 88.9 | 90.1 | 89.8 | 80.8 | 85.5 | 89.8 | 73.9 | | XLM-R Large Vit-B/32| 91.8 | 88.7 | 89.1 | 89.4 | 89.3 | 89.8| 91.4 | 82.1 | 86.1 | 88.8 | 81.0 | | XLM-R Vit-L/14| 92.4 | 90.6 | 91.0 | 90.0 | 89.7 | 91.1 | 91.3 | 85.2 | 85.8 | 90.3 | 81.9 | | XLM-R Large Vit-B/16+| **95.0** | **93.0** | **93.6** | **93.1** | **94.0** | **93.1** | **94.4** | **89.0** | **90.0** | **93.0** | **84.2** | ## Training/Model details Further details about the model training and data can be found in the model card.
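A minimal sketch of text-side embedding extraction, assuming the multilingual-clip package's PyTorch interface:

```python
import transformers
from multilingual_clip import pt_multilingual_clip

texts = ["Three blind horses listening to Mozart.", "Älgen är skogens konung!"]
model_name = "M-CLIP/XLM-Roberta-Large-Vit-B-16Plus"

# Load the multilingual text encoder and its tokenizer.
model = pt_multilingual_clip.MultilingualCLIP.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

embeddings = model.forward(texts, tokenizer)
print(embeddings.shape)
```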
Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the TITAN model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author. extra_gated_fields: Full name (first and last): text Current affiliation (no abbreviations): text Type of Affiliation: type: select options: - Academia - Industry - label: Other value: other Current and official institutional email (**this must match your primary email in your Hugging Face account, @gmail/@hotmail/@qq email domains will be denied**): text Please explain your intended research use: text I agree to all terms outlined above: checkbox I agree to use this model for non-commercial, academic purposes only: checkbox I agree not to distribute the model, if another user within your organization wishes to use the TITAN model, they must register as an individual user: checkbox metrics: - accuracy pipeline_tag: image-feature-extraction --- # Model Card for TITAN-preview \\[Preprint\\] | \\[Github Repo\\] | \\[Cite\\] ## What is TITAN? **TITAN** (**T**ransformer-based pathology **I**mage and **T**ext **A**lignment **N**etwork) is a multimodal whole-slide foundation model pre-trained using visual self-supervised learning and vision-language alignment. It leverages 335,645 whole-slide images (WSIs) from a diverse set of internally collected neoplastic, infectious, and inflammatory cases at Mass General Brigham. Additionally, TITAN utilizes over 182,000 pathology reports and more than 423,000 synthetic captions generated by PathChat, our pathology co-pilot. TITAN's slide embeddings achieve state-of-the-art performance on diverse downstream tasks, including linear probing, few-shot and zero-shot classification, rare cancer retrieval, cross-modal retrieval, and pathology report generation. This is a preview and we will bring you further updates and improvements. If the primary email on your Hugging Face account does not match your institutional email, **your request will be denied**. To fix this, you can: (1) add your official institutional email to your HF account, and confirm your email address to verify, and (2) set your institutional email as your primary email in your HF account. Other reasons for your access request being denied include mistakes in the submitted form, for example: full name includes abbreviations, affiliation is not spelled out, the described research use is not sufficient, or the email domain address is not recognized. ## Model Description - **Developed by:** Mahmood Lab AI for Pathology @ Harvard/BWH - **Model type:** Pretrained vision-language encoders - **Pretraining dataset:** Mass-340K, sourced from private histology collections (BWH / MGH), in addition to slides from the public GTEx consortium. - **Repository:** - **Preprint:** - **License:** CC-BY-NC-ND-4.0 ### Requirements ### Model Usage TITAN-preview is a vision-language model trained on CONCH v1.5 patch features with a patch size of 512x512 pixels at 20x magnification. Following authentication (using ), both TITAN-preview (slide and language encoders) and CONCH v1.5 (patch encoder) can be loaded using the commands below: You can directly use TITAN-preview for slide-level feature extraction.
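A minimal sketch of that loading step, assuming the gated checkpoint's remote-code interface (the return_conch helper follows the repository's examples and should be treated as an assumption):

```python
from huggingface_hub import login
from transformers import AutoModel

login()  # requires an account with approved access to the gated model

titan = AutoModel.from_pretrained("MahmoodLab/TITAN", trust_remote_code=True)
# Assumed helper from the repository examples: returns the CONCH v1.5
# patch encoder together with its evaluation transform.
conch, eval_transform = titan.return_conch()
```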
TITAN builds a feature grid from CONCH v1.5 patch features using the coordinates and the distance between the patches. As patch coordinates are always saved at the slides' level 0 magnification, TITAN takes patch_size_lv0, which represents the distance between two adjacent patches at level 0 magnification. It is 1024 if the slide is 40x, or 512 if the slide is 20x. We have this info saved in our demo TCGA features. Slide-level feature extraction can be done in the following way: These pre-extracted features can then be used for slide-level classification (via linear probing), retrieval (via l2 distance), and other machine learning settings, without task-specific finetuning. We also released all TCGA TITAN-preview features in . We demonstrated more detailed linear probe and zero-shot evaluation in our GitHub repository. ## License and Terms of Use This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the TITAN model and its derivatives, which include models trained on outputs from the TITAN model or datasets created from the TITAN model, is prohibited and requires prior approval. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the TITAN model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author. ## Contact For any additional questions or comments, contact Faisal Mahmood (), \\ Tong Ding (), \\ Sophia J. Wagner (), \\ Andrew H. Song (), \\ or Richard J. Chen (), ## Acknowledgements The project was built on top of amazing repositories such as ViT, iBOT, OpenClip, LGSSL, and Timm (ViT model implementation). We thank the authors and developers for their contribution. ## BibTeX If you found our work useful in your research, please consider citing our work at: Ding, T.\\*, Wagner S.J.\\*, Song, A.H.\\*, Chen, R.J.\\* et al. 
Multimodal Whole Slide Foundation Model for Pathology, arXiv, 2024", + "model_explanation_gemini": "TITAN is a multimodal foundation model for pathology, combining visual self-supervised learning and vision-language alignment to analyze whole-slide images and generate pathology reports, achieving state-of-the-art performance in tasks like classification, retrieval, and report generation.\n\nFeatures: \n- Multimodal (image + text) pathology model \n- Pretrained on 335,645 WSIs and 182,000+ reports \n- Self-supervised ViT architecture \n- Supports zero-shot/few-shot learning \n-" +} \ No newline at end of file diff --git a/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json b/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json new file mode 100644 index 0000000000000000000000000000000000000000..1ed715fcfa5626f149d041a2d42ac94dbff99d29 --- /dev/null +++ b/model_data_json/MoritzLaurer_bge-m3-zeroshot-v2.0.json @@ -0,0 +1,22 @@ +{ + "model_id": "MoritzLaurer/bge-m3-zeroshot-v2.0", + "downloads": 70245, + "tags": [ + "transformers", + "onnx", + "safetensors", + "xlm-roberta", + "text-classification", + "zero-shot-classification", + "multilingual", + "arxiv:2312.17543", + "base_model:BAAI/bge-m3-retromae", + "base_model:quantized:BAAI/bge-m3-retromae", + "license:mit", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - multilingual tags: - text-classification - zero-shot-classification base_model: BAAI/bge-m3-retromae pipeline_tag: zero-shot-classification library_name: transformers license: mit --- # Model description: bge-m3-zeroshot-v2.0 ## zeroshot-v2.0 series of models Models in this series are designed for efficient zeroshot classification with the Hugging Face pipeline. These models can do classification without training data and run on both GPUs and CPUs. An overview of the latest zeroshot classifiers is available in my Zeroshot Classifier Collection. The main update of this series of models is that several models are trained on fully commercially-friendly data for users with strict license requirements. These models can do one universal classification task: determine whether a hypothesis is \"true\" or \"not true\" given a text ( vs. ). This task format is based on the Natural Language Inference task (NLI). The task is so universal that any classification task can be reformulated into this task by the Hugging Face pipeline. ## Training data Models with a \"-c\" in the name are trained on two types of fully commercially-friendly data: 1. Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a list of 500+ diverse text classification tasks for 25 professions in conversations with Mistral-large. The data was manually curated. I then used this as seed data to generate several hundred thousand texts for these tasks with Mixtral-8x7B-Instruct-v0.1. The final dataset used is available in the synthetic_zeroshot_mixtral_v0.1 dataset in the subset . Data curation was done in multiple iterations and will be improved in future iterations. 2. Two commercially-friendly NLI datasets: (MNLI, FEVER-NLI). These datasets were added to increase generalization. 3. Models without a \"-c\" in the name also included a broader mix of training data with a broader mix of licenses: ANLI, WANLI, LingNLI, and all datasets in this list where . ## How to use the models Setting multi_label=False forces the model to decide on only one class; multi_label=True enables the model to choose multiple classes. 
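As a minimal sketch of the standard Hugging Face zero-shot pipeline usage (the example text and labels are made up; multi_label is the pipeline's documented switch between single- and multi-class scoring):

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/bge-m3-zeroshot-v2.0",
)

text = "The new update makes the app crash whenever I open the camera."
labels = ["bug report", "feature request", "praise"]

# multi_label=False: scores are normalized across labels, exactly one class wins
print(classifier(text, labels, multi_label=False))

# multi_label=True: each label is scored independently, several may apply
print(classifier(text, labels, multi_label=True))
```
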
## Metrics The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is facebook/bart-large-mnli, which is, at the time of writing (03.04.24), the most used commercially-friendly 0-shot classifier. | | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) | |:---------------------------|---------------------------:|-----------------------------:|------------------------------:|--------------------------------:|-----------------------------------:|---------------------------------:|------------------------------------:|-----------------------:|--------------------------:| | all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | 0.59 | (0.803) | | amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | 0.942 | (0.951) | | imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | 0.873 | (0.917) | | appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | 0.932 | (0.954) | | yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | 0.973 | (0.978) | | rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | 0.813 | (0.866) | | emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | 0.453 | (0.697) | | emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | 0.61 | (0.798) | | empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | 0.387 | (0.455) | | financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | 0.504 | (0.895) | | banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | 0.387 | (0.715) | | massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | 0.414 | (0.692) | | wikitoxic_toxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | 0.736 | (0.9) | | wikitoxic_obscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | 0.783 | (0.914) | | wikitoxic_threat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | 0.68 | (0.947) | | wikitoxic_insult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | 0.783 | (0.915) | | wikitoxic_identityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | 0.761 | (0.931) | | hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | 0.291 | (0.823) | | hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | 0.29 | (0.729) | | biasframes_offensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | 0.541 | (0.855) | | biasframes_sex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | 0.748 | (0.905) | | biasframes_intent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | 0.467 | (0.868) | | agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | 0.687 | (0.892) | | yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | 0.587 | (0.711) | | 
trueteacher (2) | 0.491 | 0.469 | 0.402 | 0.431 | 0.479 (0.82) | 0.459 | 0.538 (0.846) | 0.471 | (0.518) | | spam (2) | 0.505 | 0.528 | 0.504 | 0.507 | 0.464 (0.973) | 0.74 | 0.597 (0.983) | 0.441 | (0.978) | | wellformedquery (2) | 0.407 | 0.333 | 0.333 | 0.335 | 0.491 (0.769) | 0.334 | 0.429 (0.815) | 0.361 | (0.718) | | manifesto (56) | 0.084 | 0.102 | 0.182 | 0.17 | 0.187 (0.376) | 0.258 | 0.256 (0.408) | 0.147 | (0.331) | | capsotu (21) | 0.34 | 0.479 | 0.523 | 0.502 | 0.477 (0.664) | 0.603 | 0.502 (0.686) | 0.472 | (0.644) | These numbers indicate zeroshot performance, as no data from these datasets was added in the training mix. Note that models without a \"-c\" in the title were evaluated twice: one run without any data from these 28 datasets to test pure zeroshot performance (the first number in the respective column) and the final run including up to 500 training data points per class from each of the 28 datasets (the second number in brackets in the column, \"fewshot\"). No model was trained on test data. Details on the different datasets are available here: ## When to use which model - **deberta-v3-zeroshot vs. roberta-zeroshot**: deberta-v3 performs clearly better than roberta, but it is a bit slower. roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention. These containers are a good choice for production use-cases. tl;dr: For accuracy, use a deberta-v3 model. If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container and HF Inference Endpoints). - **commercial use-cases**: models with \"-c\" in the title are guaranteed to be trained on only commercially-friendly data. Models without a \"-c\" were trained on more data and perform better, but include data with non-commercial licenses. Legal opinions diverge on whether this training data affects the license of the trained model. For users with strict legal requirements, the models with \"-c\" in the title are recommended. - **Multilingual/non-English use-cases**: use bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c. Note that multilingual models perform worse than English-only models. You can therefore also first machine translate your texts to English with libraries like EasyNMT and then apply any English-only model to the translated data. Machine translation also facilitates validation in case your team does not speak all languages in the data. - **context window**: The bge-m3 models can process up to 8192 tokens. The other models can process up to 512. Note that longer text inputs both make the model slower and decrease performance, so if you're only working with texts of up to ~400 words / 1 page, use e.g. a deberta model for better performance. - The latest updates on new models are always available in the Zeroshot Classifier Collection. ## Reproduction Reproduction code is available in the directory here: ## Limitations and bias The model can only do text classification tasks. Biases can come from the underlying foundation model, the human NLI training data and the synthetic data generated by Mixtral. ## License The foundation model was published under the MIT license. The licenses of the training data vary depending on the model, see above. ## Citation This model is an extension of the research described in this paper. If you use this model academically, please cite: ### Ideas for cooperation or questions? 
If you have questions or ideas for cooperation, contact me at moritz{at}huggingface{dot}co or LinkedIn ### Flexible usage and \"prompting\" You can formulate your own hypotheses by changing the hypothesis_template of the zeroshot pipeline. Similar to \"prompt engineering\" for LLMs, you can test different formulations of your hypothesis_template and verbalized classes to improve performance.", + "model_explanation_gemini": "A multilingual zero-shot text classification model trained on commercially-friendly data to determine if a hypothesis is \"true\" or \"not true\" for a given text, reformulating any classification task into a Natural Language Inference (NLI) format. \n\n**Features:** \n- Zero-shot classification (no training data required) \n- Supports GPU and CPU inference \n- Commercially-friendly training data (synthetic Mixtral-generated data + MNLI/FEVER-NLI) \n- Handles single-label" +} \ No newline at end of file diff --git a/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json b/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json new file mode 100644 index 0000000000000000000000000000000000000000..d16f808f58929d2f9dad103460f42dd632423d05 --- /dev/null +++ b/model_data_json/NbAiLab_nb-wav2vec2-300m-nynorsk.json @@ -0,0 +1,21 @@ +{ + "model_id": "NbAiLab/nb-wav2vec2-300m-nynorsk", + "downloads": 73025, + "tags": [ + "transformers", + "pytorch", + "tensorboard", + "safetensors", + "wav2vec2", + "automatic-speech-recognition", + "nn", + "dataset:NbAiLab/NPSC", + "arxiv:2307.01672", + "license:apache-2.0", + "model-index", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - automatic-speech-recognition datasets: - NbAiLab/NPSC language: - nn model-index: - name: nb-wav2vec2-300m-nynorsk results: - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: NPSC type: NbAiLab/NPSC args: 16K_mp3_nynorsk metrics: - name: Test (Nynorsk) WER type: wer value: 0.1222 - name: Test (Nynorsk) CER type: cer value: 0.0419 --- # Norwegian Wav2Vec2 Model - 300M - VoxRex - Nynorsk This model is finetuned on top of the VoxRex feature-extractor model from the National Library of Sweden. The finetuned model achieves the following results on the test set with a 5-gram KenLM. The numbers in parentheses are the results without the language model: - **WER: 0.1222** (0.1537) - **CER: 0.0419** (0.0468) ## Model description This is one of several Wav2Vec models our team created during the 🤗 hosted Robust Speech Event. This is the complete list of our models and their final scores: | Model | Final WER | | |:--------------|:------------|:------------:| | NbAiLab/nb-wav2vec2-1b-bokmaal | 6.33 | | | NbAiLab/nb-wav2vec2-300m-bokmaal | 7.03 | | | NbAiLab/nb-wav2vec2-1b-nynorsk | 11.32 | | | NbAiLab/nb-wav2vec2-300m-nynorsk (this model) | 12.22 | | ### Dataset In parallel with the event, the team also converted the Norwegian Parliamentary Speech Corpus (NPSC) to the NbAiLab/NPSC in 🤗 Dataset format and used that as the main source for training. ## Code We have released all the code developed during the event so that the Norwegian NLP community can build upon it when developing even better Norwegian ASR models. The finetuning of these models is not very computationally demanding. After following the instructions here, you should be able to train your own automatic speech recognition system in less than a day with an average GPU. 
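For quick inference rather than training, a minimal transformers sketch is shown below. The audio filename is a placeholder; the pipeline expects 16 kHz audio, and whether the KenLM-boosted scores are reproduced depends on the repo shipping a decoder-enabled processor:

```python
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/nb-wav2vec2-300m-nynorsk",
)

# "nynorsk_sample.wav" is a hypothetical local file; ffmpeg decodes it under the hood
print(asr("nynorsk_sample.wav")["text"])
```
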
## Team The following people contributed to building this model: Rolv-Arild Braaten, Per Egil Kummervold, Andre Kåsen, Javier de la Rosa, Per Erik Solberg, and Freddy Wetjen. ## Training procedure To reproduce these results, we strongly recommend that you follow the instructions from 🤗 to train a simple Swedish model. When you have verified that you are able to do this, create a fresh new repo. You can then start by copying the files and from our repo. Running these will create all the other necessary files, and should let you reproduce our results. With some tweaks to the hyperparameters, you might even be able to build an even better ASR. Good luck! ### Language Model As the scores indicate, adding even a simple 5-gram language model will improve the results. 🤗 has provided another very nice blog explaining how to add a 5-gram language model to improve the ASR model. You can build this from your own corpus, for instance by extracting some suitable text from the Norwegian Colossal Corpus. You can also skip some of the steps in the guide, and copy the 5-gram model from this repo. ### Parameters The final model was run using these parameters: Using these settings, the training might take 3-4 days on an average GPU. You can, however, get a decent model and faster results by tweaking these parameters. | Parameter | Comment | |:-------------|:-----| | per_device_train_batch_size | Adjust this to the maximum that available memory allows. 16 or 24 might be good settings depending on your system | | gradient_accumulation_steps | Can be adjusted even further up to increase batch size and speed up training without running into memory issues | | learning_rate | Can be increased, maybe as high as 1e-4. Speeds up training but might add instability | | epochs | Can be decreased significantly. 
This is a huge dataset and you might get a decent result already after a couple of epochs| ## Citation See", + "model_explanation_gemini": "Finetuned for automatic speech recognition in Nynorsk, this model achieves a 12.22% WER on the NPSC dataset, enhanced by a 5-gram KenLM.\n\nFeatures: \n- **Task:** Automatic Speech Recognition (ASR) \n- **Base Model:** VoxRex from the National Library of Sweden \n- **Dataset:** NbAiLab/NPSC (Norwegian Parliamentary Speech Corpus, Nynorsk subset) \n- **Metrics:** \n - W" +} \ No newline at end of file diff --git a/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json b/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json new file mode 100644 index 0000000000000000000000000000000000000000..8282c381fa3958b88f8e009207630b6118f2ce49 --- /dev/null +++ b/model_data_json/OpenGVLab_InternVL2_5-38B-MPO.json @@ -0,0 +1,27 @@ +{ + "model_id": "OpenGVLab/InternVL2_5-38B-MPO", + "downloads": 78877, + "tags": [ + "transformers", + "tensorboard", + "safetensors", + "internvl_chat", + "feature-extraction", + "internvl", + "custom_code", + "image-text-to-text", + "conversational", + "multilingual", + "dataset:OpenGVLab/MMPR-v1.1", + "arxiv:2312.14238", + "arxiv:2404.16821", + "arxiv:2412.05271", + "arxiv:2411.10442", + "base_model:OpenGVLab/InternVL2_5-38B", + "base_model:finetune:OpenGVLab/InternVL2_5-38B", + "license:mit", + "region:us" + ], + "description": "--- license: mit pipeline_tag: image-text-to-text library_name: transformers base_model: - OpenGVLab/InternVL2_5-38B base_model_relation: finetune datasets: - OpenGVLab/MMPR-v1.1 language: - multilingual tags: - internvl - custom_code --- # InternVL2_5-38B-MPO [\\[📂 GitHub\\]]( [\\[📜 InternVL 1.0\\]]( [\\[📜 InternVL 1.5\\]]( [\\[📜 InternVL 2.5\\]]( [\\[📜 InternVL2.5-MPO\\]]( [\\[🆕 Blog\\]]( [\\[🗨️ Chat Demo\\]]( [\\[🤗 HF Demo\\]]( [\\[🚀 Quick Start\\]](#quick-start) [\\[📖 Documents\\]](

\"image\" ## Introduction We introduce InternVL2.5-MPO, an advanced multimodal large language model (MLLM) series that demonstrates superior overall performance. This series builds upon InternVL2.5 and Mixed Preference Optimization. !image/png ## InternVL 2.5 Family In the following table, we provide an overview of the InternVL2.5-MPO series. | Model Name | Vision Part | Language Part | HF Link | | :-----------------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :------------------------------------------------------------: | | InternVL2_5-1B-MPO | InternViT-300M-448px-V2_5 | Qwen2.5-0.5B-Instruct | 🤗 link | | InternVL2_5-2B-MPO | InternViT-300M-448px-V2_5 | internlm2_5-1_8b-chat | 🤗 link | | InternVL2_5-4B-MPO | InternViT-300M-448px-V2_5 | Qwen2.5-3B-Instruct | 🤗 link | | InternVL2_5-8B-MPO | InternViT-300M-448px-V2_5 | internlm2_5-7b-chat | 🤗 link | | InternVL2_5-26B-MPO | InternViT-6B-448px-V2_5 | internlm2_5-20b-chat | 🤗 link | | InternVL2_5-38B-MPO | InternViT-6B-448px-V2_5 | Qwen2.5-32B-Instruct | 🤗 link | | InternVL2_5-78B-MPO | InternViT-6B-448px-V2_5 | Qwen2.5-72B-Instruct | 🤗 link | ## Model Architecture As shown in the following figure, InternVL2.5-MPO retains the same model architecture as InternVL 2.5 and its predecessors, InternVL 1.5 and 2.0, following the \"ViT-MLP-LLM\" paradigm. In this new version, we integrate a newly incrementally pre-trained InternViT with various pre-trained LLMs, including InternLM 2.5 and Qwen 2.5, using a randomly initialized MLP projector. !image/png As in the previous version, we applied a pixel unshuffle operation, reducing the number of visual tokens to one-quarter of the original. Besides, we adopted a similar dynamic resolution strategy as InternVL 1.5, dividing images into tiles of 448×448 pixels. The key difference, starting from InternVL 2.0, is that we additionally introduced support for multi-image and video data. ## Key Designs ### Multi-Modal Preference Dataset MMPR is a large-scale and high-quality multimodal reasoning preference dataset. This dataset includes about 3 million samples. !image/jpeg !image/jpeg To construct this dataset, we propose an efficient data construction pipeline. Specifically, we categorize the multimodal data into **samples with clear ground truths** and **samples without clear ground truths**. - **For samples with clear ground truths:** the model is prompted to first provide the reasoning process and then give the final answer in the format like . Responses matching the ground truth answer constitute the positive set \\\\(\\mathcal{Y}_p\\\\), while those that do not match make up the negative set \\\\(\\mathcal{Y}_n\\\\). Additionally, responses that fail to provide a clear final answer are also merged into \\\\(\\mathcal{Y}_n\\\\). Given these responses labeled as positive or negative, we build the preference pairs by selecting a chosen response \\\\(y_c\\\\) from \\\\(\\mathcal{Y}_p\\\\) and a negative response \\\\(y_r\\\\) from \\\\(\\mathcal{Y}_n\\\\). - **For samples without clear ground truths:** we propose a simple yet effective method: Dropout Next-Token Prediction (Dropout NTP). Specifically, we use the responses generated by InternVL2-8B as chosen answers. Given the chosen answer, we truncate it by half and then prompt InternVL2-8B to complete the remaining portion of the truncated answer without access to the image input. 
This generated completion serves as the rejected answer for the paired sample. It is worth noting that while the responses generated by InternVL2-8B may not be perfect, the completions generated without the image input will introduce more hallucinations than those generated with the image input. Therefore, the partial order relationship between the chosen and rejected responses holds true. The data construction pipeline is open-sourced; see more details in our document. ### Mixed Preference Optimization The key insight behind MPO is that *an effective PO process should enable the model to learn the relative preference between pairs of responses, the absolute quality of individual responses, and the process for generating preferred responses.* We define the training objective as a combination of preference loss \\\\(\\mathcal{L}_{\\text{p}}\\\\), quality loss \\\\(\\mathcal{L}_{\\text{q}}\\\\), and generation loss \\\\(\\mathcal{L}_{\\text{g}}\\\\), referred to as Mixed Preference Optimization: $$ \\mathcal{L}=w_{p}\\cdot\\mathcal{L}_{\\text{p}} + w_{q}\\cdot\\mathcal{L}_{\\text{q}} + w_{g}\\cdot\\mathcal{L}_{\\text{g}}, $$ where \\\\(w_{*}\\\\) represents the weight assigned to each loss component. In this work, we empirically compare different variants of preference loss. Based on the experimental results, we use DPO as our preference loss and BCO as our quality loss. Specifically, DPO serves as the preference loss to enable the model to learn the relative preference between chosen and rejected responses. This algorithm optimizes the following loss function: $$ \\mathcal{L}_{\\text{p}}=-\\log \\sigma\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_c \\mid x\\right)}{\\pi_0\\left(y_c \\mid x\\right)}-\\beta \\log \\frac{\\pi_\\theta\\left(y_r \\mid x\\right)}{\\pi_0\\left(y_r \\mid x\\right)}\\right), $$ where \\\\(\\beta\\\\) is the KL penalty coefficient, and \\\\(x\\\\), \\\\(y_c\\\\), and \\\\(y_r\\\\) are user query, chosen response, and rejected response, respectively. The policy model \\\\(\\pi_\\theta\\\\) is initialized from model \\\\(\\pi_0\\\\). Additionally, the BCO loss is employed as the quality loss, which helps the model to understand the absolute quality of individual responses. The loss function is defined as: $$ \\mathcal{L}_{\\text{q}}=\\mathcal{L}_{\\text{q}}^+ + \\mathcal{L}_{\\text{q}}^-, $$ where \\\\(\\mathcal{L}_{\\text{q}}^{+}\\\\) and \\\\(\\mathcal{L}_{\\text{q}}^{-}\\\\) represent the loss for chosen and rejected responses, respectively. Each response type's loss is calculated independently, requiring the model to differentiate the absolute quality of individual responses. The loss terms are given by: $$ \\mathcal{L}_{\\text{q}}^+=-\\log \\sigma\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_c \\mid x\\right)}{\\pi_0\\left(y_c \\mid x\\right)} - \\delta\\right), $$ $$ \\mathcal{L}_{\\text{q}}^-=-\\log \\sigma\\left(-\\left(\\beta \\log \\frac{\\pi_\\theta\\left(y_r \\mid x\\right)}{\\pi_0\\left(y_r \\mid x\\right)} - \\delta\\right) \\right), $$ where \\\\(\\delta\\\\) represents the reward shift, calculated as the moving average of previous rewards to stabilize training. Finally, the SFT loss is used as the generation loss to help the model learn the generation process of preferred responses. The loss function is defined as: $$ \\mathcal{L}_{\\text{g}}=-\\frac{\\log\\pi_\\theta\\left(y_c \\mid x\\right)}{\\left| y_c \\right|}. 
$$ ## Evaluation on Multimodal Capability To comprehensively compare InternVL's performance before and after MPO, we employ the benchmarks from the OpenCompass Leaderboard, including both well-established classic datasets and newly introduced ones. These benchmarks span a wide range of categories, aiming to provide a thorough and balanced assessment of InternVL's capabilities across various multimodal tasks. We provide the evaluation results in the tables below. | Model | Avg. | MMBench v1.1 | MMStar | MMMU | MathVista | HallusionBench | AI2D | OCRBench | MMVet | | ------------------- | ---- | ------------ | ------ | ---- | --------- | -------------- | ---- | -------- | ----- | | InternVL2-5-1B | 54.9 | 66.5 | 51.3 | 41.2 | 47.1 | 39.4 | 69.0 | 77.4 | 47.2 | | InternVL2-5-1B-MPO | 56.4 | 67.2 | 49.7 | 40.8 | 53.0 | 40.0 | 69.4 | 83.6 | 47.2 | | InternVL2-5-2B | 59.9 | 70.9 | 54.3 | 43.2 | 51.1 | 42.3 | 74.9 | 80.2 | 62.6 | | InternVL2-5-2B-MPO | 62.0 | 71.6 | 55.0 | 45.0 | 56.4 | 43.0 | 75.3 | 84.2 | 65.4 | | InternVL2-5-4B | 65.1 | 78.2 | 58.7 | 51.8 | 60.8 | 46.6 | 81.4 | 82.0 | 61.5 | | InternVL2-5-4B-MPO | 67.6 | 78.6 | 60.2 | 51.6 | 65.3 | 47.8 | 82.0 | 88.0 | 67.1 | | InternVL2-5-8B | 68.9 | 82.5 | 63.2 | 56.2 | 64.5 | 49.0 | 84.6 | 82.1 | 62.8 | | InternVL2-5-8B-MPO | 70.4 | 82.4 | 65.7 | 54.9 | 68.9 | 51.4 | 84.5 | 88.3 | 66.9 | | InternVL2-5-26B | 71.6 | 84.6 | 66.5 | 60.7 | 68.0 | 55.8 | 86.2 | 85.4 | 65.4 | | InternVL2-5-26B-MPO | 72.7 | 84.2 | 67.2 | 57.7 | 72.8 | 55.3 | 86.2 | 91.2 | 67.1 | | InternVL2-5-38B | 73.5 | 85.4 | 68.5 | 64.6 | 72.4 | 57.9 | 87.6 | 84.1 | 67.2 | | InternVL2-5-38B-MPO | 75.5 | 85.6 | 69.8 | 64.1 | 73.8 | 61.5 | 88.1 | 88.5 | 72.5 | | InternVL2-5-78B | 75.2 | 87.5 | 69.5 | 70.0 | 70.6 | 57.4 | 89.1 | 85.3 | 71.8 | | InternVL2-5-78B-MPO | 76.6 | 87.3 | 73.1 | 68.3 | 73.8 | 58.7 | 89.3 | 91.2 | 71.4 | ## Quick Start We provide an example code to run using . > Please use transformers>=4.37.2 to ensure the model works normally. ### Model Loading #### 16-bit (bf16 / fp16) #### BNB 8-bit Quantization #### Multiple GPUs The reason for writing the code this way is to avoid errors that occur during multi-GPU inference due to tensors not being on the same device. By ensuring that the first and last layers of the large language model (LLM) are on the same device, we prevent such errors. ### Inference with Transformers #### Streaming Output Besides this method, you can also use the following code to get streamed output. ## Finetune Many repositories now support fine-tuning of the InternVL series models, including InternVL, SWIFT, XTuner, and others. Please refer to their documentation for more details on fine-tuning. ## Deployment ### LMDeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs & VLMs. LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference pipeline. #### A 'Hello, world' Example If an error occurs while executing this case, please install the required dependency packages as prompted. #### Multi-image Inference When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased. 
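A minimal sketch of multi-image inference with the LMDeploy pipeline follows. The image URLs, session length, and tensor-parallel degree are illustrative assumptions, not values from this card:

```python
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# Enlarge session_len to leave room for the extra image tokens; tp splits the model across GPUs
pipe = pipeline(
    "OpenGVLab/InternVL2_5-38B-MPO",
    backend_config=TurbomindEngineConfig(session_len=16384, tp=2),
)

# Placeholder URLs: replace with your own images
images = [
    load_image("https://example.com/image1.jpg"),
    load_image("https://example.com/image2.jpg"),
]
response = pipe(("Describe the differences between these two images.", images))
print(response.text)
```
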
#### Batch Prompts Inference Conducting inference with batch prompts is quite straightforward; just place them within a list structure: #### Multi-turn Conversation There are two ways to do multi-turn conversations with the pipeline. One is to construct messages according to the format of OpenAI and use the above-introduced method, the other is to use the interface. #### Service LMDeploy enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup: To use the OpenAI-style interface, you need to install OpenAI: Then, use the code below to make the API call: ## License This project is released under the MIT License. This project uses the pre-trained Qwen2.5-32B-Instruct as a component, which is licensed under the Apache License 2.0. ## Citation If you find this project useful in your research, please consider citing:", + "model_explanation_gemini": "**A multilingual multimodal large language model (MLLM) optimized for image-text-to-text tasks, built on InternVL2.5 with Mixed Preference Optimization for superior performance.** \n\n### **Features:** \n- **Multimodal Capability:** Processes both image and text inputs to generate text outputs. \n- **Multilingual Support:** Works across multiple languages. \n- **Advanced Architecture:** Uses a \"ViT-MLP-LLM\" paradigm with InternViT and LLMs like Qwen" +} \ No newline at end of file diff --git a/model_data_json/OpenGVLab_InternVL3-2B.json b/model_data_json/OpenGVLab_InternVL3-2B.json new file mode 100644 index 0000000000000000000000000000000000000000..4f09152f45d91ec97cdd119d76357b75d9478828 --- /dev/null +++ b/model_data_json/OpenGVLab_InternVL3-2B.json @@ -0,0 +1,28 @@ +{ + "model_id": "OpenGVLab/InternVL3-2B", + "downloads": 80574, + "tags": [ + "transformers", + "safetensors", + "internvl_chat", + "feature-extraction", + "internvl", + "custom_code", + "image-text-to-text", + "conversational", + "multilingual", + "dataset:OpenGVLab/MMPR-v1.2", + "arxiv:2312.14238", + "arxiv:2404.16821", + "arxiv:2412.05271", + "arxiv:2411.10442", + "arxiv:2504.10479", + "arxiv:2412.09616", + "base_model:OpenGVLab/InternVL3-2B-Instruct", + "base_model:finetune:OpenGVLab/InternVL3-2B-Instruct", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 license_name: qwen license_link: pipeline_tag: image-text-to-text library_name: transformers base_model: - OpenGVLab/InternVL3-2B-Instruct base_model_relation: finetune datasets: - OpenGVLab/MMPR-v1.2 language: - multilingual tags: - internvl - custom_code --- # InternVL3-2B [\\[📂 GitHub\\]]( [\\[📜 InternVL 1.0\\]]( [\\[📜 InternVL 1.5\\]]( [\\[📜 InternVL 2.5\\]]( [\\[📜 InternVL2.5-MPO\\]]( [\\[📜 InternVL3\\]]( [\\[🆕 Blog\\]]( [\\[🗨️ Chat Demo\\]]( [\\[🤗 HF Demo\\]]( [\\[🚀 Quick Start\\]](#quick-start) [\\[📖 Documents\\]](
\"image\" ## Introduction We introduce InternVL3, an advanced multimodal large language model (MLLM) series that demonstrates superior overall performance. Compared to InternVL 2.5, InternVL3 exhibits superior multimodal perception and reasoning capabilities, while further extending its multimodal capabilities to encompass tool usage, GUI agents, industrial image analysis, 3D vision perception, and more. Additionally, we compare InternVL3 with Qwen2.5 Chat models, whose corresponding pre-trained base models are employed as the initialization of the langauge component in InternVL3. Benefitting from Native Multimodal Pre-Training, the InternVL3 series achieves even better overall text performance than the Qwen2.5 series. !image/png ## InternVL3 Family In the following table, we provide an overview of the InternVL3 series. | Model Name | Vision Part | Language Part | HF Link | | :-----------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :------------------------------------------------------: | | InternVL3-1B | InternViT-300M-448px-V2_5 | Qwen2.5-0.5B | 🤗 link | | InternVL3-2B | InternViT-300M-448px-V2_5 | Qwen2.5-1.5B | 🤗 link | | InternVL3-8B | InternViT-300M-448px-V2_5 | Qwen2.5-7B | 🤗 link | | InternVL3-9B | InternViT-300M-448px-V2_5 | internlm3-8b-instruct | 🤗 link | | InternVL3-14B | InternViT-300M-448px-V2_5 | Qwen2.5-14B | 🤗 link | | InternVL3-38B | InternViT-6B-448px-V2_5 | Qwen2.5-32B | 🤗 link | | InternVL3-78B | InternViT-6B-448px-V2_5 | Qwen2.5-72B | 🤗 link | !image/png ## Model Architecture As shown in the following figure, InternVL3 retains the same model architecture as InternVL 2.5 and its predecessors, InternVL 1.5 and 2.0, following the \"ViT-MLP-LLM\" paradigm. In this new version, we integrate a newly incrementally pre-trained InternViT with various pre-trained LLMs, including InternLM 3 and Qwen 2.5, using a randomly initialized MLP projector. !image/png As in the previous version, we applied a pixel unshuffle operation, reducing the number of visual tokens to one-quarter of the original. Besides, we adopted a similar dynamic resolution strategy as InternVL 1.5, dividing images into tiles of 448×448 pixels. The key difference, starting from InternVL 2.0, is that we additionally introduced support for multi-image and video data. Notably, in InternVL3, we integrate the Variable Visual Position Encoding (V2PE), which utilizes smaller, more flexible position increments for visual tokens. Benefiting from V2PE, InternVL3 exhibits better long context understanding capabilities compared to its predecessors. ## Training Strategy ### Native Multimodal Pre-Training We propose a Native Multimodal Pre-Training approach that consolidates language and vision learning into a single pre-training stage. In contrast to standard paradigms that first train a language-only model and subsequently adapt it to handle additional modalities, our method interleaves multimodal data (e.g., image-text, video-text, or image-text interleaved sequences) with large-scale textual corpora. This unified training scheme allows the model to learn both linguistic and multimodal representations simultaneously, ultimately enhancing its capability to handle vision-language tasks without the need for separate alignment or bridging modules. Please see our paper for more details. 
### Supervised Fine-Tuning In this phase, the techniques of random JPEG compression, square loss re-weighting, and multimodal data packing proposed in InternVL2.5 are also employed in the InternVL3 series. The main advancement of the SFT phase in InternVL3 compared to InternVL2.5 lies in the use of higher-quality and more diverse training data. Specifically, we further extend training samples for tool use, 3D scene understanding, GUI operations, long context tasks, video understanding, scientific diagrams, creative writing, and multimodal reasoning. ### Mixed Preference Optimization During Pre-training and SFT, the model is trained to predict the next token conditioned on previous ground-truth tokens. However, during inference, the model predicts each token based on its own prior outputs. This discrepancy between ground-truth tokens and model-predicted tokens introduces a distribution shift, which can impair the model's Chain-of-Thought (CoT) reasoning capabilities. To mitigate this issue, we employ MPO, which introduces additional supervision from both positive and negative samples to align the model response distribution with the ground-truth distribution, thereby improving reasoning performance. Specifically, the training objective of MPO is a combination of preference loss \\\\(\\mathcal{L}_{\\text{p}}\\\\), quality loss \\\\(\\mathcal{L}_{\\text{q}}\\\\), and generation loss \\\\(\\mathcal{L}_{\\text{g}}\\\\), which can be formulated as follows: $$ \\mathcal{L}=w_{p}\\cdot\\mathcal{L}_{\\text{p}} + w_{q}\\cdot\\mathcal{L}_{\\text{q}} + w_{g}\\cdot\\mathcal{L}_{\\text{g}}, $$ where \\\\(w_{*}\\\\) represents the weight assigned to each loss component. Please see our paper for more details about MPO. ### Test-Time Scaling Test-Time Scaling has been shown to be an effective method to enhance the reasoning abilities of LLMs and MLLMs. In this work, we use the Best-of-N evaluation strategy and employ VisualPRM-8B as the critic model to select the best response for reasoning and mathematics evaluation. ## Evaluation on Multimodal Capability ### Multimodal Reasoning and Mathematics ### OCR, Chart, and Document Understanding ### Multi-Image & Real-World Comprehension ### Comprehensive Multimodal & Hallucination Evaluation ### Visual Grounding ### Multimodal Multilingual Understanding ### Video Understanding ### GUI Grounding ### Spatial Reasoning ## Evaluation on Language Capability We compare InternVL3 with Qwen2.5 Chat models, whose corresponding pre-trained base models are employed as the initialization of the language component in InternVL3. Benefiting from Native Multimodal Pre-Training, the InternVL3 series achieves even better overall text performance than the Qwen2.5 series. Please note that the evaluation scores of the Qwen2.5 series may differ from those officially reported, as we have adopted the prompt versions provided in the table across all datasets for OpenCompass evaluation. ## Ablation Study ### Native Multimodal Pre-Training We conduct experiments on the InternVL2-8B model while keeping its architecture, initialization parameters, and training data entirely unchanged. Traditionally, InternVL2-8B employs a training pipeline that begins with an MLP warmup phase for feature alignment followed by an Instruction Tuning stage. In our experiments, we substitute the conventional MLP warmup phase with a native multimodal pre-training process. 
This modification isolates the contribution of native multimodal pre-training to the overall multimodal capability of the model. The evaluation results in the figure below show that the model with native multimodal pre-training exhibits performance on most benchmarks that is comparable to the fully multi-stage-trained InternVL2-8B baseline. Furthermore, when followed by instruction tuning on higher-quality data, the model demonstrates further performance gains across evaluated multimodal tasks. These findings underscore the efficiency of native multimodal pre-training in imparting powerful multimodal capabilities to MLLMs. ### Mixed Preference Optimization As shown in the table below, models fine-tuned with MPO demonstrate superior reasoning performance across seven multimodal reasoning benchmarks compared to their counterparts without MPO. Specifically, InternVL3-78B and InternVL3-38B outperform their counterparts by 4.1 and 4.5 points, respectively. Notably, the training data used for MPO is a subset of that used for SFT, indicating that the performance improvements primarily stem from the training algorithm rather than the training data. ### Variable Visual Position Encoding As reported in the table below, the introduction of V2PE leads to significant performance gains across most evaluation metrics. In addition, our ablation studies (varying the positional increment \\\\( \\delta \\\\)) reveal that even for tasks primarily involving conventional contexts, relatively small \\\\( \\delta \\\\) values can achieve optimal performance. These findings provide important insights for future efforts aimed at refining position encoding strategies for visual tokens in MLLMs. ## Quick Start We provide an example code to run using . > Please use transformers>=4.37.2 to ensure the model works normally. ### Model Loading #### 16-bit (bf16 / fp16) #### BNB 8-bit Quantization #### Multiple GPUs The reason for writing the code this way is to avoid errors that occur during multi-GPU inference due to tensors not being on the same device. By ensuring that the first and last layers of the large language model (LLM) are on the same device, we prevent such errors. ### Inference with Transformers #### Streaming Output Besides this method, you can also use the following code to get streamed output. ## Finetune Many repositories now support fine-tuning of the InternVL series models, including InternVL, SWIFT, XTuner, and others. Please refer to their documentation for more details on fine-tuning. ## Deployment ### LMDeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs & VLMs. LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference pipeline. #### A 'Hello, world' Example If an error occurs while executing this case, please install the required dependency packages as prompted. #### Multi-image Inference When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased. #### Batch Prompts Inference Conducting inference with batch prompts is quite straightforward; just place them within a list structure: #### Multi-turn Conversation There are two ways to do multi-turn conversations with the pipeline. 
One is to construct messages according to the format of OpenAI and use the above-introduced method, the other is to use the interface. #### Service LMDeploy enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup: To use the OpenAI-style interface, you need to install OpenAI: Then, use the code below to make the API call: ## License This project is released under the MIT License. This project uses the pre-trained Qwen2.5 as a component, which is licensed under the Apache-2.0 License. ## Citation If you find this project useful in your research, please consider citing:", + "model_explanation_gemini": "\"InternVL3-2B is a multilingual multimodal large language model (MLLM) with enhanced multimodal perception, reasoning, and extended capabilities like tool usage, GUI agents, and industrial image analysis, outperforming its predecessor InternVL 2.5 and Qwen2.5 Chat models in text performance.\"\n\n**Features:** \n- Multimodal (image-text-to-text) \n- Multilingual support \n- Superior multimodal perception/reasoning \n- Extended capabilities (tool usage, GUI agents" +} \ No newline at end of file diff --git a/model_data_json/QuantFactory_Meta-Llama-3.1-8B-Instruct-GGUF.json b/model_data_json/QuantFactory_Meta-Llama-3.1-8B-Instruct-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..30e1872edcffe55e2e8a716bd0688839ea8625f0 --- /dev/null +++ b/model_data_json/QuantFactory_Meta-Llama-3.1-8B-Instruct-GGUF.json @@ -0,0 +1,28 @@ +{ + "model_id": "QuantFactory/Meta-Llama-3.1-8B-Instruct-GGUF", + "downloads": 77242, + "tags": [ + "gguf", + "facebook", + "meta", + "pytorch", + "llama", + "llama-3", + "text-generation", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "license:llama3.1", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.1 extra_gated_prompt: >- ### LLAMA 3.1 COMMUNITY LICENSE AGREEMENT Llama 3.1 Version Release Date: July 23, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 3.1 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 3.1\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Llama 3.1 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. 
(if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.1 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.1. If you access or use Llama 3.1, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.1 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.1 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. 
Any other criminal activity 3. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 4. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 5. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 6. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 7. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 8. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.1 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.1 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 3.1 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. 
Fail to appropriately disclose to end users any known dangers of your AI system. Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- Llama 3.1 is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card. **Note:** Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any uses of Llama 3.1 in additional languages are done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with a sufficiently recent transformers release, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation first. Note: You can also find detailed recipes on how to use the model locally, with assisted generation, quantization, and more. ### Use with the original codebase Please follow the instructions in the repository. To download the original checkpoints, see the example command below. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training.
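As a rough cross-check of these figures, a back-of-the-envelope sketch, assuming illustrative round numbers for PUE and grid carbon intensity (not Meta's published values; the card links the actual methodology):

```python
# Back-of-the-envelope check of the 11,390 t CO2eq location-based estimate.
# PUE and grid carbon intensity below are assumptions, not Meta's numbers.
gpu_hours = 39.3e6            # cumulative H100-80GB GPU hours
tdp_kw = 0.700                # peak power per GPU (700 W TDP)
pue = 1.1                     # assumed data-center power usage effectiveness
grid_kg_co2_per_kwh = 0.38    # assumed average grid carbon intensity

energy_kwh = gpu_hours * tdp_kw * pue
tons_co2eq = energy_kwh * grid_kg_co2_per_kwh / 1000.0
print(f"~{energy_kwh / 1e6:.1f} GWh, ~{tons_co2eq:,.0f} t CO2eq")  # ≈ 11,500 t
```

With these assumed inputs the estimate lands within about 1% of the reported total, which suggests the table's GPU-hour and power numbers are internally consistent.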
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| --- | --- | --- | --- | --- |
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023. ## Benchmark scores In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. ### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 41.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
| --- | --- | --- | --- | --- | --- |
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety As part of our responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology's power, by aligning our model safety for the generic use cases addressing a standard set of harms. Developers are then in the driver's seat to tailor safety for their use case, defining their own policy and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide; refer to the Responsible Use Guide to learn more. #### Llama 3.1 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload required to deploy safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.1 systems **Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as to mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### New capabilities Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs, and possible integrations by developers with third party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all Generative AI use cases.
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third party services they use to be aware of the safety and security limitations when using this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third party safeguards. **Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing fine-tuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common use case evaluations measure the safety risks of systems for the most commonly built applications, including chatbots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which we crafted dedicated benchmarks including long context, multilingual, tool calls, coding, and memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to the proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2. Child Safety** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks and inform on any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development.
For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking into account market-specific nuances and experiences. **3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cyber security whitepaper to learn more. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open-sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A quantized version of Meta's Llama-3.1-8B-Instruct model optimized for efficient inference, supporting multilingual text generation.\n\n**Features:** \n- Multilingual support (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) \n- Text generation capability \n- Quantized GGUF format for efficiency \n- Derived from Meta's Llama 3.1 architecture \n- Community license with redistribution terms \n\n**Comparison:** \nThis model is a quantized variant" +} \ No newline at end of file diff --git a/model_data_json/Qwen_QwQ-32B-GGUF.json b/model_data_json/Qwen_QwQ-32B-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..9f16815eddc38d56eb09fb417056b8c2c0a4874e --- /dev/null +++ b/model_data_json/Qwen_QwQ-32B-GGUF.json @@ -0,0 +1,20 @@ +{ + "model_id": "Qwen/QwQ-32B-GGUF", + "downloads": 68965, + "tags": [ + "gguf", + "chat", + "text-generation", + "en", + "arxiv:2309.00071", + "arxiv:2412.15115", + "base_model:Qwen/QwQ-32B", + "base_model:quantized:Qwen/QwQ-32B", + "license:apache-2.0", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- license: apache-2.0 license_link: language: - en pipeline_tag: text-generation base_model: Qwen/QwQ-32B tags: - chat --- # QwQ-32B-GGUF \"Chat\" ## Introduction QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.

**This repo contains the QwQ 32B model in the GGUF Format**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning) - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Parameters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: Full 131,072 tokens - Quantization: q4_K_M, q5_0, q5_K_M, q6_K, q8_0 **Note:** For the best experience, please review the usage guidelines before deploying QwQ models. You can try our demo or access QwQ models via QwenChat. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements QwQ is based on Qwen2.5, whose code has been merged into the latest Hugging Face transformers. We advise you to use the latest version; with an older version, you will encounter the following error: Also check out our AWQ documentation for more usage guidance. ## Quickstart Check out our llama.cpp documentation for more usage guidance. We advise you to clone []( and install it following the official guide. We follow the latest version of llama.cpp. In the following demonstration, we assume that you are running commands under the repository . You can use the following commands for inference: ### Usage Guidelines To achieve optimal performance, we recommend the following settings: 1. **Enforce Thoughtful Output**: Ensure the model starts with \"\\\\n\" to prevent generating empty thinking content, which can degrade output quality. 2. **Sampling Parameters**: - Use Temperature=0.6, TopP=0.95, MinP=0 instead of Greedy decoding to avoid endless repetitions. - Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output. - For supported frameworks, you can adjust the parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may result in occasional language mixing and a slight decrease in performance. 3. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. This feature is already implemented in . 4. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking. - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the field with only the choice letter, e.g.,.\" in the prompt. 5. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable YaRN to improve the model's ability to capture long-sequence information effectively. Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to non-GGUF models. 6. **Other References**: You can also consult Unsloth's Guide to see if their approach meets your needs. (Thanks to the Unsloth team!) ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here.
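A minimal llama-cpp-python sketch applying the sampling guidelines above; the GGUF file name and the TopK value are placeholder assumptions, not values from the card:

```python
# Hedged sketch: run a local QwQ GGUF file with llama-cpp-python using the
# recommended sampling settings (model_path is an assumed local file name).
from llama_cpp import Llama

llm = Llama(model_path="qwq-32b-q4_k_m.gguf", n_ctx=32768)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
    temperature=0.6,  # recommended instead of greedy decoding
    top_p=0.95,
    top_k=30,         # within the suggested 20-40 range
    min_p=0.0,
)
print(out["choices"][0]["message"]["content"])
```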
## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 32.5B-parameter reasoning-focused language model optimized for enhanced performance on complex tasks through structured thinking and step-by-step problem-solving, supporting long-context (131K tokens) and GGUF quantization. \n\n**Features:** \n- **Type:** Causal language model \n- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias \n- **Training:** Pretraining + supervised finetuning + RL \n- **Context Length:**" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2-VL-2B.json b/model_data_json/Qwen_Qwen2-VL-2B.json new file mode 100644 index 0000000000000000000000000000000000000000..46e8956afb91e668e80cf2cebe5721d22f022583 --- /dev/null +++ b/model_data_json/Qwen_Qwen2-VL-2B.json @@ -0,0 +1,21 @@ +{ + "model_id": "Qwen/Qwen2-VL-2B", + "downloads": 77369, + "tags": [ + "transformers", + "safetensors", + "qwen2_vl", + "image-text-to-text", + "multimodal", + "conversational", + "en", + "arxiv:2409.12191", + "arxiv:2308.12966", + "license:apache-2.0", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers --- # Qwen2-VL-2B ## Introduction We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation. > [!Important] > This is the base pretrained model of Qwen2-VL-2B without instruction tuning. ### What’s New in Qwen2-VL? #### Key Enhancements: * **SoTA understanding of images of various resolution & ratio**: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. * **Understanding videos of 20min+**: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. * **Agent that can operate your mobiles, robots, etc.**: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. #### Model Architecture Updates: * **Naive Dynamic Resolution**: Unlike before, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, offering a more human-like visual processing experience.

* **Multimodal Rotary Position Embedding (M-ROPE)**: Decomposes positional embedding into parts to capture 1D textual, 2D visual, and 3D video positional information, enhancing its multimodal processing capabilities.
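A minimal, hedged loading sketch using transformers' Qwen2-VL classes (the error text in the comment is an assumption; see the Requirements note below on installing a recent enough transformers):

```python
# Loading sketch for the base (non-instruct) checkpoint. On a transformers
# version without Qwen2-VL support, loading fails with a KeyError for the
# model type (assumed error text).
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B")
```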

We have three models with 2, 7 and 72 billion parameters. This repo contains the **pretrained** 2B Qwen2-VL model. For more information, visit our Blog and GitHub. ## Requirements The code of Qwen2-VL has been in the latest Hugging Face and we advise you to install the latest version with command , or you might encounter the following error: ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "\"Qwen2-VL-2B is a 2-billion-parameter multimodal pretrained model excelling in visual understanding (images, videos), multilingual text recognition, and device integration for tasks like question answering, content creation, and robotic control.\"\n\n**Features**: \n- State-of-the-art image understanding (variable resolution/ratio) \n- Video comprehension (20+ minutes) \n- Multilingual text recognition in images (European, Asian, Arabic languages) \n- Device integration for robotic/m" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2-VL-72B.json b/model_data_json/Qwen_Qwen2-VL-72B.json new file mode 100644 index 0000000000000000000000000000000000000000..4426684004063200d9d2fe13534dd14b492ac6c9 --- /dev/null +++ b/model_data_json/Qwen_Qwen2-VL-72B.json @@ -0,0 +1,21 @@ +{ + "model_id": "Qwen/Qwen2-VL-72B", + "downloads": 77971, + "tags": [ + "transformers", + "safetensors", + "qwen2_vl", + "image-text-to-text", + "multimodal", + "conversational", + "en", + "arxiv:2409.12191", + "arxiv:2308.12966", + "license:other", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: other license_name: qwen license_link: language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers --- # Qwen2-VL-72B ## Introduction We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation. > [!Important] > This is the base pretrained model of Qwen2-VL-72B without instruction tuning. ### What’s New in Qwen2-VL? #### Key Enhancements: * **SoTA understanding of images of various resolution & ratio**: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. * **Understanding videos of 20min+**: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. * **Agent that can operate your mobiles, robots, etc.**: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. #### Model Architecture Updates: * **Naive Dynamic Resolution**: Unlike before, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, offering a more human-like visual processing experience.

* **Multimodal Rotary Position Embedding (M-ROPE)**: Decomposes positional embedding into parts to capture 1D textual, 2D visual, and 3D video positional information, enhancing its multimodal processing capabilities.

We have three models with 2, 7 and 72 billion parameters. This repo contains the **pretrained** 72B Qwen2-VL model. For more information, visit our Blog and GitHub. ## Requirements The code of Qwen2-VL has been in the latest Hugging Face and we advise you to install the latest version with command , or you might encounter the following error: ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "\"Qwen2-VL-72B is a multimodal, pretrained base model excelling in high-resolution image understanding, long-video comprehension (20+ minutes), multilingual text recognition in images, and device operation via visual reasoning, featuring dynamic resolution handling and enhanced multimodal processing.\"\n\n**Features:** \n1. State-of-the-art visual understanding (images, videos, documents). \n2. Supports arbitrary image resolutions via dynamic token mapping. \n3. Processes 20+ minute videos for QA/dialog" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json b/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json new file mode 100644 index 0000000000000000000000000000000000000000..d0599589066ca826505b03685c4f643ce9ddb9ce --- /dev/null +++ b/model_data_json/Qwen_Qwen2.5-32B-Instruct-AWQ.json @@ -0,0 +1,26 @@ +{ + "model_id": "Qwen/Qwen2.5-32B-Instruct-AWQ", + "downloads": 70467, + "tags": [ + "transformers", + "safetensors", + "qwen2", + "text-generation", + "chat", + "conversational", + "en", + "arxiv:2309.00071", + "arxiv:2407.10671", + "base_model:Qwen/Qwen2.5-32B-Instruct", + "base_model:quantized:Qwen/Qwen2.5-32B-Instruct", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "awq", + "region:us" + ], + "description": "--- base_model: Qwen/Qwen2.5-32B-Instruct language: - en library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation tags: - chat --- # Qwen2.5-32B-Instruct-AWQ ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and has greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g, tables), and **generating structured outputs** especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context Support** up to 128K tokens and can generate up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. 
**This repo contains the AWQ-quantized 4-bit instruction-tuned 32B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Parameters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: Full 131,072 tokens and generation 8192 tokens - Please refer to this section for detailed instructions on how to deploy Qwen2.5 for handling long texts. - Quantization: AWQ 4-bit For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 has been merged into the latest Hugging Face transformers, and we advise you to use the latest version; with an older version, you will encounter the following error: Also check out our AWQ documentation for more usage guidance. ## Quickstart Here we provide a code snippet showing how to load the tokenizer and model and how to generate content. ### Processing Long Texts The current configuration is set for context lengths up to 32,768 tokens. To handle extensive inputs exceeding 32,768 tokens, we utilize YaRN, a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts. For supported frameworks, you could add the rope_scaling configuration to config.json to enable YaRN. For deployment, we recommend using vLLM. Please refer to our Documentation for usage if you are not familiar with vLLM. Presently, vLLM only supports static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**. We advise adding the configuration only when processing long contexts is required. ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For quantized models, the benchmark results against the original bfloat16 models can be found here. For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 4-bit AWQ-quantized, 32.5B-parameter instruction-tuned causal language model optimized for text generation, featuring enhanced coding, mathematics, multilingual support (29+ languages), structured data handling, and long-context processing (128K tokens). \n\n**Features:** \n- **Core Task:** Text generation (chat, instruction following) \n- **Key Improvements:** Better coding/math, structured output (JSON), long-text generation (8K tokens), multilingual support" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2.5-32B.json new file mode 100644 index 0000000000000000000000000000000000000000..84f0774931278a3b386b9272aeb4e3345f8fa56a --- /dev/null +++ b/model_data_json/Qwen_Qwen2.5-32B.json @@ -0,0 +1,16 @@ +{ + "model_id": "Qwen/Qwen2.5-32B", + "downloads": 76454, + "tags": [ + "safetensors", + "qwen2", + "text-generation", + "conversational", + "en", + "arxiv:2407.10671", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 license_link: language: - en pipeline_tag: text-generation --- # Qwen2.5-32B ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters.
Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g., tables), and **generating structured outputs**, especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context support** up to 128K tokens, with generation of up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. **This repo contains the base 32B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 32.5B - Number of Parameters (Non-Embedding): 31.0B - Number of Layers: 64 - Number of Attention Heads (GQA): 40 for Q and 8 for KV - Context Length: 131,072 tokens **We do not recommend using base language models for conversations.** Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., on this model. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 has been merged into the latest Hugging Face transformers, and we advise you to use the latest version; with an older version, you will encounter the following error: ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "\"Qwen2.5-32B is a 32.5B-parameter causal language model optimized for text generation, featuring enhanced coding, mathematics, multilingual support (29+ languages), long-context handling (128K tokens), structured data understanding, and improved instruction following compared to Qwen2.\"\n\n### **Model Features:** \n- **Type:** Causal Language Model \n- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, GQA (40" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2.5-Math-7B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..1d56f51391de66bd1b4e3ef42b56ce9387cffa60 --- /dev/null +++ b/model_data_json/Qwen_Qwen2.5-Math-7B-Instruct.json @@ -0,0 +1,23 @@ +{ + "model_id": "Qwen/Qwen2.5-Math-7B-Instruct", + "downloads": 71240, + "tags": [ + "transformers", + "safetensors", + "qwen2", + "text-generation", + "chat", + "conversational", + "en", + "arxiv:2409.12122", + "base_model:Qwen/Qwen2.5-Math-7B", + "base_model:finetune:Qwen/Qwen2.5-Math-7B", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- base_model: Qwen/Qwen2.5-Math-7B language: - en pipeline_tag: text-generation tags: - chat library_name: transformers license: apache-2.0 license_link: --- # Qwen2.5-Math-7B-Instruct > [!Warning] >

> > 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks. > >
## Introduction In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we upgraded it and open-sourced the **Qwen2.5-Math** series, including the base models **Qwen2.5-Math-1.5B/7B/72B**, the instruction-tuned models **Qwen2.5-Math-1.5B/7B/72B-Instruct**, and the mathematical reward model **Qwen2.5-Math-RM-72B**. Unlike the Qwen2-Math series, which only supports using Chain-of-Thought (CoT) to solve English math problems, the Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. While CoT plays a vital role in enhancing the reasoning capabilities of LLMs, it faces challenges in achieving computational accuracy and handling complex mathematical or algorithmic reasoning tasks, such as finding the roots of a quadratic equation or computing the eigenvalues of a matrix. TIR can further improve the model's proficiency in precise computation, symbolic manipulation, and algorithmic manipulation. Qwen2.5-Math-1.5B/7B/72B-Instruct achieve 79.7, 85.3, and 87.8 respectively on the MATH benchmark using TIR. ## Model Details For more details, please refer to our blog post and GitHub repo. ## Requirements * for Qwen2.5-Math models. The latest version is recommended. > [!Warning] >
> > 🚨 This is a must because transformers integrated Qwen2 codes since 4.37.0. > >
For requirements on GPU memory and the respective throughput, see similar results of Qwen2 here. ## Quick Start > [!Important] > > **Qwen2.5-Math-7B-Instruct** is an instruction model for chatting; > > **Qwen2.5-Math-7B** is a base model typically used for completion and few-shot inference, serving as a better starting point for fine-tuning. > ### 🤗 Hugging Face Transformers Qwen2.5-Math can be deployed and inferred in the same way as Qwen2.5. Here we show a code snippet demonstrating how to use the chat model: ## Citation If you find our work helpful, feel free to give us a citation.", + "model_explanation_gemini": "Specializes in solving English and Chinese math problems using Chain-of-Thought (CoT) and Tool-integrated Reasoning (TIR), with improved accuracy over its predecessor Qwen2-Math. \n\n**Features:** \n1. Supports English and Chinese math problem-solving. \n2. Utilizes CoT and TIR for enhanced reasoning and computational precision. \n3. Outperforms Qwen2-Math on Chinese/English benchmarks. \n4. Available in 1.5B," +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen2.5-VL-7B-Instruct-AWQ.json new file mode 100644 index 0000000000000000000000000000000000000000..15e96dff41950772d9a48ac8fc9a2a76f304ce72 --- /dev/null +++ b/model_data_json/Qwen_Qwen2.5-VL-7B-Instruct-AWQ.json @@ -0,0 +1,26 @@ +{ + "model_id": "Qwen/Qwen2.5-VL-7B-Instruct-AWQ", + "downloads": 78495, + "tags": [ + "transformers", + "safetensors", + "qwen2_5_vl", + "image-text-to-text", + "multimodal", + "conversational", + "en", + "arxiv:2309.00071", + "arxiv:2409.12191", + "arxiv:2308.12966", + "base_model:Qwen/Qwen2.5-VL-7B-Instruct", + "base_model:quantized:Qwen/Qwen2.5-VL-7B-Instruct", + "license:apache-2.0", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "awq", + "region:us" + ], + "description": "--- license: apache-2.0 language: - en pipeline_tag: image-text-to-text tags: - multimodal library_name: transformers base_model: - Qwen/Qwen2.5-VL-7B-Instruct --- # Qwen2.5-VL-7B-Instruct-AWQ \"Chat\" ## Introduction In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL. #### Key Enhancements: * **Understand things visually**: Qwen2.5-VL is not only proficient in recognizing common objects such as flowers, birds, fish, and insects, but it is highly capable of analyzing texts, charts, icons, graphics, and layouts within images. * **Being agentic**: Qwen2.5-VL directly plays as a visual agent that can reason and dynamically direct tools, which is capable of computer use and phone use. * **Understanding long videos and capturing events**: Qwen2.5-VL can comprehend videos of over 1 hour, and this time it has a new ability to capture events by pinpointing the relevant video segments. * **Capable of visual localization in different formats**: Qwen2.5-VL can accurately localize objects in an image by generating bounding boxes or points, and it can provide stable JSON outputs for coordinates and attributes. * **Generating structured outputs**: for data like scans of invoices, forms, tables, etc. Qwen2.5-VL supports structured outputs of their contents, benefiting usage in finance, commerce, etc.
#### Model Architecture Updates: * **Dynamic Resolution and Frame Rate Training for Video Understanding**: We extend dynamic resolution to the temporal dimension by adopting dynamic FPS sampling, enabling the model to comprehend videos at various sampling rates. Accordingly, we update mRoPE in the time dimension with IDs and absolute time alignment, enabling the model to learn temporal sequence and speed, and ultimately acquire the ability to pinpoint specific moments.

* **Streamlined and Efficient Vision Encoder** We enhance both training and inference speeds by strategically implementing window attention into the ViT. The ViT architecture is further optimized with SwiGLU and RMSNorm, aligning it with the structure of the Qwen2.5 LLM. We have three models with 3, 7 and 72 billion parameters. This repo contains the instruction-tuned 7B Qwen2.5-VL model with AWQ. For more information, visit our Blog and GitHub. ## Evaluation ## Requirements The code of Qwen2.5-VL has been in the latest Hugging face transformers and we advise you to build from source with command: or you might encounter the following error: ## Quickstart Below, we provide simple examples to show how to use Qwen2.5-VL with 🤖 ModelScope and 🤗 Transformers. The code of Qwen2.5-VL has been in the latest Hugging face transformers and we advise you to build from source with command: or you might encounter the following error: We offer a toolkit to help you handle various types of visual input more conveniently, as if you were using an API. This includes base64, URLs, and interleaved images and videos. You can install it using the following command: If you are not using Linux, you might not be able to install from PyPI. In that case, you can use which will fall back to using torchvision for video processing. However, you can still install decord from source to get decord used when loading video. ### Using 🤗 Transformers to Chat Here we show a code snippet to show you how to use the chat model with and : ### 🤖 ModelScope We strongly advise users especially those in mainland China to use ModelScope. can help you solve issues concerning downloading checkpoints. ### More Usage Tips For input images, we support local files, base64, and URLs. For videos, we currently only support local files. #### Image Resolution for performance boost The model supports a wide range of resolution inputs. By default, it uses the native resolution for input, but higher resolutions can enhance performance at the cost of more computation. Users can set the minimum and maximum number of pixels to achieve an optimal configuration for their needs, such as a token count range of 256-1280, to balance speed and memory usage. Besides, We provide two methods for fine-grained control over the image size input to the model: 1. Define min_pixels and max_pixels: Images will be resized to maintain their aspect ratio within the range of min_pixels and max_pixels. 2. Specify exact dimensions: Directly set and . These values will be rounded to the nearest multiple of 28. ### Processing Long Texts The current is set for context length up to 32,768 tokens. To handle extensive inputs exceeding 32,768 tokens, we utilize YaRN, a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts. For supported frameworks, you could add the following to to enable YaRN: { ..., \"type\": \"yarn\", \"mrope_section\": [ 16, 24, 24 ], \"factor\": 4, \"original_max_position_embeddings\": 32768 } However, it should be noted that this method has a significant impact on the performance of temporal and spatial localization tasks, and is therefore not recommended for use. At the same time, for long video inputs, since MRoPE itself is more economical with ids, the max_position_embeddings can be directly modified to a larger value, such as 64k. ### Benchmark #### Performance of Quantized Models This section reports the generation performance of quantized models (including GPTQ and AWQ) of the Qwen2.5-VL series. 
Specifically, we report: - MMMU_VAL (Accuracy) - DocVQA_VAL (Accuracy) - MMBench_DEV_EN (Accuracy) - MathVista_MINI (Accuracy) We use VLMEvalKit to evaluate all models. | Model Size | Quantization | MMMU_VAL | DocVQA_VAL | MMBench_DEV_EN | MathVista_MINI | | --- | --- | --- | --- | --- | --- | | Qwen2.5-VL-72B-Instruct | BF16 (🤗🤖) | 70.0 | 96.1 | 88.2 | 75.3 | | | AWQ (🤗🤖) | 69.1 | 96.0 | 87.9 | 73.8 | | Qwen2.5-VL-7B-Instruct | BF16 (🤗🤖) | 58.4 | 94.9 | 84.1 | 67.9 | | | AWQ (🤗🤖) | 55.6 | 94.6 | 84.2 | 64.7 | | Qwen2.5-VL-3B-Instruct | BF16 (🤗🤖) | 51.7 | 93.0 | 79.8 | 61.4 | | | AWQ
(🤗🤖) | 49.1 | 91.8 | 78.0 | 58.8 | ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "**A multimodal vision-language model excelling in visual understanding, agentic reasoning, long-video comprehension, object localization, and structured output generation for tasks like document analysis.** \n\n### **Features:** \n1. **Visual Understanding** – Recognizes objects, analyzes text/graphics in images, and interprets charts/layouts. \n2. **Agentic Capabilities** – Functions as a visual agent, dynamically using tools for computer/phone tasks. \n3. **Long-Video Processing** –" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen3-32B.json b/model_data_json/Qwen_Qwen3-32B.json new file mode 100644 index 0000000000000000000000000000000000000000..77de116457a24885bfc3eefde09e623a14cf3141 --- /dev/null +++ b/model_data_json/Qwen_Qwen3-32B.json @@ -0,0 +1,18 @@ +{ + "model_id": "Qwen/Qwen3-32B", + "downloads": 75675, + "tags": [ + "transformers", + "safetensors", + "qwen3", + "text-generation", + "conversational", + "arxiv:2309.00071", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation --- # Qwen3-32B \"Chat\" ## Qwen3 Highlights Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: - **Uniquely support of seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose dialogue) **within single model**, ensuring optimal performance across various scenarios. - **Significantly enhancement in its reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. - **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. - **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. - **Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**. ## Model Overview **Qwen3-32B** has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 32.8B - Number of Paramaters (Non-Embedding): 31.2B - Number of Layers: 64 - Number of Attention Heads (GQA): 64 for Q and 8 for KV - Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation. ## Quickstart The code of Qwen3 has been in the latest Hugging Face and we advise you to use the latest version of . With , you will encounter the following error: The following contains a code snippet illustrating how to use the model generate content based on given inputs. 
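A minimal sketch in the spirit of Qwen's published quickstart (prompt text is illustrative; parsing of the thinking block is omitted for brevity):

```python
# Minimal Qwen3-32B inference sketch with transformers >= 4.51.0.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # True (the default) enables thinking mode
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32768)
response = tokenizer.decode(
    output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True
)
print(response)
```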
For deployment, you can use or or to create an OpenAI-compatible API endpoint: - SGLang: - vLLM: For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers have also supported Qwen3. ## Switching Between Thinking and Non-Thinking Mode > [!TIP] > The switch is also available in APIs created by SGLang and vLLM. > Please refer to our documentation for SGLang and vLLM users. ### By default, Qwen3 has thinking capabilities enabled, similar to QwQ-32B. This means the model will use its reasoning abilities to enhance the quality of generated responses. For example, when explicitly setting or leaving it as the default value in , the model will engage its thinking mode. In this mode, the model will generate think content wrapped in a block, followed by the final response. > [!NOTE] > For thinking mode, use , , , and (the default setting in ). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section. ### We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency. In this mode, the model will not generate any think content and will not include a block. > [!NOTE] > For non-thinking mode, we suggest using , , , and . For more detailed guidance, please refer to the Best Practices section. ### Advanced Usage: Switching Between Thinking and Non-Thinking Modes via User Input We provide a soft switch mechanism that allows users to dynamically control the model's behavior when . Specifically, you can add and to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of a multi-turn conversation: > [!NOTE] > For API compatibility, when , regardless of whether the user uses or , the model will always output a block wrapped in . However, the content inside this block may be empty if thinking is disabled. > When , the soft switches are not valid. Regardless of any or tags input by the user, the model will not generate think content and will not include a block. ## Agentic Use Qwen3 excels in tool calling capabilities. We recommend using Qwen-Agent to make the best use of agentic ability of Qwen3. Qwen-Agent encapsulates tool-calling templates and tool-calling parsers internally, greatly reducing coding complexity. To define the available tools, you can use the MCP configuration file, use the integrated tool of Qwen-Agent, or integrate other tools by yourself. ## Processing Long Texts Qwen3 natively supports context lengths of up to 32,768 tokens. For conversations where the total length (including both input and output) significantly exceeds this limit, we recommend using RoPE scaling techniques to handle long texts effectively. We have validated the model's performance on context lengths of up to 131,072 tokens using the YaRN method. YaRN is currently supported by several inference frameworks, e.g., and for local use, and for deployment. In general, there are two approaches to enabling YaRN for supported frameworks: - Modifying the model files: In the file, add the fields: For , you need to regenerate the GGUF file after the modification. 
- Passing command line arguments: For , you can use For , you can use For from , you can use > [!IMPORTANT] > If you encounter the following warning > > please upgrade . > [!NOTE] > All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts.** > We advise adding the configuration only when processing long contexts is required. > It is also recommended to modify the as needed. For example, if the typical context length for your application is 65,536 tokens, it would be better to set as 2.0. > [!NOTE] > The default in is set to 40,960. This allocation includes reserving 32,768 tokens for outputs and 8,192 tokens for typical prompts, which is sufficient for most scenarios involving short text processing. If the average context length does not exceed 32,768 tokens, we do not recommend enabling YaRN in this scenario, as it may potentially degrade model performance. > [!TIP] > The endpoint provided by Alibaba Model Studio supports dynamic YaRN by default and no extra configuration is needed. ## Best Practices To achieve optimal performance, we recommend the following settings: 1. **Sampling Parameters**: - For thinking mode (), use , , , and . **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. - For non-thinking mode (), we suggest using , , , and . - For supported frameworks, you can adjust the parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance. 2. **Adequate Output Length**: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 38,912 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance. 3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking. - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the field with only the choice letter, e.g., .\" 4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. It is implemented in the provided chat template in Jinja2. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed. ### Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 32.8B-parameter causal language model designed for advanced reasoning, multilingual tasks, and agent capabilities, featuring seamless switching between thinking (complex reasoning) and non-thinking (efficient dialogue) modes. \n\n**Features:** \n- **Dual-mode operation:** Toggle between thinking (logic/math/coding) and non-thinking (general dialogue) modes. 
\n- **Enhanced reasoning:** Outperforms predecessors (QwQ, Qwen2.5) in math, code," +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen3-4B.json b/model_data_json/Qwen_Qwen3-4B.json new file mode 100644 index 0000000000000000000000000000000000000000..4b0612fae95c3c5e4451376458f1d673f4f941dd --- /dev/null +++ b/model_data_json/Qwen_Qwen3-4B.json @@ -0,0 +1,20 @@ +{ + "model_id": "Qwen/Qwen3-4B", + "downloads": 81038, + "tags": [ + "transformers", + "safetensors", + "qwen3", + "text-generation", + "conversational", + "arxiv:2309.00071", + "base_model:Qwen/Qwen3-4B-Base", + "base_model:finetune:Qwen/Qwen3-4B-Base", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation base_model: - Qwen/Qwen3-4B-Base --- # Qwen3-4B \"Chat\" ## Qwen3 Highlights Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: - **Unique support for seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios. - **Significant enhancement in its reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning. - **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. - **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks. - **Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**. ## Model Overview **Qwen3-4B** has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 4.0B - Number of Parameters (Non-Embedding): 3.6B - Number of Layers: 36 - Number of Attention Heads (GQA): 32 for Q and 8 for KV - Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.
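The Quickstart snippet that the Qwen3 cards refer to did not survive extraction. Below is a minimal sketch of the usage pattern the card describes, assuming `transformers>=4.51.0`; the prompt text and the `max_new_tokens` value are illustrative rather than taken from the original snippet.

```python
# Minimal sketch: generating with Qwen3-4B and the enable_thinking switch.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
# enable_thinking=True (the default) makes the model emit a <think>...</think> block first.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=1024)
# Strip the prompt tokens and decode only the newly generated continuation.
new_tokens = generated[0][inputs.input_ids.shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Passing `enable_thinking=False` in the same call applies the hard switch described in the sections that follow.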
> [!TIP] > If you encounter significant endless repetitions, please refer to the Best Practices section for optimal sampling parameters, and set the `presence_penalty` parameter. ## Quickstart The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`. With `transformers<4.51.0`, you will encounter the following error: The following contains a code snippet illustrating how to use the model to generate content based on given inputs. For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint: - SGLang: - vLLM: For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers have also supported Qwen3. ## Switching Between Thinking and Non-Thinking Mode > [!TIP] > The `enable_thinking` switch is also available in APIs created by SGLang and vLLM. > Please refer to our documentation for SGLang and vLLM users. ### `enable_thinking=True` By default, Qwen3 has thinking capabilities enabled, similar to QwQ-32B. This means the model will use its reasoning abilities to enhance the quality of generated responses. For example, when explicitly setting `enable_thinking=True` or leaving it as the default value in `tokenizer.apply_chat_template`, the model will engage its thinking mode. In this mode, the model will generate think content wrapped in a `<think>...</think>` block, followed by the final response. > [!NOTE] > For thinking mode, use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0` (the default setting in `generation_config.json`). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section. ### `enable_thinking=False` We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency. In this mode, the model will not generate any think content and will not include a `<think>...</think>` block. > [!NOTE] > For non-thinking mode, we suggest using `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0`. For more detailed guidance, please refer to the Best Practices section. ### Advanced Usage: Switching Between Thinking and Non-Thinking Modes via User Input We provide a soft switch mechanism that allows users to dynamically control the model's behavior when `enable_thinking=True`. Specifically, you can add `/think` and `/no_think` to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of a multi-turn conversation: > [!NOTE] > For API compatibility, when `enable_thinking=True`, regardless of whether the user uses `/think` or `/no_think`, the model will always output a block wrapped in `<think>...</think>`. However, the content inside this block may be empty if thinking is disabled. > When `enable_thinking=False`, the soft switches are not valid. Regardless of any `/think` or `/no_think` tags input by the user, the model will not generate think content and will not include a `<think>...</think>` block. ## Agentic Use Qwen3 excels in tool calling capabilities. We recommend using Qwen-Agent to make the best use of the agentic ability of Qwen3. Qwen-Agent encapsulates tool-calling templates and tool-calling parsers internally, greatly reducing coding complexity. To define the available tools, you can use the MCP configuration file, use the integrated tool of Qwen-Agent, or integrate other tools by yourself. ## Processing Long Texts Qwen3 natively supports context lengths of up to 32,768 tokens. For conversations where the total length (including both input and output) significantly exceeds this limit, we recommend using RoPE scaling techniques to handle long texts effectively. We have validated the model's performance on context lengths of up to 131,072 tokens using the YaRN method. YaRN is currently supported by several inference frameworks, e.g., `transformers` and `llama.cpp` for local use, and `vllm` and `sglang` for deployment. In general, there are two approaches to enabling YaRN for supported frameworks: - Modifying the model files: In the `config.json` file, add the `rope_scaling` fields: For `llama.cpp`, you need to regenerate the GGUF file after the modification. - Passing command line arguments: For `vllm`, you can use For `sglang`, you can use For `llama-server` from `llama.cpp`, you can use > [!IMPORTANT] > If you encounter the following warning > > please upgrade to `transformers>=4.51.0`. > [!NOTE] > All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts.** > We advise adding the `rope_scaling` configuration only when processing long contexts is required. > It is also recommended to modify the `factor` as needed. For example, if the typical context length for your application is 65,536 tokens, it would be better to set `factor` as 2.0. > [!NOTE] > The default `max_position_embeddings` in `config.json` is set to 40,960. If the average context length does not exceed 32,768 tokens, we do not recommend enabling YaRN in this scenario, as it may potentially degrade model performance. ## Best Practices To achieve optimal performance, we recommend the following settings: 1. **Sampling Parameters**: - For thinking mode (`enable_thinking=True`), use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0`. **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. - For non-thinking mode (`enable_thinking=False`), we suggest using `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0`. - For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. 2. **Adequate Output Length**: We recommend using an output length of 32,768 tokens for most queries. 3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking. - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the `answer` field with only the choice letter, e.g., `\"answer\": \"C\"`.\" 4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. This is implemented in the provided Jinja2 chat template. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed. ### Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 4B-parameter causal language model specializing in reasoning, multilingual tasks, and agent capabilities, with unique switching between thinking (complex reasoning) and non-thinking (general dialogue) modes. \n\n**Features**: \n- **Dual-mode operation**: Toggles between thinking (logic/math/coding) and non-thinking (general dialogue) modes. \n- **Enhanced reasoning**: Outperforms previous Qwen models in math, code, and logical tasks. \n- **Human-aligned interactions**:" +} \ No newline at end of file diff --git a/model_data_json/Qwen_Qwen3-8B.json b/model_data_json/Qwen_Qwen3-8B.json new file mode 100644 index 0000000000000000000000000000000000000000..03330d720add2e7f9b721380f6bce195034338af --- /dev/null +++ b/model_data_json/Qwen_Qwen3-8B.json @@ -0,0 +1,20 @@ +{ + "model_id": "Qwen/Qwen3-8B", + "downloads": 78129, + "tags": [ + "transformers", + "safetensors", + "qwen3", + "text-generation", + "conversational", + "arxiv:2309.00071", + "base_model:Qwen/Qwen3-8B-Base", + "base_model:finetune:Qwen/Qwen3-8B-Base", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation base_model: - Qwen/Qwen3-8B-Base --- # Qwen3-8B \"Chat\" ## Qwen3 Highlights Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features: - **Unique support for seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios. - **Significant enhancement in its reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
- **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. - **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks. - **Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**. ## Model Overview **Qwen3-8B** has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 8.2B - Number of Parameters (Non-Embedding): 6.95B - Number of Layers: 36 - Number of Attention Heads (GQA): 32 for Q and 8 for KV - Context Length: 32,768 natively and 131,072 tokens with YaRN. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation. ## Quickstart The code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`. With `transformers<4.51.0`, you will encounter the following error: The following contains a code snippet illustrating how to use the model to generate content based on given inputs. For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint: - SGLang: - vLLM: For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers have also supported Qwen3. ## Switching Between Thinking and Non-Thinking Mode > [!TIP] > The `enable_thinking` switch is also available in APIs created by SGLang and vLLM. > Please refer to our documentation for SGLang and vLLM users. ### `enable_thinking=True` By default, Qwen3 has thinking capabilities enabled, similar to QwQ-32B. This means the model will use its reasoning abilities to enhance the quality of generated responses. For example, when explicitly setting `enable_thinking=True` or leaving it as the default value in `tokenizer.apply_chat_template`, the model will engage its thinking mode. In this mode, the model will generate think content wrapped in a `<think>...</think>` block, followed by the final response. > [!NOTE] > For thinking mode, use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0` (the default setting in `generation_config.json`). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section. ### `enable_thinking=False` We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency. In this mode, the model will not generate any think content and will not include a `<think>...</think>` block. > [!NOTE] > For non-thinking mode, we suggest using `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0`. For more detailed guidance, please refer to the Best Practices section. ### Advanced Usage: Switching Between Thinking and Non-Thinking Modes via User Input We provide a soft switch mechanism that allows users to dynamically control the model's behavior when `enable_thinking=True`. Specifically, you can add `/think` and `/no_think` to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of a multi-turn conversation: > [!NOTE] > For API compatibility, when `enable_thinking=True`, regardless of whether the user uses `/think` or `/no_think`, the model will always output a block wrapped in `<think>...</think>`. However, the content inside this block may be empty if thinking is disabled.
> When `enable_thinking=False`, the soft switches are not valid. Regardless of any `/think` or `/no_think` tags input by the user, the model will not generate think content and will not include a `<think>...</think>` block. ## Agentic Use Qwen3 excels in tool calling capabilities. We recommend using Qwen-Agent to make the best use of the agentic ability of Qwen3. Qwen-Agent encapsulates tool-calling templates and tool-calling parsers internally, greatly reducing coding complexity. To define the available tools, you can use the MCP configuration file, use the integrated tool of Qwen-Agent, or integrate other tools by yourself. ## Processing Long Texts Qwen3 natively supports context lengths of up to 32,768 tokens. For conversations where the total length (including both input and output) significantly exceeds this limit, we recommend using RoPE scaling techniques to handle long texts effectively. We have validated the model's performance on context lengths of up to 131,072 tokens using the YaRN method. YaRN is currently supported by several inference frameworks, e.g., `transformers` and `llama.cpp` for local use, and `vllm` and `sglang` for deployment. In general, there are two approaches to enabling YaRN for supported frameworks: - Modifying the model files: In the `config.json` file, add the `rope_scaling` fields: For `llama.cpp`, you need to regenerate the GGUF file after the modification. - Passing command line arguments: For `vllm`, you can use For `sglang`, you can use For `llama-server` from `llama.cpp`, you can use > [!IMPORTANT] > If you encounter the following warning > > please upgrade to `transformers>=4.51.0`. > [!NOTE] > All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts.** > We advise adding the `rope_scaling` configuration only when processing long contexts is required. > It is also recommended to modify the `factor` as needed. For example, if the typical context length for your application is 65,536 tokens, it would be better to set `factor` as 2.0. > [!NOTE] > The default `max_position_embeddings` in `config.json` is set to 40,960. This allocation includes reserving 32,768 tokens for outputs and 8,192 tokens for typical prompts, which is sufficient for most scenarios involving short text processing. If the average context length does not exceed 32,768 tokens, we do not recommend enabling YaRN in this scenario, as it may potentially degrade model performance. > [!TIP] > The endpoint provided by Alibaba Model Studio supports dynamic YaRN by default and no extra configuration is needed. ## Best Practices To achieve optimal performance, we recommend the following settings: 1. **Sampling Parameters**: - For thinking mode (`enable_thinking=True`), use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0`. **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. - For non-thinking mode (`enable_thinking=False`), we suggest using `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0`. - For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance. 2. **Adequate Output Length**: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 38,912 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance. 3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking.
- **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt. - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: \"Please show your choice in the `answer` field with only the choice letter, e.g., `\"answer\": \"C\"`.\" 4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. This is implemented in the provided Jinja2 chat template. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed. ### Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "\"Qwen3-8B is an 8.2B-parameter causal language model supporting seamless switching between thinking mode (for complex reasoning, math, coding) and non-thinking mode (efficient dialogue), excelling in multilingual tasks, agent capabilities, and human-aligned interactions.\"\n\n**Features**: \n- **Dual-mode operation**: Toggle between thinking (reasoning-focused) and non-thinking (general dialogue) modes. \n- **Enhanced reasoning**: Outperforms QwQ and" +} \ No newline at end of file diff --git a/model_data_json/RedHatAI_Mistral-Small-24B-Instruct-2501-FP8-dynamic.json b/model_data_json/RedHatAI_Mistral-Small-24B-Instruct-2501-FP8-dynamic.json new file mode 100644 index 0000000000000000000000000000000000000000..1f187cac8982eaa1ad18d2258d02fcbe78762018 --- /dev/null +++ b/model_data_json/RedHatAI_Mistral-Small-24B-Instruct-2501-FP8-dynamic.json @@ -0,0 +1,25 @@ +{ + "model_id": "RedHatAI/Mistral-Small-24B-Instruct-2501-FP8-dynamic", + "downloads": 81938, + "tags": [ + "transformers", + "safetensors", + "mistral", + "text-generation", + "mistral-small", + "fp8", + "vllm", + "conversational", + "en", + "base_model:mistralai/Mistral-Small-24B-Instruct-2501", + "base_model:quantized:mistralai/Mistral-Small-24B-Instruct-2501", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "compressed-tensors", + "region:us" + ], + "description": "--- license: apache-2.0 language: - en tags: - mistral - mistral-small - fp8 - vllm base_model: mistralai/Mistral-Small-24B-Instruct-2501 library_name: transformers --- # Mistral-Small-24B-Instruct-2501-FP8-Dynamic ## Model Overview - **Model Architecture:** Mistral-Small-24B-Instruct-2501 - **Input:** Text - **Output:** Text - **Model Optimizations:** - **Weight quantization:** FP8 - **Activation quantization:** FP8 - **Release Date:** 3/1/2025 - **Version:** 1.0 - **Model Developers:** Neural Magic Quantized version of Mistral-Small-24B-Instruct-2501. It achieves an average score of 78.88 on the OpenLLM benchmark (version 1), whereas the unquantized model achieves 79.45. ### Model Optimizations This model was obtained by quantizing the weights and activations to the FP8 data type, ready for inference with vLLM. This optimization reduces the number of bits per parameter from 16 to 8, reducing the disk size and GPU memory requirements by approximately 50%. Only the weights and activations of the linear operators within transformer blocks are quantized. ## Deployment ### Use with vLLM This model can be deployed efficiently using the vLLM backend, as shown in the example below. vLLM also supports OpenAI-compatible serving. See the documentation for more details.
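The vLLM example referenced above was stripped during extraction. As a stand-in, here is a minimal sketch using vLLM's offline `LLM.chat` API (assuming a recent `vllm` release that provides it; the prompt and sampling values are illustrative):

```python
# Minimal sketch: running the FP8-dynamic checkpoint with vLLM offline inference.
from vllm import LLM, SamplingParams

llm = LLM(model="RedHatAI/Mistral-Small-24B-Instruct-2501-FP8-dynamic")
params = SamplingParams(temperature=0.7, max_tokens=256)

messages = [{"role": "user", "content": "What does FP8 dynamic quantization change at inference time?"}]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```

For OpenAI-compatible serving, the same checkpoint name can instead be passed to the vLLM server entry point.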
## Creation This model was created with llm-compressor by running the code snippet below. ## Evaluation The model was evaluated on OpenLLM Leaderboard V1 and V2, using the following commands: OpenLLM Leaderboard V1: OpenLLM Leaderboard V2: ### Accuracy #### OpenLLM Leaderboard V1 evaluation scores | Metric | mistralai/Mistral-Small-24B-Instruct-2501 | nm-testing/Mistral-Small-24B-Instruct-2501-FP8-dynamic | |-----------------------------------------|:---------------------------------:|:-------------------------------------------:| | ARC-Challenge (Acc-Norm, 25-shot) | 72.18 | 71.76 | | GSM8K (Strict-Match, 5-shot) | 90.14 | 89.01 | | HellaSwag (Acc-Norm, 10-shot) | 85.05 | 84.65 | | MMLU (Acc, 5-shot) | 80.69 | 80.55 | | TruthfulQA (MC2, 0-shot) | 65.55 | 64.85 | | Winogrande (Acc, 5-shot) | 83.11 | 82.48 | | **Average Score** | **79.45** | **78.88** | | **Recovery (%)** | **100.00** | **99.28** | #### OpenLLM Leaderboard V2 evaluation scores | Metric | mistralai/Mistral-Small-24B-Instruct-2501 | nm-testing/Mistral-Small-24B-Instruct-2501-FP8-dynamic | |---------------------------------------------------------|:---------------------------------:|:-------------------------------------------:| | IFEval (Inst-and-Prompt Level Strict Acc, 0-shot) | 73.27 | 73.53 | | BBH (Acc-Norm, 3-shot) | 45.18 | 44.39 | | MMLU-Pro (Acc, 5-shot) | 38.83 | 37.28 | | **Average Score** | **52.42** | **51.73** | | **Recovery (%)** | **100.00** | **98.68** | | Math-Hard (Exact-Match, 4-shot) | 6.35 | 2.99 | | GPQA (Acc-Norm, 0-shot) | 8.29 | 6.97 | | MUSR (Acc-Norm, 0-shot) | 7.84 | 8.04 | Results on Math-Hard, GPQA, and MUSR are not considered for the accuracy recovery calculation because the unquantized model has close to random prediction accuracy (6.35, 8.29, 7.84), which doesn't provide a reliable baseline for recovery calculation.", + "model_explanation_gemini": "A quantized FP8 version of Mistral-Small-24B-Instruct-2501 optimized for efficient inference with reduced memory and disk requirements while maintaining near-original accuracy (78.88 vs. 79.45 on OpenLLM benchmark). \n\n**Features:** \n- **Architecture:** Mistral-Small-24B-Instruct-2501 \n- **Quantization:** FP8 weights and activations (50% smaller size/memory) \n- **Task:** Text-to" +} \ No newline at end of file diff --git a/model_data_json/Revai_reverb-diarization-v1.json b/model_data_json/Revai_reverb-diarization-v1.json new file mode 100644 index 0000000000000000000000000000000000000000..bed44eb5d7b4735b7734cfcb8ff4a4df52dd8e16 --- /dev/null +++ b/model_data_json/Revai_reverb-diarization-v1.json @@ -0,0 +1,15 @@ +{ + "model_id": "Revai/reverb-diarization-v1", + "downloads": 175090, + "tags": [ + "pyannote-audio", + "pytorch", + "reverb", + "automatic-speech-recognition", + "arxiv:2410.03930", + "license:other", + "region:us" + ], + "description": "--- license: other library_name: pyannote-audio tags: - reverb pipeline_tag: automatic-speech-recognition --- Details on the model, its performance, and more are available on arXiv. For more information on how to run this diarization model, see Reverb diarization V1 provides a 16.5% relative improvement in WDER (Word Diarization Error Rate) compared to the baseline pyannote3.0 model, evaluated on over 1,250,000 tokens across five different test suites.
| Test suite | WDER | |------------|------| | earnings21 | 0.047 | | rev16 | 0.077 | # Usage # Cite this Model If you use this model, please use the following citation: # License See LICENSE for details.", + "model_explanation_gemini": "\"Revai_reverb-diarization-v1 is a speech diarization model that improves word-level speaker attribution accuracy, reducing WDER by 16.5% compared to pyannote3.0.\"\n\nFeatures: \n- 16.5% lower WDER than pyannote3.0 \n- Evaluated on 1.25M+ tokens across 5 test suites \n- Specific performance metrics: 0.047 WDER (earnings21), 0.077" +} \ No newline at end of file diff --git a/model_data_json/Rostlab_prot_t5_xl_uniref50.json b/model_data_json/Rostlab_prot_t5_xl_uniref50.json new file mode 100644 index 0000000000000000000000000000000000000000..98dfaf63461c0b14f58b141c6d247ca078792430 --- /dev/null +++ b/model_data_json/Rostlab_prot_t5_xl_uniref50.json @@ -0,0 +1,18 @@ +{ + "model_id": "Rostlab/prot_t5_xl_uniref50", + "downloads": 73585, + "tags": [ + "transformers", + "pytorch", + "t5", + "text2text-generation", + "protein language model", + "dataset:UniRef50", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - protein language model datasets: - UniRef50 --- # ProtT5-XL-UniRef50 model Pretrained model on protein sequences using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. This model is trained on uppercase amino acids: it only works with capital letter amino acids. ## Model description ProtT5-XL-UniRef50 is based on the `t5-3b` model and was pretrained on a large corpus of protein sequences in a self-supervised fashion. This means it was pretrained on the raw protein sequences only, with no humans labelling them in any way (which is why it can use lots of publicly available data), with an automatic process to generate inputs and labels from those protein sequences. One important difference between this T5 model and the original T5 version is the denoising objective. The original T5-3B model was pretrained using a span denoising objective, while this model was pre-trained with a Bart-like MLM denoising objective. The masking probability is consistent with the original T5 training by randomly masking 15% of the amino acids in the input. It has been shown that the features extracted from this self-supervised model (LM-embeddings) captured important biophysical properties governing protein shape. This implied learning some of the grammar of the language of life realized in protein sequences. ## Intended uses & limitations The model could be used for protein feature extraction or to be fine-tuned on downstream tasks. We have noticed in some tasks one can gain more accuracy by fine-tuning the model rather than using it as a feature extractor. We have also noticed that for feature extraction, it's better to use the features extracted from the encoder, not from the decoder. ### How to use Here is how to use this model to extract the features of a given protein sequence in PyTorch (see the sketch after the Preprocessing notes below): ## Training data The ProtT5-XL-UniRef50 model was pretrained on UniRef50, a dataset consisting of 45 million protein sequences. ## Training procedure ### Preprocessing The protein sequences are uppercased and tokenized using a single space and a vocabulary size of 21. The rare amino acids \"U,Z,O,B\" were mapped to \"X\".
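The PyTorch snippet referenced under \"How to use\" above was stripped during extraction; the following is a minimal sketch of encoder-side feature extraction consistent with the preprocessing just described (the example sequences are illustrative):

```python
# Minimal sketch: per-residue embeddings from the ProtT5 encoder.
import re
import torch
from transformers import T5Tokenizer, T5EncoderModel

tokenizer = T5Tokenizer.from_pretrained("Rostlab/prot_t5_xl_uniref50", do_lower_case=False)
model = T5EncoderModel.from_pretrained("Rostlab/prot_t5_xl_uniref50").eval()

sequences = ["PRTEINO", "SEQWENCE"]  # illustrative protein sequences
# Uppercase, map rare residues U/Z/O/B to X, then space-separate the residues.
sequences = [" ".join(re.sub(r"[UZOB]", "X", seq.upper())) for seq in sequences]

batch = tokenizer(sequences, add_special_tokens=True, padding="longest", return_tensors="pt")
with torch.no_grad():
    # last_hidden_state holds one embedding vector per residue position.
    embeddings = model(input_ids=batch.input_ids, attention_mask=batch.attention_mask).last_hidden_state
```

Mean-pooling these residue embeddings over the non-padded positions yields a single fixed-size vector per protein.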
The inputs of the model are then of the form: The preprocessing step was performed on the fly, by cutting and padding the protein sequences up to 512 tokens. The details of the masking procedure for each sequence are as follows: - 15% of the amino acids are masked. - In 90% of the cases, the masked amino acids are replaced by the mask token. - In 10% of the cases, the masked amino acids are replaced by a random amino acid (different from the one they replace). ### Pretraining The model was trained on a single TPU Pod V2-256 for 991.5 thousand steps in total, using sequence length 512 (batch size 2k). It was trained using the ProtT5-XL-BFD model as an initial checkpoint, rather than training from scratch. It has a total of approximately 3B parameters and was trained using the encoder-decoder architecture. The optimizer used is AdaFactor with an inverse square root learning rate schedule for pre-training. ## Evaluation results When the model is used for feature extraction, this model achieves the following results: Test results: | Task/Dataset | secondary structure (3-states) | secondary structure (8-states) | Localization | Membrane | |:-----:|:-----:|:-----:|:-----:|:-----:| | CASP12 | 81 | 70 | | | | TS115 | 87 | 77 | | | | CB513 | 86 | 74 | | | | DeepLoc | | | 81 | 91 | ### BibTeX entry and citation info > Created by Ahmed Elnaggar/@Elnaggar_AI | LinkedIn", + "model_explanation_gemini": "Pretrained on UniRef50 protein sequences using masked language modeling, ProtT5-XL-UniRef50 extracts protein features or fine-tunes for downstream tasks, capturing biophysical properties through self-supervised learning on uppercase amino acids. \n\n**Features:** \n- Masked language modeling (MLM) objective (15% masking) \n- Trained on 45M UniRef50 protein sequences (uppercased) \n- 3B parameters, encoder-decoder architecture \n- Input:" +} \ No newline at end of file diff --git a/model_data_json/Snowflake_snowflake-arctic-embed-m-v2.0.json b/model_data_json/Snowflake_snowflake-arctic-embed-m-v2.0.json new file mode 100644 index 0000000000000000000000000000000000000000..6dc8d44ef9c1b5ff15a74deedd00efce1cf37b26 --- /dev/null +++ b/model_data_json/Snowflake_snowflake-arctic-embed-m-v2.0.json @@ -0,0 +1,99 @@ +{ + "model_id": "Snowflake/snowflake-arctic-embed-m-v2.0", + "downloads": 68866, + "tags": [ + "sentence-transformers", + "onnx", + "safetensors", + "gte", + "feature-extraction", + "sentence-similarity", + "mteb", + "arctic", + "snowflake-arctic-embed", + "transformers.js", + "custom_code", + "af", + "ar", + "az", + "be", + "bg", + "bn", + "ca", + "ceb", + "cs", + "cy", + "da", + "de", + "el", + "en", + "es", + "et", + "eu", + "fa", + "fi", + "fr", + "gl", + "gu", + "he", + "hi", + "hr", + "ht", + "hu", + "hy", + "id", + "is", + "it", + "ja", + "jv", + "ka", + "kk", + "km", + "kn", + "ko", + "ky", + "lo", + "lt", + "lv", + "mk", + "ml", + "mn", + "mr", + "ms", + "my", + "ne", + "nl", + "pa", + "pl", + "pt", + "qu", + "ro", + "ru", + "si", + "sk", + "sl", + "so", + "sq", + "sr", + "sv", + "sw", + "ta", + "te", + "th", + "tl", + "tr", + "uk", + "ur", + "vi", + "yo", + "zh", + "arxiv:2412.04506", + "license:apache-2.0", + "model-index", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - mteb - arctic - snowflake-arctic-embed - transformers.js license: apache-2.0 language: - af - ar - az - be - bg - bn - ca - ceb - cs - cy - da - de - el - en - es - et - eu - fa - fi -
fr - gl - gu - he - hi - hr - ht - hu - hy - id - is - it - ja - jv - ka - kk - km - kn - ko - ky - lo - lt - lv - mk - ml - mn - mr - ms - my - ne - nl - pa - pl - pt - qu - ro - ru - si - sk - sl - so - sq - sr - sv - sw - ta - te - th - tl - tr - uk - ur - vi - yo - zh model-index: - name: snowflake-arctic-embed-m-v2.0 results: - dataset: config: en-ext name: MTEB AmazonCounterfactualClassification (en-ext) revision: e8379541af4e31359cca9fbcf4b00f2671dba205 split: test type: mteb/amazon_counterfactual metrics: - type: accuracy value: 66.6867 - type: f1 value: 55.0373 - type: f1_weighted value: 73.07430000000001 - type: ap value: 18.077399999999997 - type: ap_weighted value: 18.077399999999997 - type: main_score value: 66.6867 task: type: Classification - dataset: config: en name: MTEB AmazonCounterfactualClassification (en) revision: e8379541af4e31359cca9fbcf4b00f2671dba205 split: test type: mteb/amazon_counterfactual metrics: - type: accuracy value: 66.194 - type: f1 value: 60.854299999999995 - type: f1_weighted value: 69.57339999999999 - type: ap value: 30.279099999999996 - type: ap_weighted value: 30.279099999999996 - type: main_score value: 66.194 task: type: Classification - dataset: config: default name: MTEB AmazonPolarityClassification (default) revision: e2d317d38cd51312af73b3d32a06d1a08b442046 split: test type: mteb/amazon_polarity metrics: - type: accuracy value: 70.3589 - type: f1 value: 70.0409 - type: f1_weighted value: 70.0409 - type: ap value: 64.81949999999999 - type: ap_weighted value: 64.81949999999999 - type: main_score value: 70.3589 task: type: Classification - dataset: config: en name: MTEB AmazonReviewsClassification (en) revision: 1399c76144fd37290681b995c656ef9b2e06e26d split: test type: mteb/amazon_reviews_multi metrics: - type: accuracy value: 33.766 - type: f1 value: 33.3656 - type: f1_weighted value: 33.3656 - type: main_score value: 33.766 task: type: Classification - dataset: config: default name: MTEB ArguAna (default) revision: c22ab2a51041ffd869aaddef7af8d8215647e41a split: test type: mteb/arguana metrics: - type: ndcg_at_1 value: 33.144 - type: ndcg_at_3 value: 47.909 - type: ndcg_at_5 value: 52.932 - type: ndcg_at_10 value: 58.011 - type: ndcg_at_20 value: 60.168 - type: ndcg_at_100 value: 60.928000000000004 - type: ndcg_at_1000 value: 61.046 - type: map_at_1 value: 33.144 - type: map_at_3 value: 44.156 - type: map_at_5 value: 46.951 - type: map_at_10 value: 49.071999999999996 - type: map_at_20 value: 49.692 - type: map_at_100 value: 49.809 - type: map_at_1000 value: 49.815 - type: recall_at_1 value: 33.144 - type: recall_at_3 value: 58.819 - type: recall_at_5 value: 70.982 - type: recall_at_10 value: 86.558 - type: recall_at_20 value: 94.879 - type: recall_at_100 value: 98.791 - type: recall_at_1000 value: 99.644 - type: precision_at_1 value: 33.144 - type: precision_at_3 value: 19.606 - type: precision_at_5 value: 14.196 - type: precision_at_10 value: 8.656 - type: precision_at_20 value: 4.744000000000001 - type: precision_at_100 value: 0.988 - type: precision_at_1000 value: 0.1 - type: mrr_at_1 value: 33.4993 - type: mrr_at_3 value: 44.393100000000004 - type: mrr_at_5 value: 47.131299999999996 - type: mrr_at_10 value: 49.264599999999994 - type: mrr_at_20 value: 49.8707 - type: mrr_at_100 value: 49.987700000000004 - type: mrr_at_1000 value: 49.993700000000004 - type: nauc_ndcg_at_1_max value: -10.8287 - type: nauc_ndcg_at_1_std value: -17.1177 - type: nauc_ndcg_at_1_diff1 value: 14.4508 - type: nauc_ndcg_at_3_max value: -7.7004 - type: 
nauc_ndcg_at_3_std value: -16.6705 - type: nauc_ndcg_at_3_diff1 value: 10.0448 - type: nauc_ndcg_at_5_max value: -7.0436 - type: nauc_ndcg_at_5_std value: -15.8744 - type: nauc_ndcg_at_5_diff1 value: 9.1132 - type: nauc_ndcg_at_10_max value: -7.4729 - type: nauc_ndcg_at_10_std value: -14.9349 - type: nauc_ndcg_at_10_diff1 value: 8.527700000000001 - type: nauc_ndcg_at_20_max value: -6.997000000000001 - type: nauc_ndcg_at_20_std value: -14.688399999999998 - type: nauc_ndcg_at_20_diff1 value: 9.7605 - type: nauc_ndcg_at_100_max value: -7.5599 - type: nauc_ndcg_at_100_std value: -15.0565 - type: nauc_ndcg_at_100_diff1 value: 10.2688 - type: nauc_ndcg_at_1000_max value: -7.675800000000001 - type: nauc_ndcg_at_1000_std value: -15.223500000000001 - type: nauc_ndcg_at_1000_diff1 value: 10.32 - type: nauc_map_at_1_max value: -10.8287 - type: nauc_map_at_1_std value: -17.1177 - type: nauc_map_at_1_diff1 value: 14.4508 - type: nauc_map_at_3_max value: -8.5473 - type: nauc_map_at_3_std value: -16.6674 - type: nauc_map_at_3_diff1 value: 11.1004 - type: nauc_map_at_5_max value: -8.1927 - type: nauc_map_at_5_std value: -16.2275 - type: nauc_map_at_5_diff1 value: 10.678600000000001 - type: nauc_map_at_10_max value: -8.3855 - type: nauc_map_at_10_std value: -15.8309 - type: nauc_map_at_10_diff1 value: 10.5414 - type: nauc_map_at_20_max value: -8.277700000000001 - type: nauc_map_at_20_std value: -15.824 - type: nauc_map_at_20_diff1 value: 10.8494 - type: nauc_map_at_100_max value: -8.3178 - type: nauc_map_at_100_std value: -15.848300000000002 - type: nauc_map_at_100_diff1 value: 10.9384 - type: nauc_map_at_1000_max value: -8.319799999999999 - type: nauc_map_at_1000_std value: -15.8522 - type: nauc_map_at_1000_diff1 value: 10.9401 - type: nauc_recall_at_1_max value: -10.8287 - type: nauc_recall_at_1_std value: -17.1177 - type: nauc_recall_at_1_diff1 value: 14.4508 - type: nauc_recall_at_3_max value: -5.0587 - type: nauc_recall_at_3_std value: -16.730800000000002 - type: nauc_recall_at_3_diff1 value: 6.8079 - type: nauc_recall_at_5_max value: -2.6783 - type: nauc_recall_at_5_std value: -14.5046 - type: nauc_recall_at_5_diff1 value: 3.096 - type: nauc_recall_at_10_max value: -1.5855000000000001 - type: nauc_recall_at_10_std value: -8.2276 - type: nauc_recall_at_10_diff1 value: -6.1741 - type: nauc_recall_at_20_max value: 15.754299999999999 - type: nauc_recall_at_20_std value: 8.1974 - type: nauc_recall_at_20_diff1 value: -4.9207 - type: nauc_recall_at_100_max value: 20.4574 - type: nauc_recall_at_100_std value: 36.3741 - type: nauc_recall_at_100_diff1 value: -7.9483 - type: nauc_recall_at_1000_max value: 21.6023 - type: nauc_recall_at_1000_std value: 68.7296 - type: nauc_recall_at_1000_diff1 value: -24.9261 - type: nauc_precision_at_1_max value: -10.8287 - type: nauc_precision_at_1_std value: -17.1177 - type: nauc_precision_at_1_diff1 value: 14.4508 - type: nauc_precision_at_3_max value: -5.0587 - type: nauc_precision_at_3_std value: -16.730800000000002 - type: nauc_precision_at_3_diff1 value: 6.8079 - type: nauc_precision_at_5_max value: -2.6783 - type: nauc_precision_at_5_std value: -14.5046 - type: nauc_precision_at_5_diff1 value: 3.096 - type: nauc_precision_at_10_max value: -1.5855000000000001 - type: nauc_precision_at_10_std value: -8.2276 - type: nauc_precision_at_10_diff1 value: -6.1741 - type: nauc_precision_at_20_max value: 15.754299999999999 - type: nauc_precision_at_20_std value: 8.1974 - type: nauc_precision_at_20_diff1 value: -4.9207 - type: nauc_precision_at_100_max value: 20.4574 - type: 
nauc_precision_at_100_std value: 36.3741 - type: nauc_precision_at_100_diff1 value: -7.9483 - type: nauc_precision_at_1000_max value: 21.6023 - type: nauc_precision_at_1000_std value: 68.7296 - type: nauc_precision_at_1000_diff1 value: -24.9261 - type: nauc_mrr_at_1_max value: -11.251999999999999 - type: nauc_mrr_at_1_std value: -17.4386 - type: nauc_mrr_at_1_diff1 value: 13.414200000000001 - type: nauc_mrr_at_3_max value: -9.7985 - type: nauc_mrr_at_3_std value: -16.650000000000002 - type: nauc_mrr_at_3_diff1 value: 9.5099 - type: nauc_mrr_at_5_max value: -9.064 - type: nauc_mrr_at_5_std value: -16.4409 - type: nauc_mrr_at_5_diff1 value: 9.4773 - type: nauc_mrr_at_10_max value: -9.310400000000001 - type: nauc_mrr_at_10_std value: -16.0546 - type: nauc_mrr_at_10_diff1 value: 9.2528 - type: nauc_mrr_at_20_max value: -9.223099999999999 - type: nauc_mrr_at_20_std value: -16.0659 - type: nauc_mrr_at_20_diff1 value: 9.5259 - type: nauc_mrr_at_100_max value: -9.2678 - type: nauc_mrr_at_100_std value: -16.0911 - type: nauc_mrr_at_100_diff1 value: 9.608600000000001 - type: nauc_mrr_at_1000_max value: -9.2699 - type: nauc_mrr_at_1000_std value: -16.095100000000002 - type: nauc_mrr_at_1000_diff1 value: 9.6099 - type: main_score value: 58.011 task: type: Retrieval - dataset: config: default name: MTEB ArxivClusteringP2P (default) revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d split: test type: mteb/arxiv-clustering-p2p metrics: - type: v_measure value: 44.684400000000004 - type: v_measure_std value: 13.5064 - type: main_score value: 44.684400000000004 task: type: Clustering - dataset: config: default name: MTEB ArxivClusteringS2S (default) revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 split: test type: mteb/arxiv-clustering-s2s metrics: - type: v_measure value: 35.0503 - type: v_measure_std value: 13.9543 - type: main_score value: 35.0503 task: type: Clustering - dataset: config: default name: MTEB AskUbuntuDupQuestions (default) revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 split: test type: mteb/askubuntudupquestions-reranking metrics: - type: map value: 60.648500000000006 - type: mrr value: 74.528 - type: nAUC_map_max value: 19.4239 - type: nAUC_map_std value: 20.0729 - type: nAUC_map_diff1 value: 10.0382 - type: nAUC_mrr_max value: 30.693199999999997 - type: nAUC_mrr_std value: 27.1279 - type: nAUC_mrr_diff1 value: 23.0291 - type: main_score value: 60.648500000000006 task: type: Reranking - dataset: config: default name: MTEB BIOSSES (default) revision: d3fb88f8f02e40887cd149695127462bbcf29b4a split: test type: mteb/biosses-sts metrics: - type: pearson value: 89.5081 - type: spearman value: 87.0568 - type: cosine_pearson value: 89.5081 - type: cosine_spearman value: 87.0568 - type: manhattan_pearson value: 88.1247 - type: manhattan_spearman value: 87.2556 - type: euclidean_pearson value: 88.3266 - type: euclidean_spearman value: 87.0568 - type: main_score value: 87.0568 task: type: STS - dataset: config: default name: MTEB Banking77Classification (default) revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 split: test type: mteb/banking77 metrics: - type: accuracy value: 80.18180000000001 - type: f1 value: 79.5538 - type: f1_weighted value: 79.5538 - type: main_score value: 80.18180000000001 task: type: Classification - dataset: config: default name: MTEB BiorxivClusteringP2P (default) revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 split: test type: mteb/biorxiv-clustering-p2p metrics: - type: v_measure value: 36.0126 - type: v_measure_std value: 0.47019999999999995 - type: 
main_score value: 36.0126 task: type: Clustering - dataset: config: default name: MTEB BiorxivClusteringS2S (default) revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 split: test type: mteb/biorxiv-clustering-s2s metrics: - type: v_measure value: 28.6331 - type: v_measure_std value: 0.8607999999999999 - type: main_score value: 28.6331 task: type: Clustering - dataset: config: default name: MTEB CQADupstackAndroidRetrieval (default) revision: f46a197baaae43b4f621051089b82a364682dfeb split: test type: mteb/cqadupstack-android metrics: - type: ndcg_at_1 value: 45.207 - type: ndcg_at_3 value: 51.31400000000001 - type: ndcg_at_5 value: 54.093999999999994 - type: ndcg_at_10 value: 56.31 - type: ndcg_at_20 value: 58.378 - type: ndcg_at_100 value: 61.307 - type: ndcg_at_1000 value: 62.724999999999994 - type: map_at_1 value: 37.732 - type: map_at_3 value: 46.263 - type: map_at_5 value: 48.553000000000004 - type: map_at_10 value: 49.984 - type: map_at_20 value: 50.888999999999996 - type: map_at_100 value: 51.568999999999996 - type: map_at_1000 value: 51.666999999999994 - type: recall_at_1 value: 37.732 - type: recall_at_3 value: 53.736 - type: recall_at_5 value: 60.95399999999999 - type: recall_at_10 value: 68.062 - type: recall_at_20 value: 75.149 - type: recall_at_100 value: 88.075 - type: recall_at_1000 value: 96.878 - type: precision_at_1 value: 45.207 - type: precision_at_3 value: 24.368000000000002 - type: precision_at_5 value: 17.854 - type: precision_at_10 value: 10.558 - type: precision_at_20 value: 6.23 - type: precision_at_100 value: 1.614 - type: precision_at_1000 value: 0.202 - type: mrr_at_1 value: 45.2074 - type: mrr_at_3 value: 52.9804 - type: mrr_at_5 value: 54.718599999999995 - type: mrr_at_10 value: 55.5713 - type: mrr_at_20 value: 55.94 - type: mrr_at_100 value: 56.21699999999999 - type: mrr_at_1000 value: 56.2504 - type: nauc_ndcg_at_1_max value: 43.7697 - type: nauc_ndcg_at_1_std value: -3.9530000000000003 - type: nauc_ndcg_at_1_diff1 value: 57.75320000000001 - type: nauc_ndcg_at_3_max value: 42.7238 - type: nauc_ndcg_at_3_std value: -3.5654 - type: nauc_ndcg_at_3_diff1 value: 53.552299999999995 - type: nauc_ndcg_at_5_max value: 43.115500000000004 - type: nauc_ndcg_at_5_std value: -2.1444 - type: nauc_ndcg_at_5_diff1 value: 53.130500000000005 - type: nauc_ndcg_at_10_max value: 43.0188 - type: nauc_ndcg_at_10_std value: -3.1515 - type: nauc_ndcg_at_10_diff1 value: 53.593199999999996 - type: nauc_ndcg_at_20_max value: 43.4617 - type: nauc_ndcg_at_20_std value: -2.9284 - type: nauc_ndcg_at_20_diff1 value: 53.28000000000001 - type: nauc_ndcg_at_100_max value: 44.0704 - type: nauc_ndcg_at_100_std value: -0.5772 - type: nauc_ndcg_at_100_diff1 value: 53.439899999999994 - type: nauc_ndcg_at_1000_max value: 44.256099999999996 - type: nauc_ndcg_at_1000_std value: -1.1407 - type: nauc_ndcg_at_1000_diff1 value: 53.8728 - type: nauc_map_at_1_max value: 36.613800000000005 - type: nauc_map_at_1_std value: -5.8014 - type: nauc_map_at_1_diff1 value: 59.0186 - type: nauc_map_at_3_max value: 40.8666 - type: nauc_map_at_3_std value: -4.886299999999999 - type: nauc_map_at_3_diff1 value: 55.324600000000004 - type: nauc_map_at_5_max value: 41.9942 - type: nauc_map_at_5_std value: -3.9361 - type: nauc_map_at_5_diff1 value: 54.8805 - type: nauc_map_at_10_max value: 42.1621 - type: nauc_map_at_10_std value: -4.3264 - type: nauc_map_at_10_diff1 value: 55.0133 - type: nauc_map_at_20_max value: 42.5837 - type: nauc_map_at_20_std value: -3.8526 - type: nauc_map_at_20_diff1 value: 54.895700000000005 - 
type: nauc_map_at_100_max value: 42.7645 - type: nauc_map_at_100_std value: -3.4568000000000003 - type: nauc_map_at_100_diff1 value: 54.98030000000001 - type: nauc_map_at_1000_max value: 42.7915 - type: nauc_map_at_1000_std value: -3.4715999999999996 - type: nauc_map_at_1000_diff1 value: 55.0117 - type: nauc_recall_at_1_max value: 36.613800000000005 - type: nauc_recall_at_1_std value: -5.8014 - type: nauc_recall_at_1_diff1 value: 59.0186 - type: nauc_recall_at_3_max value: 39.3588 - type: nauc_recall_at_3_std value: -3.29 - type: nauc_recall_at_3_diff1 value: 50.1633 - type: nauc_recall_at_5_max value: 39.7596 - type: nauc_recall_at_5_std value: 0.4483 - type: nauc_recall_at_5_diff1 value: 47.598600000000005 - type: nauc_recall_at_10_max value: 37.5367 - type: nauc_recall_at_10_std value: -2.5935 - type: nauc_recall_at_10_diff1 value: 46.824799999999996 - type: nauc_recall_at_20_max value: 38.521100000000004 - type: nauc_recall_at_20_std value: -2.5774 - type: nauc_recall_at_20_diff1 value: 44.099 - type: nauc_recall_at_100_max value: 44.043 - type: nauc_recall_at_100_std value: 22.724 - type: nauc_recall_at_100_diff1 value: 40.4973 - type: nauc_recall_at_1000_max value: 59.780100000000004 - type: nauc_recall_at_1000_std value: 52.512 - type: nauc_recall_at_1000_diff1 value: 45.2841 - type: nauc_precision_at_1_max value: 43.7697 - type: nauc_precision_at_1_std value: -3.9530000000000003 - type: nauc_precision_at_1_diff1 value: 57.75320000000001 - type: nauc_precision_at_3_max value: 37.486000000000004 - type: nauc_precision_at_3_std value: -1.0619 - type: nauc_precision_at_3_diff1 value: 28.264699999999998 - type: nauc_precision_at_5_max value: 31.613599999999998 - type: nauc_precision_at_5_std value: 3.6863 - type: nauc_precision_at_5_diff1 value: 16.0838 - type: nauc_precision_at_10_max value: 23.4082 - type: nauc_precision_at_10_std value: 3.3977 - type: nauc_precision_at_10_diff1 value: 7.3632 - type: nauc_precision_at_20_max value: 16.7236 - type: nauc_precision_at_20_std value: 5.7516 - type: nauc_precision_at_20_diff1 value: -0.8460000000000001 - type: nauc_precision_at_100_max value: 3.9043 - type: nauc_precision_at_100_std value: 7.7799 - type: nauc_precision_at_100_diff1 value: -11.0756 - type: nauc_precision_at_1000_max value: -7.728 - type: nauc_precision_at_1000_std value: -1.9303000000000001 - type: nauc_precision_at_1000_diff1 value: -17.025000000000002 - type: nauc_mrr_at_1_max value: 43.7697 - type: nauc_mrr_at_1_std value: -3.9530000000000003 - type: nauc_mrr_at_1_diff1 value: 57.75320000000001 - type: nauc_mrr_at_3_max value: 44.8007 - type: nauc_mrr_at_3_std value: -2.9754 - type: nauc_mrr_at_3_diff1 value: 53.7928 - type: nauc_mrr_at_5_max value: 44.860499999999995 - type: nauc_mrr_at_5_std value: -1.7683 - type: nauc_mrr_at_5_diff1 value: 53.5852 - type: nauc_mrr_at_10_max value: 44.8025 - type: nauc_mrr_at_10_std value: -2.1691 - type: nauc_mrr_at_10_diff1 value: 53.880300000000005 - type: nauc_mrr_at_20_max value: 44.7838 - type: nauc_mrr_at_20_std value: -2.3529 - type: nauc_mrr_at_20_diff1 value: 53.890499999999996 - type: nauc_mrr_at_100_max value: 44.7905 - type: nauc_mrr_at_100_std value: -2.1931 - type: nauc_mrr_at_100_diff1 value: 53.9458 - type: nauc_mrr_at_1000_max value: 44.7943 - type: nauc_mrr_at_1000_std value: -2.2006 - type: nauc_mrr_at_1000_diff1 value: 53.954800000000006 - type: main_score value: 56.31 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackEnglishRetrieval (default) revision: ad9991cb51e31e31e430383c75ffb2885547b5f0 
split: test type: mteb/cqadupstack-english metrics: - type: ndcg_at_1 value: 44.840999999999994 - type: ndcg_at_3 value: 49.217 - type: ndcg_at_5 value: 50.934000000000005 - type: ndcg_at_10 value: 53.142999999999994 - type: ndcg_at_20 value: 54.778000000000006 - type: ndcg_at_100 value: 57.241 - type: ndcg_at_1000 value: 58.967999999999996 - type: map_at_1 value: 35.675000000000004 - type: map_at_3 value: 44.017 - type: map_at_5 value: 45.786 - type: map_at_10 value: 47.204 - type: map_at_20 value: 47.946 - type: map_at_100 value: 48.564 - type: map_at_1000 value: 48.684 - type: recall_at_1 value: 35.675000000000004 - type: recall_at_3 value: 50.641000000000005 - type: recall_at_5 value: 55.897 - type: recall_at_10 value: 62.873999999999995 - type: recall_at_20 value: 68.766 - type: recall_at_100 value: 79.90899999999999 - type: recall_at_1000 value: 90.78399999999999 - type: precision_at_1 value: 44.840999999999994 - type: precision_at_3 value: 23.843 - type: precision_at_5 value: 16.637 - type: precision_at_10 value: 9.968 - type: precision_at_20 value: 5.863 - type: precision_at_100 value: 1.562 - type: precision_at_1000 value: 0.197 - type: mrr_at_1 value: 44.840799999999994 - type: mrr_at_3 value: 51.634800000000006 - type: mrr_at_5 value: 52.746300000000005 - type: mrr_at_10 value: 53.6323 - type: mrr_at_20 value: 53.9565 - type: mrr_at_100 value: 54.198 - type: mrr_at_1000 value: 54.234899999999996 - type: nauc_ndcg_at_1_max value: 50.3827 - type: nauc_ndcg_at_1_std value: -0.8129000000000001 - type: nauc_ndcg_at_1_diff1 value: 59.7518 - type: nauc_ndcg_at_3_max value: 49.6676 - type: nauc_ndcg_at_3_std value: -2.1006 - type: nauc_ndcg_at_3_diff1 value: 52.7373 - type: nauc_ndcg_at_5_max value: 50.5186 - type: nauc_ndcg_at_5_std value: -1.5242 - type: nauc_ndcg_at_5_diff1 value: 53.234300000000005 - type: nauc_ndcg_at_10_max value: 50.5247 - type: nauc_ndcg_at_10_std value: -1.2392 - type: nauc_ndcg_at_10_diff1 value: 53.1045 - type: nauc_ndcg_at_20_max value: 51.3292 - type: nauc_ndcg_at_20_std value: -0.06570000000000001 - type: nauc_ndcg_at_20_diff1 value: 53.48349999999999 - type: nauc_ndcg_at_100_max value: 51.588100000000004 - type: nauc_ndcg_at_100_std value: 1.9398 - type: nauc_ndcg_at_100_diff1 value: 52.755399999999995 - type: nauc_ndcg_at_1000_max value: 51.5558 - type: nauc_ndcg_at_1000_std value: 2.3446000000000002 - type: nauc_ndcg_at_1000_diff1 value: 52.9377 - type: nauc_map_at_1_max value: 40.0957 - type: nauc_map_at_1_std value: -11.972 - type: nauc_map_at_1_diff1 value: 61.88249999999999 - type: nauc_map_at_3_max value: 45.6088 - type: nauc_map_at_3_std value: -9.249699999999999 - type: nauc_map_at_3_diff1 value: 56.260299999999994 - type: nauc_map_at_5_max value: 47.2279 - type: nauc_map_at_5_std value: -7.407500000000001 - type: nauc_map_at_5_diff1 value: 55.7894 - type: nauc_map_at_10_max value: 48.0167 - type: nauc_map_at_10_std value: -6.1371 - type: nauc_map_at_10_diff1 value: 55.4646 - type: nauc_map_at_20_max value: 48.6024 - type: nauc_map_at_20_std value: -5.1559 - type: nauc_map_at_20_diff1 value: 55.338100000000004 - type: nauc_map_at_100_max value: 48.993700000000004 - type: nauc_map_at_100_std value: -4.1873000000000005 - type: nauc_map_at_100_diff1 value: 55.1214 - type: nauc_map_at_1000_max value: 49.054500000000004 - type: nauc_map_at_1000_std value: -4.0072 - type: nauc_map_at_1000_diff1 value: 55.109300000000005 - type: nauc_recall_at_1_max value: 40.0957 - type: nauc_recall_at_1_std value: -11.972 - type: nauc_recall_at_1_diff1 value: 
61.88249999999999 - type: nauc_recall_at_3_max value: 44.188 - type: nauc_recall_at_3_std value: -8.3756 - type: nauc_recall_at_3_diff1 value: 48.6817 - type: nauc_recall_at_5_max value: 46.6706 - type: nauc_recall_at_5_std value: -4.1561 - type: nauc_recall_at_5_diff1 value: 47.6738 - type: nauc_recall_at_10_max value: 47.614200000000004 - type: nauc_recall_at_10_std value: -1.1676 - type: nauc_recall_at_10_diff1 value: 45.628099999999996 - type: nauc_recall_at_20_max value: 51.490100000000005 - type: nauc_recall_at_20_std value: 5.111000000000001 - type: nauc_recall_at_20_diff1 value: 45.730199999999996 - type: nauc_recall_at_100_max value: 54.0635 - type: nauc_recall_at_100_std value: 19.8381 - type: nauc_recall_at_100_diff1 value: 39.1924 - type: nauc_recall_at_1000_max value: 56.3672 - type: nauc_recall_at_1000_std value: 33.9274 - type: nauc_recall_at_1000_diff1 value: 38.1103 - type: nauc_precision_at_1_max value: 50.3827 - type: nauc_precision_at_1_std value: -0.8129000000000001 - type: nauc_precision_at_1_diff1 value: 59.7518 - type: nauc_precision_at_3_max value: 46.281299999999995 - type: nauc_precision_at_3_std value: 14.7166 - type: nauc_precision_at_3_diff1 value: 24.211 - type: nauc_precision_at_5_max value: 44.466899999999995 - type: nauc_precision_at_5_std value: 22.5103 - type: nauc_precision_at_5_diff1 value: 15.746099999999998 - type: nauc_precision_at_10_max value: 38.0804 - type: nauc_precision_at_10_std value: 29.677999999999997 - type: nauc_precision_at_10_diff1 value: 4.886299999999999 - type: nauc_precision_at_20_max value: 32.302 - type: nauc_precision_at_20_std value: 34.8443 - type: nauc_precision_at_20_diff1 value: -2.9212 - type: nauc_precision_at_100_max value: 21.4725 - type: nauc_precision_at_100_std value: 41.8747 - type: nauc_precision_at_100_diff1 value: -14.976600000000001 - type: nauc_precision_at_1000_max value: 10.3891 - type: nauc_precision_at_1000_std value: 39.4181 - type: nauc_precision_at_1000_diff1 value: -21.9914 - type: nauc_mrr_at_1_max value: 50.3827 - type: nauc_mrr_at_1_std value: -0.8129000000000001 - type: nauc_mrr_at_1_diff1 value: 59.7518 - type: nauc_mrr_at_3_max value: 51.9937 - type: nauc_mrr_at_3_std value: 2.1604 - type: nauc_mrr_at_3_diff1 value: 54.58539999999999 - type: nauc_mrr_at_5_max value: 52.39319999999999 - type: nauc_mrr_at_5_std value: 2.8171 - type: nauc_mrr_at_5_diff1 value: 54.825100000000006 - type: nauc_mrr_at_10_max value: 52.2047 - type: nauc_mrr_at_10_std value: 2.6525 - type: nauc_mrr_at_10_diff1 value: 54.703500000000005 - type: nauc_mrr_at_20_max value: 52.251999999999995 - type: nauc_mrr_at_20_std value: 2.7842 - type: nauc_mrr_at_20_diff1 value: 54.76689999999999 - type: nauc_mrr_at_100_max value: 52.2776 - type: nauc_mrr_at_100_std value: 2.9701999999999997 - type: nauc_mrr_at_100_diff1 value: 54.712799999999994 - type: nauc_mrr_at_1000_max value: 52.274699999999996 - type: nauc_mrr_at_1000_std value: 2.9652000000000003 - type: nauc_mrr_at_1000_diff1 value: 54.7296 - type: main_score value: 53.142999999999994 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackGamingRetrieval (default) revision: 4885aa143210c98657558c04aaf3dc47cfb54340 split: test type: mteb/cqadupstack-gaming metrics: - type: ndcg_at_1 value: 53.542 - type: ndcg_at_3 value: 60.098 - type: ndcg_at_5 value: 62.515 - type: ndcg_at_10 value: 65.315 - type: ndcg_at_20 value: 66.683 - type: ndcg_at_100 value: 68.47800000000001 - type: ndcg_at_1000 value: 69.329 - type: map_at_1 value: 47.135 - type: map_at_3 value: 56.548 
- type: map_at_5 value: 58.306000000000004 - type: map_at_10 value: 59.819 - type: map_at_20 value: 60.328 - type: map_at_100 value: 60.653999999999996 - type: map_at_1000 value: 60.699000000000005 - type: recall_at_1 value: 47.135 - type: recall_at_3 value: 64.371 - type: recall_at_5 value: 70.293 - type: recall_at_10 value: 78.346 - type: recall_at_20 value: 83.369 - type: recall_at_100 value: 92.04599999999999 - type: recall_at_1000 value: 97.933 - type: precision_at_1 value: 53.542 - type: precision_at_3 value: 26.395000000000003 - type: precision_at_5 value: 17.806 - type: precision_at_10 value: 10.238 - type: precision_at_20 value: 5.586 - type: precision_at_100 value: 1.266 - type: precision_at_1000 value: 0.13799999999999998 - type: mrr_at_1 value: 53.5423 - type: mrr_at_3 value: 60.595600000000005 - type: mrr_at_5 value: 61.931000000000004 - type: mrr_at_10 value: 62.8406 - type: mrr_at_20 value: 63.1667 - type: mrr_at_100 value: 63.347699999999996 - type: mrr_at_1000 value: 63.368100000000005 - type: nauc_ndcg_at_1_max value: 50.004599999999996 - type: nauc_ndcg_at_1_std value: -4.3123000000000005 - type: nauc_ndcg_at_1_diff1 value: 61.1973 - type: nauc_ndcg_at_3_max value: 48.65 - type: nauc_ndcg_at_3_std value: -6.0419 - type: nauc_ndcg_at_3_diff1 value: 56.712700000000005 - type: nauc_ndcg_at_5_max value: 50.0908 - type: nauc_ndcg_at_5_std value: -4.4674 - type: nauc_ndcg_at_5_diff1 value: 56.216 - type: nauc_ndcg_at_10_max value: 50.578 - type: nauc_ndcg_at_10_std value: -2.661 - type: nauc_ndcg_at_10_diff1 value: 55.9162 - type: nauc_ndcg_at_20_max value: 51.3801 - type: nauc_ndcg_at_20_std value: -0.8059999999999999 - type: nauc_ndcg_at_20_diff1 value: 55.8654 - type: nauc_ndcg_at_100_max value: 51.4594 - type: nauc_ndcg_at_100_std value: -0.3524 - type: nauc_ndcg_at_100_diff1 value: 56.131699999999995 - type: nauc_ndcg_at_1000_max value: 51.6105 - type: nauc_ndcg_at_1000_std value: -0.8832 - type: nauc_ndcg_at_1000_diff1 value: 56.6507 - type: nauc_map_at_1_max value: 42.7316 - type: nauc_map_at_1_std value: -6.979100000000001 - type: nauc_map_at_1_diff1 value: 61.6382 - type: nauc_map_at_3_max value: 47.6139 - type: nauc_map_at_3_std value: -7.0931 - type: nauc_map_at_3_diff1 value: 58.2923 - type: nauc_map_at_5_max value: 48.6039 - type: nauc_map_at_5_std value: -5.9601 - type: nauc_map_at_5_diff1 value: 57.7052 - type: nauc_map_at_10_max value: 49.2631 - type: nauc_map_at_10_std value: -4.808 - type: nauc_map_at_10_diff1 value: 57.5979 - type: nauc_map_at_20_max value: 49.6783 - type: nauc_map_at_20_std value: -4.0106 - type: nauc_map_at_20_diff1 value: 57.5781 - type: nauc_map_at_100_max value: 49.775000000000006 - type: nauc_map_at_100_std value: -3.8082 - type: nauc_map_at_100_diff1 value: 57.6013 - type: nauc_map_at_1000_max value: 49.8135 - type: nauc_map_at_1000_std value: -3.7974 - type: nauc_map_at_1000_diff1 value: 57.6323 - type: nauc_recall_at_1_max value: 42.7316 - type: nauc_recall_at_1_std value: -6.979100000000001 - type: nauc_recall_at_1_diff1 value: 61.6382 - type: nauc_recall_at_3_max value: 46.1138 - type: nauc_recall_at_3_std value: -8.6906 - type: nauc_recall_at_3_diff1 value: 52.6263 - type: nauc_recall_at_5_max value: 49.074200000000005 - type: nauc_recall_at_5_std value: -4.5975 - type: nauc_recall_at_5_diff1 value: 49.994 - type: nauc_recall_at_10_max value: 49.696 - type: nauc_recall_at_10_std value: 2.049 - type: nauc_recall_at_10_diff1 value: 46.7897 - type: nauc_recall_at_20_max value: 54.03980000000001 - type: nauc_recall_at_20_std value: 
14.4898 - type: nauc_recall_at_20_diff1 value: 43.8642 - type: nauc_recall_at_100_max value: 57.23629999999999 - type: nauc_recall_at_100_std value: 32.6507 - type: nauc_recall_at_100_diff1 value: 38.4662 - type: nauc_recall_at_1000_max value: 81.5918 - type: nauc_recall_at_1000_std value: 67.0848 - type: nauc_recall_at_1000_diff1 value: 40.5123 - type: nauc_precision_at_1_max value: 50.004599999999996 - type: nauc_precision_at_1_std value: -4.3123000000000005 - type: nauc_precision_at_1_diff1 value: 61.1973 - type: nauc_precision_at_3_max value: 41.0359 - type: nauc_precision_at_3_std value: 2.2363 - type: nauc_precision_at_3_diff1 value: 26.9914 - type: nauc_precision_at_5_max value: 38.3114 - type: nauc_precision_at_5_std value: 8.7643 - type: nauc_precision_at_5_diff1 value: 17.0673 - type: nauc_precision_at_10_max value: 31.1391 - type: nauc_precision_at_10_std value: 17.1411 - type: nauc_precision_at_10_diff1 value: 4.9287 - type: nauc_precision_at_20_max value: 27.7595 - type: nauc_precision_at_20_std value: 25.470399999999998 - type: nauc_precision_at_20_diff1 value: -2.6803 - type: nauc_precision_at_100_max value: 18.2146 - type: nauc_precision_at_100_std value: 29.244300000000003 - type: nauc_precision_at_100_diff1 value: -13.083 - type: nauc_precision_at_1000_max value: 13.5621 - type: nauc_precision_at_1000_std value: 26.3405 - type: nauc_precision_at_1000_diff1 value: -15.398200000000001 - type: nauc_mrr_at_1_max value: 50.004599999999996 - type: nauc_mrr_at_1_std value: -4.3123000000000005 - type: nauc_mrr_at_1_diff1 value: 61.1973 - type: nauc_mrr_at_3_max value: 50.114599999999996 - type: nauc_mrr_at_3_std value: -4.7759 - type: nauc_mrr_at_3_diff1 value: 57.9624 - type: nauc_mrr_at_5_max value: 50.956900000000005 - type: nauc_mrr_at_5_std value: -3.7144999999999997 - type: nauc_mrr_at_5_diff1 value: 57.784400000000005 - type: nauc_mrr_at_10_max value: 50.8112 - type: nauc_mrr_at_10_std value: -3.3526 - type: nauc_mrr_at_10_diff1 value: 57.674499999999995 - type: nauc_mrr_at_20_max value: 50.9425 - type: nauc_mrr_at_20_std value: -2.9598 - type: nauc_mrr_at_20_diff1 value: 57.6704 - type: nauc_mrr_at_100_max value: 50.901799999999994 - type: nauc_mrr_at_100_std value: -3.0112 - type: nauc_mrr_at_100_diff1 value: 57.736200000000004 - type: nauc_mrr_at_1000_max value: 50.901399999999995 - type: nauc_mrr_at_1000_std value: -3.0314 - type: nauc_mrr_at_1000_diff1 value: 57.747400000000006 - type: main_score value: 65.315 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackGisRetrieval (default) revision: 5003b3064772da1887988e05400cf3806fe491f2 split: test type: mteb/cqadupstack-gis metrics: - type: ndcg_at_1 value: 33.898 - type: ndcg_at_3 value: 39.875 - type: ndcg_at_5 value: 42.455999999999996 - type: ndcg_at_10 value: 45.4 - type: ndcg_at_20 value: 47.831 - type: ndcg_at_100 value: 50.428 - type: ndcg_at_1000 value: 52.037 - type: map_at_1 value: 31.357000000000003 - type: map_at_3 value: 37.358999999999995 - type: map_at_5 value: 38.948 - type: map_at_10 value: 40.243 - type: map_at_20 value: 40.98 - type: map_at_100 value: 41.349999999999994 - type: map_at_1000 value: 41.418 - type: recall_at_1 value: 31.357000000000003 - type: recall_at_3 value: 44.324000000000005 - type: recall_at_5 value: 50.449 - type: recall_at_10 value: 59.17400000000001 - type: recall_at_20 value: 68.272 - type: recall_at_100 value: 81.672 - type: recall_at_1000 value: 93.572 - type: precision_at_1 value: 33.898 - type: precision_at_3 value: 16.648 - type: precision_at_5 value: 
11.503 - type: precision_at_10 value: 6.847 - type: precision_at_20 value: 3.9890000000000003 - type: precision_at_100 value: 0.9809999999999999 - type: precision_at_1000 value: 0.11499999999999999 - type: mrr_at_1 value: 33.8983 - type: mrr_at_3 value: 39.8117 - type: mrr_at_5 value: 41.2354 - type: mrr_at_10 value: 42.4212 - type: mrr_at_20 value: 43.0404 - type: mrr_at_100 value: 43.3429 - type: mrr_at_1000 value: 43.3894 - type: nauc_ndcg_at_1_max value: 36.1482 - type: nauc_ndcg_at_1_std value: -4.471 - type: nauc_ndcg_at_1_diff1 value: 44.1333 - type: nauc_ndcg_at_3_max value: 35.404 - type: nauc_ndcg_at_3_std value: -4.487 - type: nauc_ndcg_at_3_diff1 value: 40.3399 - type: nauc_ndcg_at_5_max value: 35.0036 - type: nauc_ndcg_at_5_std value: -4.0964 - type: nauc_ndcg_at_5_diff1 value: 38.2164 - type: nauc_ndcg_at_10_max value: 34.7255 - type: nauc_ndcg_at_10_std value: -2.9356 - type: nauc_ndcg_at_10_diff1 value: 37.3216 - type: nauc_ndcg_at_20_max value: 35.5433 - type: nauc_ndcg_at_20_std value: -1.8858 - type: nauc_ndcg_at_20_diff1 value: 36.6106 - type: nauc_ndcg_at_100_max value: 35.9643 - type: nauc_ndcg_at_100_std value: -1.6303 - type: nauc_ndcg_at_100_diff1 value: 37.515100000000004 - type: nauc_ndcg_at_1000_max value: 35.9222 - type: nauc_ndcg_at_1000_std value: -2.1452999999999998 - type: nauc_ndcg_at_1000_diff1 value: 37.472100000000005 - type: nauc_map_at_1_max value: 32.413599999999995 - type: nauc_map_at_1_std value: -7.391300000000001 - type: nauc_map_at_1_diff1 value: 45.5299 - type: nauc_map_at_3_max value: 34.1688 - type: nauc_map_at_3_std value: -5.6375 - type: nauc_map_at_3_diff1 value: 41.5371 - type: nauc_map_at_5_max value: 34.2057 - type: nauc_map_at_5_std value: -5.4512 - type: nauc_map_at_5_diff1 value: 40.3839 - type: nauc_map_at_10_max value: 34.3355 - type: nauc_map_at_10_std value: -4.7743 - type: nauc_map_at_10_diff1 value: 40.1027 - type: nauc_map_at_20_max value: 34.638400000000004 - type: nauc_map_at_20_std value: -4.4951 - type: nauc_map_at_20_diff1 value: 39.8905 - type: nauc_map_at_100_max value: 34.6621 - type: nauc_map_at_100_std value: -4.4568 - type: nauc_map_at_100_diff1 value: 39.9854 - type: nauc_map_at_1000_max value: 34.6674 - type: nauc_map_at_1000_std value: -4.4651000000000005 - type: nauc_map_at_1000_diff1 value: 39.9739 - type: nauc_recall_at_1_max value: 32.413599999999995 - type: nauc_recall_at_1_std value: -7.391300000000001 - type: nauc_recall_at_1_diff1 value: 45.5299 - type: nauc_recall_at_3_max value: 34.374500000000005 - type: nauc_recall_at_3_std value: -3.8977999999999997 - type: nauc_recall_at_3_diff1 value: 36.9855 - type: nauc_recall_at_5_max value: 33.5608 - type: nauc_recall_at_5_std value: -2.9009 - type: nauc_recall_at_5_diff1 value: 31.9638 - type: nauc_recall_at_10_max value: 32.1813 - type: nauc_recall_at_10_std value: 0.8024999999999999 - type: nauc_recall_at_10_diff1 value: 28.3153 - type: nauc_recall_at_20_max value: 35.0617 - type: nauc_recall_at_20_std value: 6.531199999999999 - type: nauc_recall_at_20_diff1 value: 23.6762 - type: nauc_recall_at_100_max value: 38.9147 - type: nauc_recall_at_100_std value: 12.4753 - type: nauc_recall_at_100_diff1 value: 26.1627 - type: nauc_recall_at_1000_max value: 45.8191 - type: nauc_recall_at_1000_std value: 17.1419 - type: nauc_recall_at_1000_diff1 value: 13.2284 - type: nauc_precision_at_1_max value: 36.1482 - type: nauc_precision_at_1_std value: -4.471 - type: nauc_precision_at_1_diff1 value: 44.1333 - type: nauc_precision_at_3_max value: 38.315 - type: 
nauc_precision_at_3_std value: -0.16019999999999998 - type: nauc_precision_at_3_diff1 value: 32.4158 - type: nauc_precision_at_5_max value: 36.3912 - type: nauc_precision_at_5_std value: 0.9605 - type: nauc_precision_at_5_diff1 value: 25.7513 - type: nauc_precision_at_10_max value: 34.043 - type: nauc_precision_at_10_std value: 5.6308 - type: nauc_precision_at_10_diff1 value: 20.5638 - type: nauc_precision_at_20_max value: 34.5796 - type: nauc_precision_at_20_std value: 10.0006 - type: nauc_precision_at_20_diff1 value: 13.069500000000001 - type: nauc_precision_at_100_max value: 27.5607 - type: nauc_precision_at_100_std value: 13.173399999999999 - type: nauc_precision_at_100_diff1 value: 6.1834 - type: nauc_precision_at_1000_max value: 15.5825 - type: nauc_precision_at_1000_std value: 9.9148 - type: nauc_precision_at_1000_diff1 value: -8.7873 - type: nauc_mrr_at_1_max value: 36.1482 - type: nauc_mrr_at_1_std value: -4.471 - type: nauc_mrr_at_1_diff1 value: 44.1333 - type: nauc_mrr_at_3_max value: 37.059799999999996 - type: nauc_mrr_at_3_std value: -2.7984999999999998 - type: nauc_mrr_at_3_diff1 value: 40.3801 - type: nauc_mrr_at_5_max value: 36.921 - type: nauc_mrr_at_5_std value: -2.5107 - type: nauc_mrr_at_5_diff1 value: 39.3331 - type: nauc_mrr_at_10_max value: 36.5977 - type: nauc_mrr_at_10_std value: -2.3744 - type: nauc_mrr_at_10_diff1 value: 38.851200000000006 - type: nauc_mrr_at_20_max value: 36.7083 - type: nauc_mrr_at_20_std value: -2.164 - type: nauc_mrr_at_20_diff1 value: 38.729200000000006 - type: nauc_mrr_at_100_max value: 36.7448 - type: nauc_mrr_at_100_std value: -2.1399999999999997 - type: nauc_mrr_at_100_diff1 value: 38.8403 - type: nauc_mrr_at_1000_max value: 36.742200000000004 - type: nauc_mrr_at_1000_std value: -2.1506999999999996 - type: nauc_mrr_at_1000_diff1 value: 38.8393 - type: main_score value: 45.4 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackMathematicaRetrieval (default) revision: 90fceea13679c63fe563ded68f3b6f06e50061de split: test type: mteb/cqadupstack-mathematica metrics: - type: ndcg_at_1 value: 25.124000000000002 - type: ndcg_at_3 value: 29.798000000000002 - type: ndcg_at_5 value: 32.112 - type: ndcg_at_10 value: 34.926 - type: ndcg_at_20 value: 37.317 - type: ndcg_at_100 value: 40.903 - type: ndcg_at_1000 value: 43.18 - type: map_at_1 value: 20.279 - type: map_at_3 value: 26.551000000000002 - type: map_at_5 value: 28.051 - type: map_at_10 value: 29.37 - type: map_at_20 value: 30.085 - type: map_at_100 value: 30.668 - type: map_at_1000 value: 30.774 - type: recall_at_1 value: 20.279 - type: recall_at_3 value: 33.043 - type: recall_at_5 value: 38.991 - type: recall_at_10 value: 47.355999999999995 - type: recall_at_20 value: 55.873 - type: recall_at_100 value: 72.90100000000001 - type: recall_at_1000 value: 88.678 - type: precision_at_1 value: 25.124000000000002 - type: precision_at_3 value: 14.221 - type: precision_at_5 value: 10.323 - type: precision_at_10 value: 6.381 - type: precision_at_20 value: 3.8739999999999997 - type: precision_at_100 value: 1.082 - type: precision_at_1000 value: 0.13999999999999999 - type: mrr_at_1 value: 25.1244 - type: mrr_at_3 value: 31.3847 - type: mrr_at_5 value: 32.9768 - type: mrr_at_10 value: 34.1348 - type: mrr_at_20 value: 34.7501 - type: mrr_at_100 value: 35.1367 - type: mrr_at_1000 value: 35.191 - type: nauc_ndcg_at_1_max value: 27.160600000000002 - type: nauc_ndcg_at_1_std value: 1.7711999999999999 - type: nauc_ndcg_at_1_diff1 value: 39.8547 - type: nauc_ndcg_at_3_max value: 23.7332 - type: 
nauc_ndcg_at_3_std value: 0.4508 - type: nauc_ndcg_at_3_diff1 value: 34.3668 - type: nauc_ndcg_at_5_max value: 24.6552 - type: nauc_ndcg_at_5_std value: 1.7423000000000002 - type: nauc_ndcg_at_5_diff1 value: 34.8806 - type: nauc_ndcg_at_10_max value: 24.3869 - type: nauc_ndcg_at_10_std value: 1.3054 - type: nauc_ndcg_at_10_diff1 value: 33.7015 - type: nauc_ndcg_at_20_max value: 24.449 - type: nauc_ndcg_at_20_std value: 2.4919000000000002 - type: nauc_ndcg_at_20_diff1 value: 32.9483 - type: nauc_ndcg_at_100_max value: 25.3655 - type: nauc_ndcg_at_100_std value: 2.7169 - type: nauc_ndcg_at_100_diff1 value: 32.8817 - type: nauc_ndcg_at_1000_max value: 25.524599999999996 - type: nauc_ndcg_at_1000_std value: 3.1405000000000003 - type: nauc_ndcg_at_1000_diff1 value: 32.7208 - type: nauc_map_at_1_max value: 24.9051 - type: nauc_map_at_1_std value: 2.788 - type: nauc_map_at_1_diff1 value: 38.9946 - type: nauc_map_at_3_max value: 23.061 - type: nauc_map_at_3_std value: 1.0529 - type: nauc_map_at_3_diff1 value: 35.0109 - type: nauc_map_at_5_max value: 23.704800000000002 - type: nauc_map_at_5_std value: 1.7375999999999998 - type: nauc_map_at_5_diff1 value: 35.2714 - type: nauc_map_at_10_max value: 23.7351 - type: nauc_map_at_10_std value: 1.5004 - type: nauc_map_at_10_diff1 value: 34.8483 - type: nauc_map_at_20_max value: 23.7699 - type: nauc_map_at_20_std value: 1.8925999999999998 - type: nauc_map_at_20_diff1 value: 34.6198 - type: nauc_map_at_100_max value: 23.962600000000002 - type: nauc_map_at_100_std value: 1.9238000000000002 - type: nauc_map_at_100_diff1 value: 34.7253 - type: nauc_map_at_1000_max value: 23.965 - type: nauc_map_at_1000_std value: 1.9339 - type: nauc_map_at_1000_diff1 value: 34.719899999999996 - type: nauc_recall_at_1_max value: 24.9051 - type: nauc_recall_at_1_std value: 2.788 - type: nauc_recall_at_1_diff1 value: 38.9946 - type: nauc_recall_at_3_max value: 21.8415 - type: nauc_recall_at_3_std value: 0.5292 - type: nauc_recall_at_3_diff1 value: 30.811 - type: nauc_recall_at_5_max value: 23.8237 - type: nauc_recall_at_5_std value: 2.5335 - type: nauc_recall_at_5_diff1 value: 31.928800000000003 - type: nauc_recall_at_10_max value: 22.5541 - type: nauc_recall_at_10_std value: 0.9076000000000001 - type: nauc_recall_at_10_diff1 value: 27.8364 - type: nauc_recall_at_20_max value: 22.0853 - type: nauc_recall_at_20_std value: 4.9954 - type: nauc_recall_at_20_diff1 value: 24.2376 - type: nauc_recall_at_100_max value: 26.4301 - type: nauc_recall_at_100_std value: 8.5471 - type: nauc_recall_at_100_diff1 value: 19.2131 - type: nauc_recall_at_1000_max value: 36.3726 - type: nauc_recall_at_1000_std value: 26.9247 - type: nauc_recall_at_1000_diff1 value: 3.8798 - type: nauc_precision_at_1_max value: 27.160600000000002 - type: nauc_precision_at_1_std value: 1.7711999999999999 - type: nauc_precision_at_1_diff1 value: 39.8547 - type: nauc_precision_at_3_max value: 23.8679 - type: nauc_precision_at_3_std value: -1.052 - type: nauc_precision_at_3_diff1 value: 29.999100000000002 - type: nauc_precision_at_5_max value: 24.7345 - type: nauc_precision_at_5_std value: 1.3604 - type: nauc_precision_at_5_diff1 value: 29.8611 - type: nauc_precision_at_10_max value: 21.5396 - type: nauc_precision_at_10_std value: -1.0137 - type: nauc_precision_at_10_diff1 value: 23.519000000000002 - type: nauc_precision_at_20_max value: 18.4431 - type: nauc_precision_at_20_std value: 1.5350000000000001 - type: nauc_precision_at_20_diff1 value: 16.5031 - type: nauc_precision_at_100_max value: 13.9255 - type: 
nauc_precision_at_100_std value: -0.48650000000000004 - type: nauc_precision_at_100_diff1 value: 7.700799999999999 - type: nauc_precision_at_1000_max value: 3.6421 - type: nauc_precision_at_1000_std value: -4.7682 - type: nauc_precision_at_1000_diff1 value: -1.4256 - type: nauc_mrr_at_1_max value: 27.160600000000002 - type: nauc_mrr_at_1_std value: 1.7711999999999999 - type: nauc_mrr_at_1_diff1 value: 39.8547 - type: nauc_mrr_at_3_max value: 25.44 - type: nauc_mrr_at_3_std value: 0.08639999999999999 - type: nauc_mrr_at_3_diff1 value: 35.381800000000005 - type: nauc_mrr_at_5_max value: 26.011899999999997 - type: nauc_mrr_at_5_std value: 0.6948 - type: nauc_mrr_at_5_diff1 value: 36.246 - type: nauc_mrr_at_10_max value: 25.8141 - type: nauc_mrr_at_10_std value: 0.5511 - type: nauc_mrr_at_10_diff1 value: 35.7313 - type: nauc_mrr_at_20_max value: 25.805899999999998 - type: nauc_mrr_at_20_std value: 0.8933 - type: nauc_mrr_at_20_diff1 value: 35.4972 - type: nauc_mrr_at_100_max value: 25.909 - type: nauc_mrr_at_100_std value: 0.8796999999999999 - type: nauc_mrr_at_100_diff1 value: 35.5299 - type: nauc_mrr_at_1000_max value: 25.910800000000002 - type: nauc_mrr_at_1000_std value: 0.9046000000000001 - type: nauc_mrr_at_1000_diff1 value: 35.522999999999996 - type: main_score value: 34.926 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackPhysicsRetrieval (default) revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4 split: test type: mteb/cqadupstack-physics metrics: - type: ndcg_at_1 value: 42.059999999999995 - type: ndcg_at_3 value: 46.461999999999996 - type: ndcg_at_5 value: 48.662 - type: ndcg_at_10 value: 50.925 - type: ndcg_at_20 value: 53.120999999999995 - type: ndcg_at_100 value: 56.189 - type: ndcg_at_1000 value: 57.972 - type: map_at_1 value: 33.919 - type: map_at_3 value: 41.858000000000004 - type: map_at_5 value: 43.629 - type: map_at_10 value: 45.01 - type: map_at_20 value: 45.781 - type: map_at_100 value: 46.372 - type: map_at_1000 value: 46.477000000000004 - type: recall_at_1 value: 33.919 - type: recall_at_3 value: 49.153999999999996 - type: recall_at_5 value: 55.422000000000004 - type: recall_at_10 value: 62.204 - type: recall_at_20 value: 69.819 - type: recall_at_100 value: 83.67599999999999 - type: recall_at_1000 value: 95.093 - type: precision_at_1 value: 42.059999999999995 - type: precision_at_3 value: 22.201 - type: precision_at_5 value: 15.342 - type: precision_at_10 value: 9.038 - type: precision_at_20 value: 5.244999999999999 - type: precision_at_100 value: 1.348 - type: precision_at_1000 value: 0.168 - type: mrr_at_1 value: 42.0597 - type: mrr_at_3 value: 49.005500000000005 - type: mrr_at_5 value: 50.3673 - type: mrr_at_10 value: 51.14959999999999 - type: mrr_at_20 value: 51.656 - type: mrr_at_100 value: 51.969 - type: mrr_at_1000 value: 52.0088 - type: nauc_ndcg_at_1_max value: 39.321400000000004 - type: nauc_ndcg_at_1_std value: -3.3204 - type: nauc_ndcg_at_1_diff1 value: 50.999300000000005 - type: nauc_ndcg_at_3_max value: 37.6896 - type: nauc_ndcg_at_3_std value: -4.7356 - type: nauc_ndcg_at_3_diff1 value: 48.0551 - type: nauc_ndcg_at_5_max value: 36.9149 - type: nauc_ndcg_at_5_std value: -5.8358 - type: nauc_ndcg_at_5_diff1 value: 48.4085 - type: nauc_ndcg_at_10_max value: 36.9047 - type: nauc_ndcg_at_10_std value: -5.1284 - type: nauc_ndcg_at_10_diff1 value: 48.3356 - type: nauc_ndcg_at_20_max value: 36.9876 - type: nauc_ndcg_at_20_std value: -4.0274 - type: nauc_ndcg_at_20_diff1 value: 48.0203 - type: nauc_ndcg_at_100_max value: 38.472899999999996 - 
type: nauc_ndcg_at_100_std value: -1.1645 - type: nauc_ndcg_at_100_diff1 value: 47.734 - type: nauc_ndcg_at_1000_max value: 38.828 - type: nauc_ndcg_at_1000_std value: -1.5388000000000002 - type: nauc_ndcg_at_1000_diff1 value: 47.8951 - type: nauc_map_at_1_max value: 32.8495 - type: nauc_map_at_1_std value: -11.1224 - type: nauc_map_at_1_diff1 value: 52.8561 - type: nauc_map_at_3_max value: 35.2472 - type: nauc_map_at_3_std value: -7.8861 - type: nauc_map_at_3_diff1 value: 49.2087 - type: nauc_map_at_5_max value: 35.5165 - type: nauc_map_at_5_std value: -7.8567 - type: nauc_map_at_5_diff1 value: 49.3185 - type: nauc_map_at_10_max value: 36.2371 - type: nauc_map_at_10_std value: -6.7322999999999995 - type: nauc_map_at_10_diff1 value: 49.3669 - type: nauc_map_at_20_max value: 36.3245 - type: nauc_map_at_20_std value: -6.2256 - type: nauc_map_at_20_diff1 value: 49.242999999999995 - type: nauc_map_at_100_max value: 36.6375 - type: nauc_map_at_100_std value: -5.694599999999999 - type: nauc_map_at_100_diff1 value: 49.1942 - type: nauc_map_at_1000_max value: 36.6734 - type: nauc_map_at_1000_std value: -5.6653 - type: nauc_map_at_1000_diff1 value: 49.1813 - type: nauc_recall_at_1_max value: 32.8495 - type: nauc_recall_at_1_std value: -11.1224 - type: nauc_recall_at_1_diff1 value: 52.8561 - type: nauc_recall_at_3_max value: 33.2098 - type: nauc_recall_at_3_std value: -7.4756 - type: nauc_recall_at_3_diff1 value: 44.6512 - type: nauc_recall_at_5_max value: 32.0734 - type: nauc_recall_at_5_std value: -8.552 - type: nauc_recall_at_5_diff1 value: 43.2098 - type: nauc_recall_at_10_max value: 32.452999999999996 - type: nauc_recall_at_10_std value: -5.631 - type: nauc_recall_at_10_diff1 value: 42.4641 - type: nauc_recall_at_20_max value: 31.660300000000003 - type: nauc_recall_at_20_std value: -1.5259 - type: nauc_recall_at_20_diff1 value: 40.5356 - type: nauc_recall_at_100_max value: 40.3906 - type: nauc_recall_at_100_std value: 22.5792 - type: nauc_recall_at_100_diff1 value: 36.2667 - type: nauc_recall_at_1000_max value: 61.422399999999996 - type: nauc_recall_at_1000_std value: 46.7038 - type: nauc_recall_at_1000_diff1 value: 36.4218 - type: nauc_precision_at_1_max value: 39.321400000000004 - type: nauc_precision_at_1_std value: -3.3204 - type: nauc_precision_at_1_diff1 value: 50.999300000000005 - type: nauc_precision_at_3_max value: 35.7839 - type: nauc_precision_at_3_std value: 7.773199999999999 - type: nauc_precision_at_3_diff1 value: 29.8081 - type: nauc_precision_at_5_max value: 32.7723 - type: nauc_precision_at_5_std value: 9.8457 - type: nauc_precision_at_5_diff1 value: 24.9104 - type: nauc_precision_at_10_max value: 30.6076 - type: nauc_precision_at_10_std value: 16.5018 - type: nauc_precision_at_10_diff1 value: 17.5733 - type: nauc_precision_at_20_max value: 25.8982 - type: nauc_precision_at_20_std value: 20.4936 - type: nauc_precision_at_20_diff1 value: 9.4253 - type: nauc_precision_at_100_max value: 20.5147 - type: nauc_precision_at_100_std value: 28.0537 - type: nauc_precision_at_100_diff1 value: -3.5682 - type: nauc_precision_at_1000_max value: 8.9834 - type: nauc_precision_at_1000_std value: 21.330099999999998 - type: nauc_precision_at_1000_diff1 value: -13.9467 - type: nauc_mrr_at_1_max value: 39.321400000000004 - type: nauc_mrr_at_1_std value: -3.3204 - type: nauc_mrr_at_1_diff1 value: 50.999300000000005 - type: nauc_mrr_at_3_max value: 39.537099999999995 - type: nauc_mrr_at_3_std value: -1.8964999999999999 - type: nauc_mrr_at_3_diff1 value: 48.790499999999994 - type: nauc_mrr_at_5_max 
value: 39.5914 - type: nauc_mrr_at_5_std value: -2.1046 - type: nauc_mrr_at_5_diff1 value: 48.674099999999996 - type: nauc_mrr_at_10_max value: 39.4877 - type: nauc_mrr_at_10_std value: -2.1155 - type: nauc_mrr_at_10_diff1 value: 48.5082 - type: nauc_mrr_at_20_max value: 39.5837 - type: nauc_mrr_at_20_std value: -1.8568999999999998 - type: nauc_mrr_at_20_diff1 value: 48.4835 - type: nauc_mrr_at_100_max value: 39.6439 - type: nauc_mrr_at_100_std value: -1.6681000000000001 - type: nauc_mrr_at_100_diff1 value: 48.4452 - type: nauc_mrr_at_1000_max value: 39.6426 - type: nauc_mrr_at_1000_std value: -1.6824 - type: nauc_mrr_at_1000_diff1 value: 48.4594 - type: main_score value: 50.925 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackProgrammersRetrieval (default) revision: 6184bc1440d2dbc7612be22b50686b8826d22b32 split: test type: mteb/cqadupstack-programmers metrics: - type: ndcg_at_1 value: 38.812999999999995 - type: ndcg_at_3 value: 43.126999999999995 - type: ndcg_at_5 value: 45.269999999999996 - type: ndcg_at_10 value: 48.181000000000004 - type: ndcg_at_20 value: 50.475 - type: ndcg_at_100 value: 53.378 - type: ndcg_at_1000 value: 55.372 - type: map_at_1 value: 31.228 - type: map_at_3 value: 38.727000000000004 - type: map_at_5 value: 40.544000000000004 - type: map_at_10 value: 42.022999999999996 - type: map_at_20 value: 42.815 - type: map_at_100 value: 43.336000000000006 - type: map_at_1000 value: 43.434 - type: recall_at_1 value: 31.228 - type: recall_at_3 value: 46.075 - type: recall_at_5 value: 52.065 - type: recall_at_10 value: 60.86 - type: recall_at_20 value: 68.916 - type: recall_at_100 value: 82.49600000000001 - type: recall_at_1000 value: 95.914 - type: precision_at_1 value: 38.812999999999995 - type: precision_at_3 value: 20.51 - type: precision_at_5 value: 14.405999999999999 - type: precision_at_10 value: 8.676 - type: precision_at_20 value: 5.08 - type: precision_at_100 value: 1.3 - type: precision_at_1000 value: 0.165 - type: mrr_at_1 value: 38.812799999999996 - type: mrr_at_3 value: 45.3957 - type: mrr_at_5 value: 46.8113 - type: mrr_at_10 value: 47.9132 - type: mrr_at_20 value: 48.4148 - type: mrr_at_100 value: 48.694900000000004 - type: mrr_at_1000 value: 48.74 - type: nauc_ndcg_at_1_max value: 46.951100000000004 - type: nauc_ndcg_at_1_std value: 4.750299999999999 - type: nauc_ndcg_at_1_diff1 value: 50.353300000000004 - type: nauc_ndcg_at_3_max value: 44.852 - type: nauc_ndcg_at_3_std value: 5.976 - type: nauc_ndcg_at_3_diff1 value: 44.8003 - type: nauc_ndcg_at_5_max value: 44.7999 - type: nauc_ndcg_at_5_std value: 7.138799999999999 - type: nauc_ndcg_at_5_diff1 value: 43.786 - type: nauc_ndcg_at_10_max value: 45.272800000000004 - type: nauc_ndcg_at_10_std value: 8.318200000000001 - type: nauc_ndcg_at_10_diff1 value: 43.5412 - type: nauc_ndcg_at_20_max value: 45.9439 - type: nauc_ndcg_at_20_std value: 9.5894 - type: nauc_ndcg_at_20_diff1 value: 43.635400000000004 - type: nauc_ndcg_at_100_max value: 46.555800000000005 - type: nauc_ndcg_at_100_std value: 11.4897 - type: nauc_ndcg_at_100_diff1 value: 43.2953 - type: nauc_ndcg_at_1000_max value: 46.4671 - type: nauc_ndcg_at_1000_std value: 10.198500000000001 - type: nauc_ndcg_at_1000_diff1 value: 43.9655 - type: nauc_map_at_1_max value: 41.2881 - type: nauc_map_at_1_std value: -1.7105 - type: nauc_map_at_1_diff1 value: 52.340900000000005 - type: nauc_map_at_3_max value: 43.2779 - type: nauc_map_at_3_std value: 3.1361 - type: nauc_map_at_3_diff1 value: 46.899499999999996 - type: nauc_map_at_5_max value: 
44.034600000000005 - type: nauc_map_at_5_std value: 4.376 - type: nauc_map_at_5_diff1 value: 46.1768 - type: nauc_map_at_10_max value: 44.495200000000004 - type: nauc_map_at_10_std value: 5.1069 - type: nauc_map_at_10_diff1 value: 45.8036 - type: nauc_map_at_20_max value: 44.9796 - type: nauc_map_at_20_std value: 5.6501 - type: nauc_map_at_20_diff1 value: 45.8538 - type: nauc_map_at_100_max value: 45.178000000000004 - type: nauc_map_at_100_std value: 6.1053999999999995 - type: nauc_map_at_100_diff1 value: 45.7785 - type: nauc_map_at_1000_max value: 45.169599999999996 - type: nauc_map_at_1000_std value: 6.0758 - type: nauc_map_at_1000_diff1 value: 45.794200000000004 - type: nauc_recall_at_1_max value: 41.2881 - type: nauc_recall_at_1_std value: -1.7105 - type: nauc_recall_at_1_diff1 value: 52.340900000000005 - type: nauc_recall_at_3_max value: 40.213100000000004 - type: nauc_recall_at_3_std value: 5.0584 - type: nauc_recall_at_3_diff1 value: 39.8885 - type: nauc_recall_at_5_max value: 40.629799999999996 - type: nauc_recall_at_5_std value: 9.2891 - type: nauc_recall_at_5_diff1 value: 36.7529 - type: nauc_recall_at_10_max value: 41.1258 - type: nauc_recall_at_10_std value: 14.056 - type: nauc_recall_at_10_diff1 value: 34.416000000000004 - type: nauc_recall_at_20_max value: 42.2647 - type: nauc_recall_at_20_std value: 19.0659 - type: nauc_recall_at_20_diff1 value: 33.9025 - type: nauc_recall_at_100_max value: 45.4518 - type: nauc_recall_at_100_std value: 38.2567 - type: nauc_recall_at_100_diff1 value: 27.418300000000002 - type: nauc_recall_at_1000_max value: 52.1153 - type: nauc_recall_at_1000_std value: 54.8108 - type: nauc_recall_at_1000_diff1 value: 28.122200000000003 - type: nauc_precision_at_1_max value: 46.951100000000004 - type: nauc_precision_at_1_std value: 4.750299999999999 - type: nauc_precision_at_1_diff1 value: 50.353300000000004 - type: nauc_precision_at_3_max value: 43.3769 - type: nauc_precision_at_3_std value: 15.2362 - type: nauc_precision_at_3_diff1 value: 29.4925 - type: nauc_precision_at_5_max value: 40.0531 - type: nauc_precision_at_5_std value: 18.0719 - type: nauc_precision_at_5_diff1 value: 21.4607 - type: nauc_precision_at_10_max value: 34.558 - type: nauc_precision_at_10_std value: 20.2349 - type: nauc_precision_at_10_diff1 value: 13.0483 - type: nauc_precision_at_20_max value: 30.3112 - type: nauc_precision_at_20_std value: 23.7865 - type: nauc_precision_at_20_diff1 value: 6.678000000000001 - type: nauc_precision_at_100_max value: 15.782599999999999 - type: nauc_precision_at_100_std value: 23.3508 - type: nauc_precision_at_100_diff1 value: -5.356199999999999 - type: nauc_precision_at_1000_max value: -1.203 - type: nauc_precision_at_1000_std value: 9.2771 - type: nauc_precision_at_1000_diff1 value: -12.0167 - type: nauc_mrr_at_1_max value: 46.951100000000004 - type: nauc_mrr_at_1_std value: 4.750299999999999 - type: nauc_mrr_at_1_diff1 value: 50.353300000000004 - type: nauc_mrr_at_3_max value: 47.1661 - type: nauc_mrr_at_3_std value: 7.985 - type: nauc_mrr_at_3_diff1 value: 45.5407 - type: nauc_mrr_at_5_max value: 46.7954 - type: nauc_mrr_at_5_std value: 8.615200000000002 - type: nauc_mrr_at_5_diff1 value: 44.767 - type: nauc_mrr_at_10_max value: 46.874500000000005 - type: nauc_mrr_at_10_std value: 8.9973 - type: nauc_mrr_at_10_diff1 value: 44.7807 - type: nauc_mrr_at_20_max value: 46.8582 - type: nauc_mrr_at_20_std value: 9.1312 - type: nauc_mrr_at_20_diff1 value: 44.7926 - type: nauc_mrr_at_100_max value: 46.9119 - type: nauc_mrr_at_100_std value: 9.2225 - type: 
nauc_mrr_at_100_diff1 value: 44.7972 - type: nauc_mrr_at_1000_max value: 46.9139 - type: nauc_mrr_at_1000_std value: 9.1867 - type: nauc_mrr_at_1000_diff1 value: 44.8208 - type: main_score value: 48.181000000000004 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackRetrieval (default) revision: CQADupstackRetrieval_is_a_combined_dataset split: test type: CQADupstackRetrieval_is_a_combined_dataset metrics: - type: main_score value: 47.198 - type: ndcg_at_10 value: 47.198 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackStatsRetrieval (default) revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a split: test type: mteb/cqadupstack-stats metrics: - type: ndcg_at_1 value: 32.515 - type: ndcg_at_3 value: 36.754999999999995 - type: ndcg_at_5 value: 38.461 - type: ndcg_at_10 value: 41.113 - type: ndcg_at_20 value: 42.744 - type: ndcg_at_100 value: 45.607 - type: ndcg_at_1000 value: 47.769 - type: map_at_1 value: 28.877999999999997 - type: map_at_3 value: 34.111000000000004 - type: map_at_5 value: 35.296 - type: map_at_10 value: 36.516 - type: map_at_20 value: 37.031 - type: map_at_100 value: 37.455 - type: map_at_1000 value: 37.54 - type: recall_at_1 value: 28.877999999999997 - type: recall_at_3 value: 39.823 - type: recall_at_5 value: 44.074000000000005 - type: recall_at_10 value: 52.138 - type: recall_at_20 value: 58.268 - type: recall_at_100 value: 72.675 - type: recall_at_1000 value: 88.49900000000001 - type: precision_at_1 value: 32.515 - type: precision_at_3 value: 15.491 - type: precision_at_5 value: 10.613 - type: precision_at_10 value: 6.411 - type: precision_at_20 value: 3.604 - type: precision_at_100 value: 0.9390000000000001 - type: precision_at_1000 value: 0.121 - type: mrr_at_1 value: 32.5153 - type: mrr_at_3 value: 37.5256 - type: mrr_at_5 value: 38.507200000000005 - type: mrr_at_10 value: 39.6489 - type: mrr_at_20 value: 40.0734 - type: mrr_at_100 value: 40.408899999999996 - type: mrr_at_1000 value: 40.470600000000005 - type: nauc_ndcg_at_1_max value: 46.9541 - type: nauc_ndcg_at_1_std value: -0.6345 - type: nauc_ndcg_at_1_diff1 value: 56.4747 - type: nauc_ndcg_at_3_max value: 44.595600000000005 - type: nauc_ndcg_at_3_std value: -0.6883 - type: nauc_ndcg_at_3_diff1 value: 51.176100000000005 - type: nauc_ndcg_at_5_max value: 45.0672 - type: nauc_ndcg_at_5_std value: 0.7248 - type: nauc_ndcg_at_5_diff1 value: 50.6661 - type: nauc_ndcg_at_10_max value: 45.3702 - type: nauc_ndcg_at_10_std value: 3.7225 - type: nauc_ndcg_at_10_diff1 value: 48.5914 - type: nauc_ndcg_at_20_max value: 45.134800000000006 - type: nauc_ndcg_at_20_std value: 3.4250999999999996 - type: nauc_ndcg_at_20_diff1 value: 48.0876 - type: nauc_ndcg_at_100_max value: 45.848 - type: nauc_ndcg_at_100_std value: 5.0007 - type: nauc_ndcg_at_100_diff1 value: 48.4221 - type: nauc_ndcg_at_1000_max value: 46.0472 - type: nauc_ndcg_at_1000_std value: 4.8727 - type: nauc_ndcg_at_1000_diff1 value: 48.7787 - type: nauc_map_at_1_max value: 44.2723 - type: nauc_map_at_1_std value: -4.1624 - type: nauc_map_at_1_diff1 value: 56.3666 - type: nauc_map_at_3_max value: 44.368 - type: nauc_map_at_3_std value: -2.2338 - type: nauc_map_at_3_diff1 value: 52.662299999999995 - type: nauc_map_at_5_max value: 44.9376 - type: nauc_map_at_5_std value: -0.9258000000000001 - type: nauc_map_at_5_diff1 value: 52.2675 - type: nauc_map_at_10_max value: 45.162600000000005 - type: nauc_map_at_10_std value: 0.5709 - type: nauc_map_at_10_diff1 value: 51.2702 - type: nauc_map_at_20_max value: 45.088899999999995 - type: 
nauc_map_at_20_std value: 0.5163 - type: nauc_map_at_20_diff1 value: 51.1058 - type: nauc_map_at_100_max value: 45.203700000000005 - type: nauc_map_at_100_std value: 0.7443 - type: nauc_map_at_100_diff1 value: 51.1744 - type: nauc_map_at_1000_max value: 45.2121 - type: nauc_map_at_1000_std value: 0.7443 - type: nauc_map_at_1000_diff1 value: 51.186699999999995 - type: nauc_recall_at_1_max value: 44.2723 - type: nauc_recall_at_1_std value: -4.1624 - type: nauc_recall_at_1_diff1 value: 56.3666 - type: nauc_recall_at_3_max value: 41.484700000000004 - type: nauc_recall_at_3_std value: -1.5438 - type: nauc_recall_at_3_diff1 value: 47.3155 - type: nauc_recall_at_5_max value: 42.7926 - type: nauc_recall_at_5_std value: 2.2485999999999997 - type: nauc_recall_at_5_diff1 value: 45.7287 - type: nauc_recall_at_10_max value: 43.3757 - type: nauc_recall_at_10_std value: 11.1774 - type: nauc_recall_at_10_diff1 value: 38.699 - type: nauc_recall_at_20_max value: 41.9806 - type: nauc_recall_at_20_std value: 9.8464 - type: nauc_recall_at_20_diff1 value: 36.209599999999995 - type: nauc_recall_at_100_max value: 44.935399999999994 - type: nauc_recall_at_100_std value: 22.2528 - type: nauc_recall_at_100_diff1 value: 33.9811 - type: nauc_recall_at_1000_max value: 48.0178 - type: nauc_recall_at_1000_std value: 35.6656 - type: nauc_recall_at_1000_diff1 value: 27.0609 - type: nauc_precision_at_1_max value: 46.9541 - type: nauc_precision_at_1_std value: -0.6345 - type: nauc_precision_at_1_diff1 value: 56.4747 - type: nauc_precision_at_3_max value: 44.8235 - type: nauc_precision_at_3_std value: 6.392399999999999 - type: nauc_precision_at_3_diff1 value: 43.4139 - type: nauc_precision_at_5_max value: 44.1627 - type: nauc_precision_at_5_std value: 12.5801 - type: nauc_precision_at_5_diff1 value: 38.3975 - type: nauc_precision_at_10_max value: 42.2932 - type: nauc_precision_at_10_std value: 21.9445 - type: nauc_precision_at_10_diff1 value: 28.898200000000003 - type: nauc_precision_at_20_max value: 38.3815 - type: nauc_precision_at_20_std value: 21.2644 - type: nauc_precision_at_20_diff1 value: 22.902900000000002 - type: nauc_precision_at_100_max value: 30.0629 - type: nauc_precision_at_100_std value: 25.7938 - type: nauc_precision_at_100_diff1 value: 13.500599999999999 - type: nauc_precision_at_1000_max value: 16.1509 - type: nauc_precision_at_1000_std value: 22.168599999999998 - type: nauc_precision_at_1000_diff1 value: -0.5865 - type: nauc_mrr_at_1_max value: 46.9541 - type: nauc_mrr_at_1_std value: -0.6345 - type: nauc_mrr_at_1_diff1 value: 56.4747 - type: nauc_mrr_at_3_max value: 45.571 - type: nauc_mrr_at_3_std value: 0.5652 - type: nauc_mrr_at_3_diff1 value: 52.2878 - type: nauc_mrr_at_5_max value: 45.9243 - type: nauc_mrr_at_5_std value: 1.4102 - type: nauc_mrr_at_5_diff1 value: 52.0197 - type: nauc_mrr_at_10_max value: 46.090599999999995 - type: nauc_mrr_at_10_std value: 2.5422000000000002 - type: nauc_mrr_at_10_diff1 value: 51.1523 - type: nauc_mrr_at_20_max value: 46.0581 - type: nauc_mrr_at_20_std value: 2.4245 - type: nauc_mrr_at_20_diff1 value: 51.1149 - type: nauc_mrr_at_100_max value: 46.138200000000005 - type: nauc_mrr_at_100_std value: 2.5852 - type: nauc_mrr_at_100_diff1 value: 51.19200000000001 - type: nauc_mrr_at_1000_max value: 46.134 - type: nauc_mrr_at_1000_std value: 2.5724 - type: nauc_mrr_at_1000_diff1 value: 51.20099999999999 - type: main_score value: 41.113 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackTexRetrieval (default) revision: 
46989137a86843e03a6195de44b09deda022eec7 split: test type: mteb/cqadupstack-tex metrics: - type: ndcg_at_1 value: 26.358999999999998 - type: ndcg_at_3 value: 30.921 - type: ndcg_at_5 value: 33.083 - type: ndcg_at_10 value: 35.669000000000004 - type: ndcg_at_20 value: 37.486999999999995 - type: ndcg_at_100 value: 40.897 - type: ndcg_at_1000 value: 43.492999999999995 - type: map_at_1 value: 21.644 - type: map_at_3 value: 27.638 - type: map_at_5 value: 29.181 - type: map_at_10 value: 30.429000000000002 - type: map_at_20 value: 31.018 - type: map_at_100 value: 31.557000000000002 - type: map_at_1000 value: 31.676 - type: recall_at_1 value: 21.644 - type: recall_at_3 value: 33.727000000000004 - type: recall_at_5 value: 39.402 - type: recall_at_10 value: 47.166000000000004 - type: recall_at_20 value: 53.818 - type: recall_at_100 value: 70.625 - type: recall_at_1000 value: 88.848 - type: precision_at_1 value: 26.358999999999998 - type: precision_at_3 value: 14.602 - type: precision_at_5 value: 10.509 - type: precision_at_10 value: 6.468999999999999 - type: precision_at_20 value: 3.7969999999999997 - type: precision_at_100 value: 1.0619999999999998 - type: precision_at_1000 value: 0.147 - type: mrr_at_1 value: 26.3593 - type: mrr_at_3 value: 32.2379 - type: mrr_at_5 value: 33.5559 - type: mrr_at_10 value: 34.6105 - type: mrr_at_20 value: 35.0733 - type: mrr_at_100 value: 35.4832 - type: mrr_at_1000 value: 35.5508 - type: nauc_ndcg_at_1_max value: 38.821 - type: nauc_ndcg_at_1_std value: -0.9577 - type: nauc_ndcg_at_1_diff1 value: 49.477900000000005 - type: nauc_ndcg_at_3_max value: 36.9651 - type: nauc_ndcg_at_3_std value: 0.5652 - type: nauc_ndcg_at_3_diff1 value: 42.9649 - type: nauc_ndcg_at_5_max value: 36.9433 - type: nauc_ndcg_at_5_std value: 1.4069 - type: nauc_ndcg_at_5_diff1 value: 41.3321 - type: nauc_ndcg_at_10_max value: 37.0556 - type: nauc_ndcg_at_10_std value: 1.983 - type: nauc_ndcg_at_10_diff1 value: 40.6062 - type: nauc_ndcg_at_20_max value: 37.621 - type: nauc_ndcg_at_20_std value: 3.1833 - type: nauc_ndcg_at_20_diff1 value: 40.0768 - type: nauc_ndcg_at_100_max value: 37.5859 - type: nauc_ndcg_at_100_std value: 4.4883 - type: nauc_ndcg_at_100_diff1 value: 39.6131 - type: nauc_ndcg_at_1000_max value: 37.9037 - type: nauc_ndcg_at_1000_std value: 4.3155 - type: nauc_ndcg_at_1000_diff1 value: 40.393 - type: nauc_map_at_1_max value: 34.2335 - type: nauc_map_at_1_std value: -2.5663 - type: nauc_map_at_1_diff1 value: 49.3827 - type: nauc_map_at_3_max value: 35.1539 - type: nauc_map_at_3_std value: -0.4655 - type: nauc_map_at_3_diff1 value: 44.0299 - type: nauc_map_at_5_max value: 35.546499999999995 - type: nauc_map_at_5_std value: -0.0021 - type: nauc_map_at_5_diff1 value: 43.0138 - type: nauc_map_at_10_max value: 35.904799999999994 - type: nauc_map_at_10_std value: 0.367 - type: nauc_map_at_10_diff1 value: 42.762699999999995 - type: nauc_map_at_20_max value: 36.1855 - type: nauc_map_at_20_std value: 0.7818 - type: nauc_map_at_20_diff1 value: 42.6084 - type: nauc_map_at_100_max value: 36.2406 - type: nauc_map_at_100_std value: 0.9825999999999999 - type: nauc_map_at_100_diff1 value: 42.5375 - type: nauc_map_at_1000_max value: 36.2732 - type: nauc_map_at_1000_std value: 0.9912000000000001 - type: nauc_map_at_1000_diff1 value: 42.5821 - type: nauc_recall_at_1_max value: 34.2335 - type: nauc_recall_at_1_std value: -2.5663 - type: nauc_recall_at_1_diff1 value: 49.3827 - type: nauc_recall_at_3_max value: 34.2402 - type: nauc_recall_at_3_std value: 1.3011 - type: nauc_recall_at_3_diff1 value: 
38.5403 - type: nauc_recall_at_5_max value: 34.2169 - type: nauc_recall_at_5_std value: 3.0383 - type: nauc_recall_at_5_diff1 value: 34.3078 - type: nauc_recall_at_10_max value: 34.2267 - type: nauc_recall_at_10_std value: 4.7303 - type: nauc_recall_at_10_diff1 value: 31.2869 - type: nauc_recall_at_20_max value: 35.6281 - type: nauc_recall_at_20_std value: 8.940199999999999 - type: nauc_recall_at_20_diff1 value: 28.655599999999996 - type: nauc_recall_at_100_max value: 34.0961 - type: nauc_recall_at_100_std value: 18.096799999999998 - type: nauc_recall_at_100_diff1 value: 22.490199999999998 - type: nauc_recall_at_1000_max value: 37.3724 - type: nauc_recall_at_1000_std value: 29.723699999999997 - type: nauc_recall_at_1000_diff1 value: 18.9603 - type: nauc_precision_at_1_max value: 38.821 - type: nauc_precision_at_1_std value: -0.9577 - type: nauc_precision_at_1_diff1 value: 49.477900000000005 - type: nauc_precision_at_3_max value: 38.9589 - type: nauc_precision_at_3_std value: 3.6894000000000005 - type: nauc_precision_at_3_diff1 value: 34.869499999999995 - type: nauc_precision_at_5_max value: 37.9132 - type: nauc_precision_at_5_std value: 6.1095 - type: nauc_precision_at_5_diff1 value: 28.7686 - type: nauc_precision_at_10_max value: 35.5564 - type: nauc_precision_at_10_std value: 7.4825 - type: nauc_precision_at_10_diff1 value: 24.0663 - type: nauc_precision_at_20_max value: 34.3717 - type: nauc_precision_at_20_std value: 10.989 - type: nauc_precision_at_20_diff1 value: 19.0117 - type: nauc_precision_at_100_max value: 25.595000000000002 - type: nauc_precision_at_100_std value: 13.692499999999999 - type: nauc_precision_at_100_diff1 value: 9.7287 - type: nauc_precision_at_1000_max value: 15.6194 - type: nauc_precision_at_1000_std value: 7.9235 - type: nauc_precision_at_1000_diff1 value: 3.5067 - type: nauc_mrr_at_1_max value: 38.821 - type: nauc_mrr_at_1_std value: -0.9577 - type: nauc_mrr_at_1_diff1 value: 49.477900000000005 - type: nauc_mrr_at_3_max value: 39.365899999999996 - type: nauc_mrr_at_3_std value: 0.8999999999999999 - type: nauc_mrr_at_3_diff1 value: 44.8801 - type: nauc_mrr_at_5_max value: 39.339400000000005 - type: nauc_mrr_at_5_std value: 1.6056000000000001 - type: nauc_mrr_at_5_diff1 value: 43.9725 - type: nauc_mrr_at_10_max value: 39.245200000000004 - type: nauc_mrr_at_10_std value: 1.6921 - type: nauc_mrr_at_10_diff1 value: 43.6805 - type: nauc_mrr_at_20_max value: 39.283699999999996 - type: nauc_mrr_at_20_std value: 1.9199000000000002 - type: nauc_mrr_at_20_diff1 value: 43.5636 - type: nauc_mrr_at_100_max value: 39.293299999999995 - type: nauc_mrr_at_100_std value: 2.0535 - type: nauc_mrr_at_100_diff1 value: 43.5431 - type: nauc_mrr_at_1000_max value: 39.299299999999995 - type: nauc_mrr_at_1000_std value: 2.0467 - type: nauc_mrr_at_1000_diff1 value: 43.5649 - type: main_score value: 35.669000000000004 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackUnixRetrieval (default) revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53 split: test type: mteb/cqadupstack-unix metrics: - type: ndcg_at_1 value: 37.407000000000004 - type: ndcg_at_3 value: 43.179 - type: ndcg_at_5 value: 45.540000000000006 - type: ndcg_at_10 value: 48.189 - type: ndcg_at_20 value: 50.308 - type: ndcg_at_100 value: 53.15800000000001 - type: ndcg_at_1000 value: 55.108999999999995 - type: map_at_1 value: 32.314 - type: map_at_3 value: 39.757 - type: map_at_5 value: 41.448 - type: map_at_10 value: 42.742999999999995 - type: map_at_20 value: 43.438 - type: map_at_100 value: 43.909 - type: 
map_at_1000 value: 44.005 - type: recall_at_1 value: 32.314 - type: recall_at_3 value: 46.852 - type: recall_at_5 value: 53.15 - type: recall_at_10 value: 60.748000000000005 - type: recall_at_20 value: 68.30199999999999 - type: recall_at_100 value: 81.846 - type: recall_at_1000 value: 94.92399999999999 - type: precision_at_1 value: 37.407000000000004 - type: precision_at_3 value: 19.59 - type: precision_at_5 value: 13.544999999999998 - type: precision_at_10 value: 8.013 - type: precision_at_20 value: 4.627 - type: precision_at_100 value: 1.172 - type: precision_at_1000 value: 0.14400000000000002 - type: mrr_at_1 value: 37.4067 - type: mrr_at_3 value: 43.9832 - type: mrr_at_5 value: 45.4291 - type: mrr_at_10 value: 46.4308 - type: mrr_at_20 value: 46.9435 - type: mrr_at_100 value: 47.2549 - type: mrr_at_1000 value: 47.3064 - type: nauc_ndcg_at_1_max value: 49.5683 - type: nauc_ndcg_at_1_std value: -4.5333 - type: nauc_ndcg_at_1_diff1 value: 59.0792 - type: nauc_ndcg_at_3_max value: 46.881 - type: nauc_ndcg_at_3_std value: -1.9335000000000002 - type: nauc_ndcg_at_3_diff1 value: 50.6091 - type: nauc_ndcg_at_5_max value: 46.596399999999996 - type: nauc_ndcg_at_5_std value: -1.6747 - type: nauc_ndcg_at_5_diff1 value: 50.731 - type: nauc_ndcg_at_10_max value: 47.119699999999995 - type: nauc_ndcg_at_10_std value: -1.8790999999999998 - type: nauc_ndcg_at_10_diff1 value: 50.4398 - type: nauc_ndcg_at_20_max value: 46.931400000000004 - type: nauc_ndcg_at_20_std value: -1.2184 - type: nauc_ndcg_at_20_diff1 value: 50.2302 - type: nauc_ndcg_at_100_max value: 47.4715 - type: nauc_ndcg_at_100_std value: 0.512 - type: nauc_ndcg_at_100_diff1 value: 49.831399999999995 - type: nauc_ndcg_at_1000_max value: 47.4049 - type: nauc_ndcg_at_1000_std value: -0.07730000000000001 - type: nauc_ndcg_at_1000_diff1 value: 50.045399999999994 - type: nauc_map_at_1_max value: 46.3138 - type: nauc_map_at_1_std value: -6.1365 - type: nauc_map_at_1_diff1 value: 59.1901 - type: nauc_map_at_3_max value: 46.4225 - type: nauc_map_at_3_std value: -3.3928 - type: nauc_map_at_3_diff1 value: 53.0394 - type: nauc_map_at_5_max value: 46.634 - type: nauc_map_at_5_std value: -2.8697 - type: nauc_map_at_5_diff1 value: 52.837500000000006 - type: nauc_map_at_10_max value: 46.9634 - type: nauc_map_at_10_std value: -2.8736 - type: nauc_map_at_10_diff1 value: 52.62670000000001 - type: nauc_map_at_20_max value: 46.943 - type: nauc_map_at_20_std value: -2.7709 - type: nauc_map_at_20_diff1 value: 52.525299999999994 - type: nauc_map_at_100_max value: 47.072 - type: nauc_map_at_100_std value: -2.4186 - type: nauc_map_at_100_diff1 value: 52.4223 - type: nauc_map_at_1000_max value: 47.058299999999996 - type: nauc_map_at_1000_std value: -2.4274 - type: nauc_map_at_1000_diff1 value: 52.410000000000004 - type: nauc_recall_at_1_max value: 46.3138 - type: nauc_recall_at_1_std value: -6.1365 - type: nauc_recall_at_1_diff1 value: 59.1901 - type: nauc_recall_at_3_max value: 43.556 - type: nauc_recall_at_3_std value: -1.0473 - type: nauc_recall_at_3_diff1 value: 45.3836 - type: nauc_recall_at_5_max value: 42.8197 - type: nauc_recall_at_5_std value: 0.364 - type: nauc_recall_at_5_diff1 value: 44.0828 - type: nauc_recall_at_10_max value: 43.5287 - type: nauc_recall_at_10_std value: -0.16999999999999998 - type: nauc_recall_at_10_diff1 value: 42.2532 - type: nauc_recall_at_20_max value: 41.9415 - type: nauc_recall_at_20_std value: 3.0739 - type: nauc_recall_at_20_diff1 value: 40.6138 - type: nauc_recall_at_100_max value: 43.648199999999996 - type: 
nauc_recall_at_100_std value: 17.8151 - type: nauc_recall_at_100_diff1 value: 34.7435 - type: nauc_recall_at_1000_max value: 42.9288 - type: nauc_recall_at_1000_std value: 34.9874 - type: nauc_recall_at_1000_diff1 value: 21.8361 - type: nauc_precision_at_1_max value: 49.5683 - type: nauc_precision_at_1_std value: -4.5333 - type: nauc_precision_at_1_diff1 value: 59.0792 - type: nauc_precision_at_3_max value: 40.726 - type: nauc_precision_at_3_std value: 3.6327 - type: nauc_precision_at_3_diff1 value: 32.726 - type: nauc_precision_at_5_max value: 37.575599999999994 - type: nauc_precision_at_5_std value: 5.4281999999999995 - type: nauc_precision_at_5_diff1 value: 26.8851 - type: nauc_precision_at_10_max value: 31.7382 - type: nauc_precision_at_10_std value: 4.0767999999999995 - type: nauc_precision_at_10_diff1 value: 18.174799999999998 - type: nauc_precision_at_20_max value: 25.4159 - type: nauc_precision_at_20_std value: 6.0251 - type: nauc_precision_at_20_diff1 value: 10.059800000000001 - type: nauc_precision_at_100_max value: 13.5296 - type: nauc_precision_at_100_std value: 14.0608 - type: nauc_precision_at_100_diff1 value: -7.792000000000001 - type: nauc_precision_at_1000_max value: -3.7522 - type: nauc_precision_at_1000_std value: 7.536099999999999 - type: nauc_precision_at_1000_diff1 value: -21.2683 - type: nauc_mrr_at_1_max value: 49.5683 - type: nauc_mrr_at_1_std value: -4.5333 - type: nauc_mrr_at_1_diff1 value: 59.0792 - type: nauc_mrr_at_3_max value: 48.3581 - type: nauc_mrr_at_3_std value: -1.8857 - type: nauc_mrr_at_3_diff1 value: 52.5945 - type: nauc_mrr_at_5_max value: 48.2651 - type: nauc_mrr_at_5_std value: -1.5519 - type: nauc_mrr_at_5_diff1 value: 52.323699999999995 - type: nauc_mrr_at_10_max value: 48.346000000000004 - type: nauc_mrr_at_10_std value: -1.7543 - type: nauc_mrr_at_10_diff1 value: 52.278999999999996 - type: nauc_mrr_at_20_max value: 48.2692 - type: nauc_mrr_at_20_std value: -1.5904000000000003 - type: nauc_mrr_at_20_diff1 value: 52.27460000000001 - type: nauc_mrr_at_100_max value: 48.273700000000005 - type: nauc_mrr_at_100_std value: -1.4659 - type: nauc_mrr_at_100_diff1 value: 52.278400000000005 - type: nauc_mrr_at_1000_max value: 48.2811 - type: nauc_mrr_at_1000_std value: -1.4881 - type: nauc_mrr_at_1000_diff1 value: 52.298500000000004 - type: main_score value: 48.189 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackWebmastersRetrieval (default) revision: 160c094312a0e1facb97e55eeddb698c0abe3571 split: test type: mteb/cqadupstack-webmasters metrics: - type: ndcg_at_1 value: 38.141999999999996 - type: ndcg_at_3 value: 42.689 - type: ndcg_at_5 value: 44.318999999999996 - type: ndcg_at_10 value: 47.303 - type: ndcg_at_20 value: 49.236000000000004 - type: ndcg_at_100 value: 53.09700000000001 - type: ndcg_at_1000 value: 55.117000000000004 - type: map_at_1 value: 32.468 - type: map_at_3 value: 38.573 - type: map_at_5 value: 39.926 - type: map_at_10 value: 41.482 - type: map_at_20 value: 42.370000000000005 - type: map_at_100 value: 43.204 - type: map_at_1000 value: 43.425999999999995 - type: recall_at_1 value: 32.468 - type: recall_at_3 value: 44.241 - type: recall_at_5 value: 49.177 - type: recall_at_10 value: 57.63399999999999 - type: recall_at_20 value: 64.724 - type: recall_at_100 value: 83.817 - type: recall_at_1000 value: 95.91 - type: precision_at_1 value: 38.141999999999996 - type: precision_at_3 value: 19.499 - type: precision_at_5 value: 13.478000000000002 - type: precision_at_10 value: 8.774999999999999 - type: precision_at_20 
value: 5.455 - type: precision_at_100 value: 1.6760000000000002 - type: precision_at_1000 value: 0.251 - type: mrr_at_1 value: 38.1423 - type: mrr_at_3 value: 44.005300000000005 - type: mrr_at_5 value: 45.1515 - type: mrr_at_10 value: 46.3542 - type: mrr_at_20 value: 46.7589 - type: mrr_at_100 value: 47.185100000000006 - type: mrr_at_1000 value: 47.2249 - type: nauc_ndcg_at_1_max value: 47.905300000000004 - type: nauc_ndcg_at_1_std value: 7.8307 - type: nauc_ndcg_at_1_diff1 value: 51.3311 - type: nauc_ndcg_at_3_max value: 46.8119 - type: nauc_ndcg_at_3_std value: 6.993099999999999 - type: nauc_ndcg_at_3_diff1 value: 48.3281 - type: nauc_ndcg_at_5_max value: 47.5687 - type: nauc_ndcg_at_5_std value: 8.7295 - type: nauc_ndcg_at_5_diff1 value: 49.106300000000005 - type: nauc_ndcg_at_10_max value: 47.3786 - type: nauc_ndcg_at_10_std value: 8.9795 - type: nauc_ndcg_at_10_diff1 value: 47.5348 - type: nauc_ndcg_at_20_max value: 47.9792 - type: nauc_ndcg_at_20_std value: 10.2734 - type: nauc_ndcg_at_20_diff1 value: 48.3578 - type: nauc_ndcg_at_100_max value: 48.5313 - type: nauc_ndcg_at_100_std value: 11.2393 - type: nauc_ndcg_at_100_diff1 value: 47.497299999999996 - type: nauc_ndcg_at_1000_max value: 48.4189 - type: nauc_ndcg_at_1000_std value: 10.857700000000001 - type: nauc_ndcg_at_1000_diff1 value: 47.9808 - type: nauc_map_at_1_max value: 45.0797 - type: nauc_map_at_1_std value: 1.9601 - type: nauc_map_at_1_diff1 value: 55.33050000000001 - type: nauc_map_at_3_max value: 46.6641 - type: nauc_map_at_3_std value: 3.9848000000000003 - type: nauc_map_at_3_diff1 value: 51.4752 - type: nauc_map_at_5_max value: 47.2652 - type: nauc_map_at_5_std value: 5.0378 - type: nauc_map_at_5_diff1 value: 51.3051 - type: nauc_map_at_10_max value: 47.3629 - type: nauc_map_at_10_std value: 5.4796 - type: nauc_map_at_10_diff1 value: 50.43450000000001 - type: nauc_map_at_20_max value: 47.5858 - type: nauc_map_at_20_std value: 6.4494 - type: nauc_map_at_20_diff1 value: 50.3333 - type: nauc_map_at_100_max value: 47.6506 - type: nauc_map_at_100_std value: 7.1591000000000005 - type: nauc_map_at_100_diff1 value: 50.138000000000005 - type: nauc_map_at_1000_max value: 47.516999999999996 - type: nauc_map_at_1000_std value: 7.2322 - type: nauc_map_at_1000_diff1 value: 50.132299999999994 - type: nauc_recall_at_1_max value: 45.0797 - type: nauc_recall_at_1_std value: 1.9601 - type: nauc_recall_at_1_diff1 value: 55.33050000000001 - type: nauc_recall_at_3_max value: 44.9897 - type: nauc_recall_at_3_std value: 5.6308 - type: nauc_recall_at_3_diff1 value: 46.6793 - type: nauc_recall_at_5_max value: 46.6283 - type: nauc_recall_at_5_std value: 9.998999999999999 - type: nauc_recall_at_5_diff1 value: 45.9247 - type: nauc_recall_at_10_max value: 44.714 - type: nauc_recall_at_10_std value: 10.8319 - type: nauc_recall_at_10_diff1 value: 40.291900000000005 - type: nauc_recall_at_20_max value: 46.361200000000004 - type: nauc_recall_at_20_std value: 17.9809 - type: nauc_recall_at_20_diff1 value: 42.4004 - type: nauc_recall_at_100_max value: 48.9864 - type: nauc_recall_at_100_std value: 31.7118 - type: nauc_recall_at_100_diff1 value: 30.9676 - type: nauc_recall_at_1000_max value: 59.9606 - type: nauc_recall_at_1000_std value: 64.66229999999999 - type: nauc_recall_at_1000_diff1 value: 27.669 - type: nauc_precision_at_1_max value: 47.905300000000004 - type: nauc_precision_at_1_std value: 7.8307 - type: nauc_precision_at_1_diff1 value: 51.3311 - type: nauc_precision_at_3_max value: 38.4644 - type: nauc_precision_at_3_std value: 11.7975 - type: 
nauc_precision_at_3_diff1 value: 27.7451 - type: nauc_precision_at_5_max value: 36.8955 - type: nauc_precision_at_5_std value: 17.702399999999997 - type: nauc_precision_at_5_diff1 value: 24.6268 - type: nauc_precision_at_10_max value: 26.5975 - type: nauc_precision_at_10_std value: 22.3993 - type: nauc_precision_at_10_diff1 value: 8.6213 - type: nauc_precision_at_20_max value: 17.3127 - type: nauc_precision_at_20_std value: 24.7139 - type: nauc_precision_at_20_diff1 value: 1.3941000000000001 - type: nauc_precision_at_100_max value: -0.882 - type: nauc_precision_at_100_std value: 24.5949 - type: nauc_precision_at_100_diff1 value: -10.3409 - type: nauc_precision_at_1000_max value: -15.3829 - type: nauc_precision_at_1000_std value: 15.4108 - type: nauc_precision_at_1000_diff1 value: -19.8547 - type: nauc_mrr_at_1_max value: 47.905300000000004 - type: nauc_mrr_at_1_std value: 7.8307 - type: nauc_mrr_at_1_diff1 value: 51.3311 - type: nauc_mrr_at_3_max value: 46.6702 - type: nauc_mrr_at_3_std value: 8.4343 - type: nauc_mrr_at_3_diff1 value: 47.7232 - type: nauc_mrr_at_5_max value: 47.439 - type: nauc_mrr_at_5_std value: 9.8287 - type: nauc_mrr_at_5_diff1 value: 48.2284 - type: nauc_mrr_at_10_max value: 47.477000000000004 - type: nauc_mrr_at_10_std value: 9.9349 - type: nauc_mrr_at_10_diff1 value: 47.7388 - type: nauc_mrr_at_20_max value: 47.5871 - type: nauc_mrr_at_20_std value: 10.137400000000001 - type: nauc_mrr_at_20_diff1 value: 47.949000000000005 - type: nauc_mrr_at_100_max value: 47.5206 - type: nauc_mrr_at_100_std value: 10.0871 - type: nauc_mrr_at_100_diff1 value: 47.875299999999996 - type: nauc_mrr_at_1000_max value: 47.5212 - type: nauc_mrr_at_1000_std value: 10.0739 - type: nauc_mrr_at_1000_diff1 value: 47.8953 - type: main_score value: 47.303 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackWordpressRetrieval (default) revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4 split: test type: mteb/cqadupstack-wordpress metrics: - type: ndcg_at_1 value: 29.759999999999998 - type: ndcg_at_3 value: 33.824 - type: ndcg_at_5 value: 36.766 - type: ndcg_at_10 value: 39.902 - type: ndcg_at_20 value: 41.618 - type: ndcg_at_100 value: 44.983000000000004 - type: ndcg_at_1000 value: 46.938 - type: map_at_1 value: 27.181 - type: map_at_3 value: 31.526 - type: map_at_5 value: 33.397 - type: map_at_10 value: 34.766999999999996 - type: map_at_20 value: 35.244 - type: map_at_100 value: 35.757 - type: map_at_1000 value: 35.836 - type: recall_at_1 value: 27.181 - type: recall_at_3 value: 37.19 - type: recall_at_5 value: 44.153999999999996 - type: recall_at_10 value: 53.705000000000005 - type: recall_at_20 value: 60.22 - type: recall_at_100 value: 77.39200000000001 - type: recall_at_1000 value: 91.77 - type: precision_at_1 value: 29.759999999999998 - type: precision_at_3 value: 13.925 - type: precision_at_5 value: 10.24 - type: precision_at_10 value: 6.265999999999999 - type: precision_at_20 value: 3.549 - type: precision_at_100 value: 0.9520000000000001 - type: precision_at_1000 value: 0.122 - type: mrr_at_1 value: 29.7597 - type: mrr_at_3 value: 34.4732 - type: mrr_at_5 value: 35.915 - type: mrr_at_10 value: 37.1488 - type: mrr_at_20 value: 37.637100000000004 - type: mrr_at_100 value: 38.0403 - type: mrr_at_1000 value: 38.096999999999994 - type: nauc_ndcg_at_1_max value: 35.7865 - type: nauc_ndcg_at_1_std value: 1.9512 - type: nauc_ndcg_at_1_diff1 value: 54.9311 - type: nauc_ndcg_at_3_max value: 32.6952 - type: nauc_ndcg_at_3_std value: 6.2215 - type: nauc_ndcg_at_3_diff1 value: 
48.2731
(remaining nAUC diagnostics for the dataset above omitted; main_score: 39.902, task: Retrieval)

MTEB evaluation results, condensed from the auto-generated model-index metadata (test split unless noted; the per-cutoff precision/recall values and the nAUC diagnostic breakdowns are not reproduced):

Retrieval (main score = nDCG@10):

| Dataset | nDCG@1 | nDCG@10 | MAP@10 | Recall@100 | MRR@10 |
|:--------|:------:|:-------:|:------:|:----------:|:------:|
| ClimateFEVER | 40.13 | 38.31 | 28.92 | 64.39 | 51.42 |
| DBPedia | 57.88 | 43.88 | 20.90 | 53.08 | 78.28 |
| FEVER | 89.81 | 91.55 | 88.90 | 96.64 | 94.07 |
| FiQA2018 | 44.91 | 43.96 | 36.11 | 75.08 | 52.61 |
| HotpotQA | 85.96 | 72.40 | 64.18 | 81.95 | 89.86 |
| MSMARCO (dev split) | 24.91 | 44.04 | 37.08 | 90.78 | 37.65 |
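Most of the rows above are scored with nDCG@k, which discounts each relevant result by its rank and normalizes by the best possible ordering. A minimal sketch of the common trec_eval-style linear-gain variant, with made-up relevance grades:

```python
import math

def dcg_at_k(relevances, k):
    # Gain of each of the top-k results, discounted by log2(rank + 2)
    # so the first result (rank 0) is divided by log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevance of the top 5 retrieved documents.
print(round(ndcg_at_k([3, 2, 0, 1, 2], k=5), 4))
```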
Classification (main score = accuracy):

| Dataset | Accuracy | F1 | Weighted F1 |
|:--------|:--------:|:--:|:-----------:|
| EmotionClassification | 42.39 | 38.26 | 44.67 |
| ImdbClassification | 69.09 | 68.99 | 68.99 |
| MTOPDomainClassification (en) | 92.09 | 91.81 | 92.08 |
| MTOPIntentClassification (en) | 65.27 | 44.50 | 67.92 |
| MassiveIntentClassification (en) | 68.01 | 64.40 | 67.47 |
| MassiveScenarioClassification (en) | 72.67 | 71.79 | 72.91 |

ImdbClassification additionally reports AP 63.32.

Clustering (main score = V-measure):

| Dataset | V-measure | Std |
|:--------|:---------:|:---:|
| MedrxivClusteringP2P | 31.58 | 1.37 |
| MedrxivClusteringS2S | 28.01 | 1.42 |
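For the clustering tasks, MTEB embeds the documents, clusters the embeddings (k-means style), and scores the assignment against gold labels with V-measure, the harmonic mean of homogeneity and completeness. A small illustration using scikit-learn's `v_measure_score`; the labels here are made up:

```python
from sklearn.metrics import v_measure_score

labels_true = [0, 0, 1, 1, 2, 2]  # gold category ids
labels_pred = [1, 1, 0, 0, 2, 2]  # cluster ids; the metric is permutation-invariant

print(v_measure_score(labels_true, labels_pred))  # 1.0: a perfect clustering
```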
Reranking:

| Dataset | MAP (main) | MRR |
|:--------|:----------:|:---:|
| MindSmallReranking | 30.30 | 31.17 |

Retrieval, continued (main score = nDCG@10; further retrieval results follow the sketch below):

| Dataset | nDCG@1 | nDCG@10 | MAP@10 | Recall@100 | MRR@10 |
|:--------|:------:|:-------:|:------:|:----------:|:------:|
| NFCorpus | 45.05 | 35.88 | 13.27 | 31.73 | 56.05 |
| NQ | 45.39 | 64.60 | 57.16 | 96.85 | 59.50 |
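Per-dataset scores like these can generally be regenerated with the `mteb` evaluation harness. A minimal sketch, assuming the classic mteb 1.x API; `"org/embedding-model"` is a placeholder for whatever model id a card like this one describes:

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("org/embedding-model")       # placeholder model id
evaluation = MTEB(tasks=["NFCorpus", "NQ", "SCIDOCS"])   # task names as listed here
results = evaluation.run(model, output_folder="results")
print(results)
```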
Retrieval, continued:

| Dataset | nDCG@1 | nDCG@10 | MAP@10 | Recall@100 | MRR@10 |
|:--------|:------:|:-------:|:------:|:----------:|:------:|
| QuoraRetrieval | 82.01 | 88.74 | 85.06 | 99.50 | 88.03 |
| SCIDOCS | 23.70 | 20.35 | 12.11 | 43.34 | 34.94 |

Clustering, continued:

| Dataset | V-measure | Std |
|:--------|:---------:|:---:|
| RedditClustering | 49.13 | 4.55 |
| RedditClusteringP2P | 61.06 | 12.64 |
nauc_recall_at_1000_diff1 value: 6.0907 - type: nauc_precision_at_1_max value: 20.214499999999997 - type: nauc_precision_at_1_std value: 7.2459999999999996 - type: nauc_precision_at_1_diff1 value: 26.8353 - type: nauc_precision_at_3_max value: 23.8245 - type: nauc_precision_at_3_std value: 12.2589 - type: nauc_precision_at_3_diff1 value: 18.192800000000002 - type: nauc_precision_at_5_max value: 25.3681 - type: nauc_precision_at_5_std value: 15.947700000000001 - type: nauc_precision_at_5_diff1 value: 16.6931 - type: nauc_precision_at_10_max value: 28.2682 - type: nauc_precision_at_10_std value: 20.2673 - type: nauc_precision_at_10_diff1 value: 15.8977 - type: nauc_precision_at_20_max value: 29.3989 - type: nauc_precision_at_20_std value: 24.5769 - type: nauc_precision_at_20_diff1 value: 14.1994 - type: nauc_precision_at_100_max value: 31.418000000000003 - type: nauc_precision_at_100_std value: 32.0978 - type: nauc_precision_at_100_diff1 value: 12.768199999999998 - type: nauc_precision_at_1000_max value: 25.501099999999997 - type: nauc_precision_at_1000_std value: 36.477399999999996 - type: nauc_precision_at_1000_diff1 value: 5.5335 - type: nauc_mrr_at_1_max value: 20.214499999999997 - type: nauc_mrr_at_1_std value: 7.2459999999999996 - type: nauc_mrr_at_1_diff1 value: 26.8353 - type: nauc_mrr_at_3_max value: 22.7925 - type: nauc_mrr_at_3_std value: 10.6945 - type: nauc_mrr_at_3_diff1 value: 23.6308 - type: nauc_mrr_at_5_max value: 23.427799999999998 - type: nauc_mrr_at_5_std value: 11.8634 - type: nauc_mrr_at_5_diff1 value: 23.0875 - type: nauc_mrr_at_10_max value: 24.0918 - type: nauc_mrr_at_10_std value: 12.4753 - type: nauc_mrr_at_10_diff1 value: 23.352999999999998 - type: nauc_mrr_at_20_max value: 24.078 - type: nauc_mrr_at_20_std value: 12.5849 - type: nauc_mrr_at_20_diff1 value: 23.3351 - type: nauc_mrr_at_100_max value: 24.0858 - type: nauc_mrr_at_100_std value: 12.5772 - type: nauc_mrr_at_100_diff1 value: 23.4778 - type: nauc_mrr_at_1000_max value: 24.058799999999998 - type: nauc_mrr_at_1000_std value: 12.549 - type: nauc_mrr_at_1000_diff1 value: 23.4713 - type: main_score value: 20.347 task: type: Retrieval - dataset: config: default name: MTEB SICK-R (default) revision: 20a6d6f312dd54037fe07a32d58e5e168867909d split: test type: mteb/sickr-sts metrics: - type: pearson value: 75.7747 - type: spearman value: 71.3142 - type: cosine_pearson value: 75.7747 - type: cosine_spearman value: 71.3142 - type: manhattan_pearson value: 73.8759 - type: manhattan_spearman value: 71.1003 - type: euclidean_pearson value: 74.088 - type: euclidean_spearman value: 71.3142 - type: main_score value: 71.3142 task: type: STS - dataset: config: default name: MTEB STS12 (default) revision: a0d554a64d88156834ff5ae9920b964011b16384 split: test type: mteb/sts12-sts metrics: - type: pearson value: 72.5903 - type: spearman value: 70.6581 - type: cosine_pearson value: 72.5903 - type: cosine_spearman value: 70.6581 - type: manhattan_pearson value: 69.2077 - type: manhattan_spearman value: 70.4521 - type: euclidean_pearson value: 69.41720000000001 - type: euclidean_spearman value: 70.6581 - type: main_score value: 70.6581 task: type: STS - dataset: config: default name: MTEB STS13 (default) revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca split: test type: mteb/sts13-sts metrics: - type: pearson value: 73.1686 - type: spearman value: 77.4225 - type: cosine_pearson value: 73.1686 - type: cosine_spearman value: 77.4225 - type: manhattan_pearson value: 76.2481 - type: manhattan_spearman value: 77.325 - type: 
euclidean_pearson value: 76.3568 - type: euclidean_spearman value: 77.4225 - type: main_score value: 77.4225 task: type: STS - dataset: config: default name: MTEB STS14 (default) revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 split: test type: mteb/sts14-sts metrics: - type: pearson value: 74.46340000000001 - type: spearman value: 72.9162 - type: cosine_pearson value: 74.46340000000001 - type: cosine_spearman value: 72.9162 - type: manhattan_pearson value: 73.8079 - type: manhattan_spearman value: 72.8704 - type: euclidean_pearson value: 73.8244 - type: euclidean_spearman value: 72.9162 - type: main_score value: 72.9162 task: type: STS - dataset: config: default name: MTEB STS15 (default) revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 split: test type: mteb/sts15-sts metrics: - type: pearson value: 80.1161 - type: spearman value: 81.83200000000001 - type: cosine_pearson value: 80.1161 - type: cosine_spearman value: 81.83200000000001 - type: manhattan_pearson value: 81.573 - type: manhattan_spearman value: 81.807 - type: euclidean_pearson value: 81.59490000000001 - type: euclidean_spearman value: 81.83200000000001 - type: main_score value: 81.83200000000001 task: type: STS - dataset: config: default name: MTEB STS16 (default) revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 split: test type: mteb/sts16-sts metrics: - type: pearson value: 78.8244 - type: spearman value: 81.2262 - type: cosine_pearson value: 78.8244 - type: cosine_spearman value: 81.2262 - type: manhattan_pearson value: 80.6177 - type: manhattan_spearman value: 81.1361 - type: euclidean_pearson value: 80.7347 - type: euclidean_spearman value: 81.2262 - type: main_score value: 81.2262 task: type: STS - dataset: config: es-en name: MTEB STS17 (es-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 67.9751 - type: spearman value: 68.92099999999999 - type: cosine_pearson value: 67.9751 - type: cosine_spearman value: 68.92099999999999 - type: manhattan_pearson value: 68.9355 - type: manhattan_spearman value: 68.777 - type: euclidean_pearson value: 69.11410000000001 - type: euclidean_spearman value: 68.92099999999999 - type: main_score value: 68.92099999999999 task: type: STS - dataset: config: fr-en name: MTEB STS17 (fr-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 72.08449999999999 - type: spearman value: 74.6931 - type: cosine_pearson value: 72.08449999999999 - type: cosine_spearman value: 74.6931 - type: manhattan_pearson value: 73.52 - type: manhattan_spearman value: 74.7097 - type: euclidean_pearson value: 73.62180000000001 - type: euclidean_spearman value: 74.6931 - type: main_score value: 74.6931 task: type: STS - dataset: config: en-en name: MTEB STS17 (en-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 80.528 - type: spearman value: 84.10459999999999 - type: cosine_pearson value: 80.528 - type: cosine_spearman value: 84.10459999999999 - type: manhattan_pearson value: 83.1537 - type: manhattan_spearman value: 84.0952 - type: euclidean_pearson value: 83.337 - type: euclidean_spearman value: 84.10459999999999 - type: main_score value: 84.10459999999999 task: type: STS - dataset: config: en-tr name: MTEB STS17 (en-tr) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 49.641400000000004 - 
type: spearman value: 48.9413 - type: cosine_pearson value: 49.641400000000004 - type: cosine_spearman value: 48.9413 - type: manhattan_pearson value: 51.434000000000005 - type: manhattan_spearman value: 49.1595 - type: euclidean_pearson value: 50.867799999999995 - type: euclidean_spearman value: 48.9413 - type: main_score value: 48.9413 task: type: STS - dataset: config: it-en name: MTEB STS17 (it-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 71.2577 - type: spearman value: 73.82419999999999 - type: cosine_pearson value: 71.2577 - type: cosine_spearman value: 73.82419999999999 - type: manhattan_pearson value: 71.9329 - type: manhattan_spearman value: 73.4651 - type: euclidean_pearson value: 72.2771 - type: euclidean_spearman value: 73.82419999999999 - type: main_score value: 73.82419999999999 task: type: STS - dataset: config: nl-en name: MTEB STS17 (nl-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 64.1562 - type: spearman value: 64.8766 - type: cosine_pearson value: 64.1562 - type: cosine_spearman value: 64.8766 - type: manhattan_pearson value: 64.16579999999999 - type: manhattan_spearman value: 64.1931 - type: euclidean_pearson value: 64.6169 - type: euclidean_spearman value: 64.8766 - type: main_score value: 64.8766 task: type: STS - dataset: config: en-ar name: MTEB STS17 (en-ar) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 42.257400000000004 - type: spearman value: 43.2176 - type: cosine_pearson value: 42.257400000000004 - type: cosine_spearman value: 43.2176 - type: manhattan_pearson value: 43.5359 - type: manhattan_spearman value: 42.4143 - type: euclidean_pearson value: 43.6717 - type: euclidean_spearman value: 43.2176 - type: main_score value: 43.2176 task: type: STS - dataset: config: en-de name: MTEB STS17 (en-de) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: pearson value: 74.0088 - type: spearman value: 75.8687 - type: cosine_pearson value: 74.0088 - type: cosine_spearman value: 75.8687 - type: manhattan_pearson value: 74.8505 - type: manhattan_spearman value: 75.6101 - type: euclidean_pearson value: 75.1303 - type: euclidean_spearman value: 75.8687 - type: main_score value: 75.8687 task: type: STS - dataset: config: zh-en name: MTEB STS22 (zh-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: pearson value: 68.0842 - type: spearman value: 69.4346 - type: cosine_pearson value: 68.0842 - type: cosine_spearman value: 69.4346 - type: manhattan_pearson value: 69.9982 - type: manhattan_spearman value: 69.8952 - type: euclidean_pearson value: 69.6375 - type: euclidean_spearman value: 69.4346 - type: main_score value: 69.4346 task: type: STS - dataset: config: es-en name: MTEB STS22 (es-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: pearson value: 76.3695 - type: spearman value: 78.88730000000001 - type: cosine_pearson value: 76.3695 - type: cosine_spearman value: 78.88730000000001 - type: manhattan_pearson value: 79.0721 - type: manhattan_spearman value: 79.1151 - type: euclidean_pearson value: 78.783 - type: euclidean_spearman value: 78.88730000000001 - type: main_score value: 78.88730000000001 task: type: STS - dataset: 
config: de-en name: MTEB STS22 (de-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: pearson value: 60.59139999999999 - type: spearman value: 52.692099999999996 - type: cosine_pearson value: 60.59139999999999 - type: cosine_spearman value: 52.692099999999996 - type: manhattan_pearson value: 64.66499999999999 - type: manhattan_spearman value: 53.09009999999999 - type: euclidean_pearson value: 64.5541 - type: euclidean_spearman value: 52.692099999999996 - type: main_score value: 52.692099999999996 task: type: STS - dataset: config: pl-en name: MTEB STS22 (pl-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: pearson value: 77.8405 - type: spearman value: 76.6188 - type: cosine_pearson value: 77.8405 - type: cosine_spearman value: 76.6188 - type: manhattan_pearson value: 76.6598 - type: manhattan_spearman value: 76.3583 - type: euclidean_pearson value: 77.1442 - type: euclidean_spearman value: 76.6188 - type: main_score value: 76.6188 task: type: STS - dataset: config: en name: MTEB STS22 (en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: pearson value: 69.8017 - type: spearman value: 68.7734 - type: cosine_pearson value: 69.8017 - type: cosine_spearman value: 68.7734 - type: manhattan_pearson value: 70.6884 - type: manhattan_spearman value: 68.2974 - type: euclidean_pearson value: 70.7968 - type: euclidean_spearman value: 68.7734 - type: main_score value: 68.7734 task: type: STS - dataset: config: default name: MTEB STSBenchmark (default) revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 split: test type: mteb/stsbenchmark-sts metrics: - type: pearson value: 73.3293 - type: spearman value: 76.00919999999999 - type: cosine_pearson value: 73.3293 - type: cosine_spearman value: 76.00919999999999 - type: manhattan_pearson value: 75.0184 - type: manhattan_spearman value: 75.8014 - type: euclidean_pearson value: 75.2638 - type: euclidean_spearman value: 76.00919999999999 - type: main_score value: 76.00919999999999 task: type: STS - dataset: config: default name: MTEB SciDocsRR (default) revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab split: test type: mteb/scidocs-reranking metrics: - type: map value: 77.3669 - type: mrr value: 93.5985 - type: nAUC_map_max value: 50.2355 - type: nAUC_map_std value: 65.5401 - type: nAUC_map_diff1 value: 9.6333 - type: nAUC_mrr_max value: 76.5201 - type: nAUC_mrr_std value: 74.7401 - type: nAUC_mrr_diff1 value: 53.170899999999996 - type: main_score value: 77.3669 task: type: Reranking - dataset: config: default name: MTEB SciFact (default) revision: 0228b52cf27578f30900b9e5271d331663a030d7 split: test type: mteb/scifact metrics: - type: ndcg_at_1 value: 61.0 - type: ndcg_at_3 value: 67.589 - type: ndcg_at_5 value: 68.948 - type: ndcg_at_10 value: 71.8 - type: ndcg_at_20 value: 72.595 - type: ndcg_at_100 value: 74.138 - type: ndcg_at_1000 value: 74.83800000000001 - type: map_at_1 value: 57.74399999999999 - type: map_at_3 value: 64.866 - type: map_at_5 value: 66.018 - type: map_at_10 value: 67.535 - type: map_at_20 value: 67.77 - type: map_at_100 value: 68.011 - type: map_at_1000 value: 68.042 - type: recall_at_1 value: 57.74399999999999 - type: recall_at_3 value: 71.906 - type: recall_at_5 value: 75.344 - type: recall_at_10 value: 83.2 - type: recall_at_20 value: 86.26700000000001 - type: recall_at_100 value: 94.333 - type: recall_at_1000 value: 99.667 - type: 
precision_at_1 value: 61.0 - type: precision_at_3 value: 26.111 - type: precision_at_5 value: 16.8 - type: precision_at_10 value: 9.5 - type: precision_at_20 value: 4.933 - type: precision_at_100 value: 1.073 - type: precision_at_1000 value: 0.11299999999999999 - type: mrr_at_1 value: 61.0 - type: mrr_at_3 value: 67.4444 - type: mrr_at_5 value: 68.0778 - type: mrr_at_10 value: 69.0483 - type: mrr_at_20 value: 69.2333 - type: mrr_at_100 value: 69.4403 - type: mrr_at_1000 value: 69.4708 - type: nauc_ndcg_at_1_max value: 53.481500000000004 - type: nauc_ndcg_at_1_std value: 8.227 - type: nauc_ndcg_at_1_diff1 value: 72.0771 - type: nauc_ndcg_at_3_max value: 57.0147 - type: nauc_ndcg_at_3_std value: 5.2435 - type: nauc_ndcg_at_3_diff1 value: 68.8841 - type: nauc_ndcg_at_5_max value: 57.4675 - type: nauc_ndcg_at_5_std value: 8.4709 - type: nauc_ndcg_at_5_diff1 value: 67.2977 - type: nauc_ndcg_at_10_max value: 60.3957 - type: nauc_ndcg_at_10_std value: 11.3174 - type: nauc_ndcg_at_10_diff1 value: 67.8332 - type: nauc_ndcg_at_20_max value: 60.3607 - type: nauc_ndcg_at_20_std value: 11.9948 - type: nauc_ndcg_at_20_diff1 value: 68.1122 - type: nauc_ndcg_at_100_max value: 59.5293 - type: nauc_ndcg_at_100_std value: 11.697799999999999 - type: nauc_ndcg_at_100_diff1 value: 68.453 - type: nauc_ndcg_at_1000_max value: 58.8931 - type: nauc_ndcg_at_1000_std value: 10.876199999999999 - type: nauc_ndcg_at_1000_diff1 value: 68.5746 - type: nauc_map_at_1_max value: 49.762299999999996 - type: nauc_map_at_1_std value: -0.2785 - type: nauc_map_at_1_diff1 value: 71.9072 - type: nauc_map_at_3_max value: 54.108599999999996 - type: nauc_map_at_3_std value: 2.0995 - type: nauc_map_at_3_diff1 value: 69.3459 - type: nauc_map_at_5_max value: 55.257 - type: nauc_map_at_5_std value: 5.5776 - type: nauc_map_at_5_diff1 value: 68.3314 - type: nauc_map_at_10_max value: 57.1506 - type: nauc_map_at_10_std value: 7.4561 - type: nauc_map_at_10_diff1 value: 68.8482 - type: nauc_map_at_20_max value: 57.126200000000004 - type: nauc_map_at_20_std value: 7.6833 - type: nauc_map_at_20_diff1 value: 68.9132 - type: nauc_map_at_100_max value: 56.9874 - type: nauc_map_at_100_std value: 7.7405 - type: nauc_map_at_100_diff1 value: 68.9371 - type: nauc_map_at_1000_max value: 56.959199999999996 - type: nauc_map_at_1000_std value: 7.709499999999999 - type: nauc_map_at_1000_diff1 value: 68.9444 - type: nauc_recall_at_1_max value: 49.762299999999996 - type: nauc_recall_at_1_std value: -0.2785 - type: nauc_recall_at_1_diff1 value: 71.9072 - type: nauc_recall_at_3_max value: 58.22580000000001 - type: nauc_recall_at_3_std value: 2.3135 - type: nauc_recall_at_3_diff1 value: 65.5868 - type: nauc_recall_at_5_max value: 60.4096 - type: nauc_recall_at_5_std value: 11.7662 - type: nauc_recall_at_5_diff1 value: 61.5815 - type: nauc_recall_at_10_max value: 72.74629999999999 - type: nauc_recall_at_10_std value: 22.148 - type: nauc_recall_at_10_diff1 value: 62.2401 - type: nauc_recall_at_20_max value: 74.9625 - type: nauc_recall_at_20_std value: 28.1358 - type: nauc_recall_at_20_diff1 value: 63.240700000000004 - type: nauc_recall_at_100_max value: 79.15910000000001 - type: nauc_recall_at_100_std value: 39.4162 - type: nauc_recall_at_100_diff1 value: 65.733 - type: nauc_recall_at_1000_max value: 100.0 - type: nauc_recall_at_1000_std value: 72.2222 - type: nauc_recall_at_1000_diff1 value: 72.2222 - type: nauc_precision_at_1_max value: 53.481500000000004 - type: nauc_precision_at_1_std value: 8.227 - type: nauc_precision_at_1_diff1 value: 72.0771 - type: 
nauc_precision_at_3_max value: 55.675799999999995 - type: nauc_precision_at_3_std value: 23.9615 - type: nauc_precision_at_3_diff1 value: 48.1199 - type: nauc_precision_at_5_max value: 50.503299999999996 - type: nauc_precision_at_5_std value: 36.9259 - type: nauc_precision_at_5_diff1 value: 31.769399999999997 - type: nauc_precision_at_10_max value: 45.4878 - type: nauc_precision_at_10_std value: 44.0469 - type: nauc_precision_at_10_diff1 value: 16.666900000000002 - type: nauc_precision_at_20_max value: 40.2908 - type: nauc_precision_at_20_std value: 47.330600000000004 - type: nauc_precision_at_20_diff1 value: 11.0043 - type: nauc_precision_at_100_max value: 27.4643 - type: nauc_precision_at_100_std value: 53.0014 - type: nauc_precision_at_100_diff1 value: -4.8238 - type: nauc_precision_at_1000_max value: 15.755099999999999 - type: nauc_precision_at_1000_std value: 56.634499999999996 - type: nauc_precision_at_1000_diff1 value: -21.124100000000002 - type: nauc_mrr_at_1_max value: 53.481500000000004 - type: nauc_mrr_at_1_std value: 8.227 - type: nauc_mrr_at_1_diff1 value: 72.0771 - type: nauc_mrr_at_3_max value: 57.6662 - type: nauc_mrr_at_3_std value: 9.2816 - type: nauc_mrr_at_3_diff1 value: 69.8276 - type: nauc_mrr_at_5_max value: 57.6565 - type: nauc_mrr_at_5_std value: 10.422099999999999 - type: nauc_mrr_at_5_diff1 value: 69.0964 - type: nauc_mrr_at_10_max value: 58.000099999999996 - type: nauc_mrr_at_10_std value: 10.957600000000001 - type: nauc_mrr_at_10_diff1 value: 69.0098 - type: nauc_mrr_at_20_max value: 58.0066 - type: nauc_mrr_at_20_std value: 11.0139 - type: nauc_mrr_at_20_diff1 value: 69.1278 - type: nauc_mrr_at_100_max value: 57.9072 - type: nauc_mrr_at_100_std value: 10.9621 - type: nauc_mrr_at_100_diff1 value: 69.1925 - type: nauc_mrr_at_1000_max value: 57.87949999999999 - type: nauc_mrr_at_1000_std value: 10.934199999999999 - type: nauc_mrr_at_1000_diff1 value: 69.2004 - type: main_score value: 71.8 task: type: Retrieval - dataset: config: default name: MTEB SprintDuplicateQuestions (default) revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 split: test type: mteb/sprintduplicatequestions-pairclassification metrics: - type: similarity_accuracy value: 99.8248 - type: similarity_accuracy_threshold value: 74.6155 - type: similarity_f1 value: 91.12780000000001 - type: similarity_f1_threshold value: 74.2422 - type: similarity_precision value: 91.3568 - type: similarity_recall value: 90.9 - type: similarity_ap value: 96.00319999999999 - type: cosine_accuracy value: 99.8248 - type: cosine_accuracy_threshold value: 74.6155 - type: cosine_f1 value: 91.12780000000001 - type: cosine_f1_threshold value: 74.2422 - type: cosine_precision value: 91.3568 - type: cosine_recall value: 90.9 - type: cosine_ap value: 96.00319999999999 - type: manhattan_accuracy value: 99.8257 - type: manhattan_accuracy_threshold value: 1574.1653 - type: manhattan_f1 value: 91.1531 - type: manhattan_f1_threshold value: 1595.7924 - type: manhattan_precision value: 90.6126 - type: manhattan_recall value: 91.7 - type: manhattan_ap value: 95.9848 - type: euclidean_accuracy value: 99.8248 - type: euclidean_accuracy_threshold value: 71.2523 - type: euclidean_f1 value: 91.12780000000001 - type: euclidean_f1_threshold value: 71.7744 - type: euclidean_precision value: 91.3568 - type: euclidean_recall value: 90.9 - type: euclidean_ap value: 96.00319999999999 - type: dot_accuracy value: 99.8248 - type: dot_accuracy_threshold value: 74.6155 - type: dot_f1 value: 91.12780000000001 - type: dot_f1_threshold value: 74.2422 - 
type: dot_precision value: 91.3568 - type: dot_recall value: 90.9 - type: dot_ap value: 96.00319999999999 - type: max_accuracy value: 99.8257 - type: max_f1 value: 91.1531 - type: max_precision value: 91.3568 - type: max_recall value: 91.7 - type: max_ap value: 96.00319999999999 - type: main_score value: 96.00319999999999 task: type: PairClassification - dataset: config: default name: MTEB StackExchangeClustering (default) revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 split: test type: mteb/stackexchange-clustering metrics: - type: v_measure value: 61.3985 - type: v_measure_std value: 5.2151000000000005 - type: main_score value: 61.3985 task: type: Clustering - dataset: config: default name: MTEB StackExchangeClusteringP2P (default) revision: 815ca46b2622cec33ccafc3735d572c266efdb44 split: test type: mteb/stackexchange-clustering-p2p metrics: - type: v_measure value: 36.1433 - type: v_measure_std value: 1.5853 - type: main_score value: 36.1433 task: type: Clustering - dataset: config: default name: MTEB StackOverflowDupQuestions (default) revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 split: test type: mteb/stackoverflowdupquestions-reranking metrics: - type: map value: 50.47580000000001 - type: mrr value: 51.221399999999996 - type: nAUC_map_max value: 10.1311 - type: nAUC_map_std value: 6.239999999999999 - type: nAUC_map_diff1 value: 36.3486 - type: nAUC_mrr_max value: 10.9306 - type: nAUC_mrr_std value: 6.7909 - type: nAUC_mrr_diff1 value: 36.5536 - type: main_score value: 50.47580000000001 task: type: Reranking - dataset: config: default name: MTEB SummEval (default) revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c split: test type: mteb/summeval metrics: - type: pearson value: 29.8474 - type: spearman value: 29.391099999999998 - type: cosine_spearman value: 29.391099999999998 - type: cosine_pearson value: 29.8474 - type: dot_spearman value: 29.391099999999998 - type: dot_pearson value: 29.8474 - type: main_score value: 29.391099999999998 task: type: Summarization - dataset: config: default name: MTEB TRECCOVID (default) revision: bb9466bac8153a0349341eb1b22e06409e78ef4e split: test type: mteb/trec-covid metrics: - type: ndcg_at_1 value: 85.0 - type: ndcg_at_3 value: 84.58099999999999 - type: ndcg_at_5 value: 83.573 - type: ndcg_at_10 value: 80.285 - type: ndcg_at_20 value: 77.469 - type: ndcg_at_100 value: 63.524 - type: ndcg_at_1000 value: 56.839 - type: map_at_1 value: 0.22799999999999998 - type: map_at_3 value: 0.656 - type: map_at_5 value: 1.078 - type: map_at_10 value: 2.0389999999999997 - type: map_at_20 value: 3.7670000000000003 - type: map_at_100 value: 12.8 - type: map_at_1000 value: 31.575999999999997 - type: recall_at_1 value: 0.22799999999999998 - type: recall_at_3 value: 0.695 - type: recall_at_5 value: 1.151 - type: recall_at_10 value: 2.215 - type: recall_at_20 value: 4.232 - type: recall_at_100 value: 15.828000000000001 - type: recall_at_1000 value: 53.516 - type: precision_at_1 value: 90.0 - type: precision_at_3 value: 89.333 - type: precision_at_5 value: 88.8 - type: precision_at_10 value: 84.6 - type: precision_at_20 value: 81.6 - type: precision_at_100 value: 65.64 - type: precision_at_1000 value: 25.380000000000003 - type: mrr_at_1 value: 90.0 - type: mrr_at_3 value: 94.6667 - type: mrr_at_5 value: 94.6667 - type: mrr_at_10 value: 94.6667 - type: mrr_at_20 value: 94.6667 - type: mrr_at_100 value: 94.6667 - type: mrr_at_1000 value: 94.6667 - type: nauc_ndcg_at_1_max value: -5.4637 - type: nauc_ndcg_at_1_std value: 14.5981 - type: nauc_ndcg_at_1_diff1 
value: 13.6414 - type: nauc_ndcg_at_3_max value: 10.9521 - type: nauc_ndcg_at_3_std value: 39.8204 - type: nauc_ndcg_at_3_diff1 value: -13.839799999999999 - type: nauc_ndcg_at_5_max value: 20.9664 - type: nauc_ndcg_at_5_std value: 50.876999999999995 - type: nauc_ndcg_at_5_diff1 value: -15.3559 - type: nauc_ndcg_at_10_max value: 34.053 - type: nauc_ndcg_at_10_std value: 59.1102 - type: nauc_ndcg_at_10_diff1 value: -23.3868 - type: nauc_ndcg_at_20_max value: 39.5081 - type: nauc_ndcg_at_20_std value: 70.287 - type: nauc_ndcg_at_20_diff1 value: -36.7999 - type: nauc_ndcg_at_100_max value: 38.8671 - type: nauc_ndcg_at_100_std value: 80.5875 - type: nauc_ndcg_at_100_diff1 value: -28.766599999999997 - type: nauc_ndcg_at_1000_max value: 45.4017 - type: nauc_ndcg_at_1000_std value: 73.1799 - type: nauc_ndcg_at_1000_diff1 value: -13.5374 - type: nauc_map_at_1_max value: -15.7901 - type: nauc_map_at_1_std value: -14.5481 - type: nauc_map_at_1_diff1 value: 35.3307 - type: nauc_map_at_3_max value: -4.8114 - type: nauc_map_at_3_std value: -8.3704 - type: nauc_map_at_3_diff1 value: 26.2918 - type: nauc_map_at_5_max value: -0.9780000000000001 - type: nauc_map_at_5_std value: -3.4821 - type: nauc_map_at_5_diff1 value: 25.469 - type: nauc_map_at_10_max value: 4.2075000000000005 - type: nauc_map_at_10_std value: 1.5897999999999999 - type: nauc_map_at_10_diff1 value: 20.0578 - type: nauc_map_at_20_max value: 11.1623 - type: nauc_map_at_20_std value: 13.4387 - type: nauc_map_at_20_diff1 value: 12.9992 - type: nauc_map_at_100_max value: 21.7341 - type: nauc_map_at_100_std value: 51.2629 - type: nauc_map_at_100_diff1 value: 6.3333 - type: nauc_map_at_1000_max value: 45.7524 - type: nauc_map_at_1000_std value: 79.5106 - type: nauc_map_at_1000_diff1 value: -16.2395 - type: nauc_recall_at_1_max value: -15.7901 - type: nauc_recall_at_1_std value: -14.5481 - type: nauc_recall_at_1_diff1 value: 35.3307 - type: nauc_recall_at_3_max value: -3.9641 - type: nauc_recall_at_3_std value: -11.6408 - type: nauc_recall_at_3_diff1 value: 26.243 - type: nauc_recall_at_5_max value: -1.3654 - type: nauc_recall_at_5_std value: -7.7433000000000005 - type: nauc_recall_at_5_diff1 value: 25.5058 - type: nauc_recall_at_10_max value: 0.6649999999999999 - type: nauc_recall_at_10_std value: -5.8116 - type: nauc_recall_at_10_diff1 value: 23.0906 - type: nauc_recall_at_20_max value: 4.398 - type: nauc_recall_at_20_std value: 2.5343999999999998 - type: nauc_recall_at_20_diff1 value: 17.0552 - type: nauc_recall_at_100_max value: 12.8082 - type: nauc_recall_at_100_std value: 32.912400000000005 - type: nauc_recall_at_100_diff1 value: 14.6836 - type: nauc_recall_at_1000_max value: 42.261500000000005 - type: nauc_recall_at_1000_std value: 60.5793 - type: nauc_recall_at_1000_diff1 value: -6.1521 - type: nauc_precision_at_1_max value: -7.077500000000001 - type: nauc_precision_at_1_std value: 19.7572 - type: nauc_precision_at_1_diff1 value: 21.9141 - type: nauc_precision_at_3_max value: 30.758799999999997 - type: nauc_precision_at_3_std value: 53.897099999999995 - type: nauc_precision_at_3_diff1 value: -25.885399999999997 - type: nauc_precision_at_5_max value: 43.5162 - type: nauc_precision_at_5_std value: 66.8874 - type: nauc_precision_at_5_diff1 value: -20.7483 - type: nauc_precision_at_10_max value: 46.7798 - type: nauc_precision_at_10_std value: 63.677499999999995 - type: nauc_precision_at_10_diff1 value: -21.1182 - type: nauc_precision_at_20_max value: 49.8621 - type: nauc_precision_at_20_std value: 79.1937 - type: nauc_precision_at_20_diff1 
value: -38.9691 - type: nauc_precision_at_100_max value: 42.8699 - type: nauc_precision_at_100_std value: 83.7695 - type: nauc_precision_at_100_diff1 value: -26.794 - type: nauc_precision_at_1000_max value: 42.7819 - type: nauc_precision_at_1000_std value: 53.815900000000006 - type: nauc_precision_at_1000_diff1 value: -34.4047 - type: nauc_mrr_at_1_max value: -7.077500000000001 - type: nauc_mrr_at_1_std value: 19.7572 - type: nauc_mrr_at_1_diff1 value: 21.9141 - type: nauc_mrr_at_3_max value: -2.1212999999999997 - type: nauc_mrr_at_3_std value: 21.9859 - type: nauc_mrr_at_3_diff1 value: 25.0584 - type: nauc_mrr_at_5_max value: -2.1212999999999997 - type: nauc_mrr_at_5_std value: 21.9859 - type: nauc_mrr_at_5_diff1 value: 25.0584 - type: nauc_mrr_at_10_max value: -2.1212999999999997 - type: nauc_mrr_at_10_std value: 21.9859 - type: nauc_mrr_at_10_diff1 value: 25.0584 - type: nauc_mrr_at_20_max value: -2.1212999999999997 - type: nauc_mrr_at_20_std value: 21.9859 - type: nauc_mrr_at_20_diff1 value: 25.0584 - type: nauc_mrr_at_100_max value: -2.1212999999999997 - type: nauc_mrr_at_100_std value: 21.9859 - type: nauc_mrr_at_100_diff1 value: 25.0584 - type: nauc_mrr_at_1000_max value: -2.1212999999999997 - type: nauc_mrr_at_1000_std value: 21.9859 - type: nauc_mrr_at_1000_diff1 value: 25.0584 - type: main_score value: 80.285 task: type: Retrieval - dataset: config: default name: MTEB Touche2020 (default) revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f split: test type: mteb/touche2020 metrics: - type: ndcg_at_1 value: 33.672999999999995 - type: ndcg_at_3 value: 34.392 - type: ndcg_at_5 value: 32.606 - type: ndcg_at_10 value: 29.767 - type: ndcg_at_20 value: 30.353 - type: ndcg_at_100 value: 41.094 - type: ndcg_at_1000 value: 51.937 - type: map_at_1 value: 2.64 - type: map_at_3 value: 6.428000000000001 - type: map_at_5 value: 8.792 - type: map_at_10 value: 11.882 - type: map_at_20 value: 14.818000000000001 - type: map_at_100 value: 18.613 - type: map_at_1000 value: 20.233 - type: recall_at_1 value: 2.64 - type: recall_at_3 value: 7.951999999999999 - type: recall_at_5 value: 11.898 - type: recall_at_10 value: 18.782 - type: recall_at_20 value: 27.488 - type: recall_at_100 value: 51.337999999999994 - type: recall_at_1000 value: 84.399 - type: precision_at_1 value: 36.735 - type: precision_at_3 value: 36.735 - type: precision_at_5 value: 33.061 - type: precision_at_10 value: 26.122 - type: precision_at_20 value: 19.898 - type: precision_at_100 value: 8.429 - type: precision_at_1000 value: 1.5650000000000002 - type: mrr_at_1 value: 36.7347 - type: mrr_at_3 value: 51.7007 - type: mrr_at_5 value: 54.65989999999999 - type: mrr_at_10 value: 55.8868 - type: mrr_at_20 value: 56.2944 - type: mrr_at_100 value: 56.360200000000006 - type: mrr_at_1000 value: 56.360200000000006 - type: nauc_ndcg_at_1_max value: -23.0012 - type: nauc_ndcg_at_1_std value: -9.474 - type: nauc_ndcg_at_1_diff1 value: 15.5991 - type: nauc_ndcg_at_3_max value: -16.1454 - type: nauc_ndcg_at_3_std value: -26.226100000000002 - type: nauc_ndcg_at_3_diff1 value: 22.9111 - type: nauc_ndcg_at_5_max value: -20.3259 - type: nauc_ndcg_at_5_std value: -23.3106 - type: nauc_ndcg_at_5_diff1 value: 20.112199999999998 - type: nauc_ndcg_at_10_max value: -17.4616 - type: nauc_ndcg_at_10_std value: -15.5791 - type: nauc_ndcg_at_10_diff1 value: 13.2876 - type: nauc_ndcg_at_20_max value: -20.0683 - type: nauc_ndcg_at_20_std value: -10.979899999999999 - type: nauc_ndcg_at_20_diff1 value: 5.929 - type: nauc_ndcg_at_100_max value: -21.096899999999998 - 
type: nauc_ndcg_at_100_std value: 13.212399999999999 - type: nauc_ndcg_at_100_diff1 value: 3.9886 - type: nauc_ndcg_at_1000_max value: -14.1544 - type: nauc_ndcg_at_1000_std value: 19.5979 - type: nauc_ndcg_at_1000_diff1 value: 1.2742 - type: nauc_map_at_1_max value: -18.123900000000003 - type: nauc_map_at_1_std value: -17.8031 - type: nauc_map_at_1_diff1 value: 21.032899999999998 - type: nauc_map_at_3_max value: -6.7797 - type: nauc_map_at_3_std value: -28.810299999999998 - type: nauc_map_at_3_diff1 value: 16.2912 - type: nauc_map_at_5_max value: -7.620699999999999 - type: nauc_map_at_5_std value: -27.6982 - type: nauc_map_at_5_diff1 value: 14.813100000000002 - type: nauc_map_at_10_max value: -5.1492 - type: nauc_map_at_10_std value: -23.885 - type: nauc_map_at_10_diff1 value: 6.9926 - type: nauc_map_at_20_max value: -9.6331 - type: nauc_map_at_20_std value: -19.215 - type: nauc_map_at_20_diff1 value: 0.6491 - type: nauc_map_at_100_max value: -9.7297 - type: nauc_map_at_100_std value: -6.9502999999999995 - type: nauc_map_at_100_diff1 value: -1.5897999999999999 - type: nauc_map_at_1000_max value: -8.9517 - type: nauc_map_at_1000_std value: -3.9941999999999998 - type: nauc_map_at_1000_diff1 value: -2.8158 - type: nauc_recall_at_1_max value: -18.123900000000003 - type: nauc_recall_at_1_std value: -17.8031 - type: nauc_recall_at_1_diff1 value: 21.032899999999998 - type: nauc_recall_at_3_max value: -12.1006 - type: nauc_recall_at_3_std value: -35.3199 - type: nauc_recall_at_3_diff1 value: 12.044 - type: nauc_recall_at_5_max value: -15.7192 - type: nauc_recall_at_5_std value: -30.7299 - type: nauc_recall_at_5_diff1 value: 8.3249 - type: nauc_recall_at_10_max value: -13.3968 - type: nauc_recall_at_10_std value: -19.2107 - type: nauc_recall_at_10_diff1 value: 0.1315 - type: nauc_recall_at_20_max value: -19.5043 - type: nauc_recall_at_20_std value: -10.005500000000001 - type: nauc_recall_at_20_diff1 value: -7.197299999999999 - type: nauc_recall_at_100_max value: -21.4032 - type: nauc_recall_at_100_std value: 33.5358 - type: nauc_recall_at_100_diff1 value: -10.4876 - type: nauc_recall_at_1000_max value: 1.8395000000000001 - type: nauc_recall_at_1000_std value: 70.462 - type: nauc_recall_at_1000_diff1 value: -23.4072 - type: nauc_precision_at_1_max value: -23.0917 - type: nauc_precision_at_1_std value: -8.036999999999999 - type: nauc_precision_at_1_diff1 value: 19.354599999999998 - type: nauc_precision_at_3_max value: -11.3547 - type: nauc_precision_at_3_std value: -30.2495 - type: nauc_precision_at_3_diff1 value: 20.3126 - type: nauc_precision_at_5_max value: -17.2545 - type: nauc_precision_at_5_std value: -24.8896 - type: nauc_precision_at_5_diff1 value: 15.6276 - type: nauc_precision_at_10_max value: -11.5796 - type: nauc_precision_at_10_std value: -2.3662 - type: nauc_precision_at_10_diff1 value: 3.8091 - type: nauc_precision_at_20_max value: -11.9042 - type: nauc_precision_at_20_std value: 15.6577 - type: nauc_precision_at_20_diff1 value: -8.8878 - type: nauc_precision_at_100_max value: -0.5217 - type: nauc_precision_at_100_std value: 71.8387 - type: nauc_precision_at_100_diff1 value: -16.8714 - type: nauc_precision_at_1000_max value: 36.234300000000005 - type: nauc_precision_at_1000_std value: 37.5447 - type: nauc_precision_at_1000_diff1 value: -20.7229 - type: nauc_mrr_at_1_max value: -23.0917 - type: nauc_mrr_at_1_std value: -8.036999999999999 - type: nauc_mrr_at_1_diff1 value: 19.354599999999998 - type: nauc_mrr_at_3_max value: -27.9937 - type: nauc_mrr_at_3_std value: -26.519900000000003 - 
type: nauc_mrr_at_3_diff1 value: 20.288 - type: nauc_mrr_at_5_max value: -33.218599999999995 - type: nauc_mrr_at_5_std value: -23.857400000000002 - type: nauc_mrr_at_5_diff1 value: 15.978200000000001 - type: nauc_mrr_at_10_max value: -31.7904 - type: nauc_mrr_at_10_std value: -19.169900000000002 - type: nauc_mrr_at_10_diff1 value: 17.762700000000002 - type: nauc_mrr_at_20_max value: -30.44 - type: nauc_mrr_at_20_std value: -20.2867 - type: nauc_mrr_at_20_diff1 value: 18.895500000000002 - type: nauc_mrr_at_100_max value: -30.5404 - type: nauc_mrr_at_100_std value: -20.5699 - type: nauc_mrr_at_100_diff1 value: 18.7046 - type: nauc_mrr_at_1000_max value: -30.5404 - type: nauc_mrr_at_1000_std value: -20.5699 - type: nauc_mrr_at_1000_diff1 value: 18.7046 - type: main_score value: 29.767 task: type: Retrieval - dataset: config: default name: MTEB ToxicConversationsClassification (default) revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de split: test type: mteb/toxic_conversations_50k metrics: - type: accuracy value: 64.8096 - type: f1 value: 49.844300000000004 - type: f1_weighted value: 72.5251 - type: ap value: 11.7519 - type: ap_weighted value: 11.7519 - type: main_score value: 64.8096 task: type: Classification - dataset: config: default name: MTEB TweetSentimentExtractionClassification (default) revision: d604517c81ca91fe16a244d1248fc021f9ecee7a split: test type: mteb/tweet_sentiment_extraction metrics: - type: accuracy value: 58.1692 - type: f1 value: 58.4408 - type: f1_weighted value: 57.565599999999996 - type: main_score value: 58.1692 task: type: Classification - dataset: config: default name: MTEB TwentyNewsgroupsClustering (default) revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 split: test type: mteb/twentynewsgroups-clustering metrics: - type: v_measure value: 39.293 - type: v_measure_std value: 1.5684 - type: main_score value: 39.293 task: type: Clustering - dataset: config: default name: MTEB TwitterSemEval2015 (default) revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 split: test type: mteb/twittersemeval2015-pairclassification metrics: - type: similarity_accuracy value: 83.29260000000001 - type: similarity_accuracy_threshold value: 78.2732 - type: similarity_f1 value: 60.656600000000005 - type: similarity_f1_threshold value: 73.4961 - type: similarity_precision value: 59.007 - type: similarity_recall value: 62.4011 - type: similarity_ap value: 64.7501 - type: cosine_accuracy value: 83.29260000000001 - type: cosine_accuracy_threshold value: 78.2732 - type: cosine_f1 value: 60.656600000000005 - type: cosine_f1_threshold value: 73.4961 - type: cosine_precision value: 59.007 - type: cosine_recall value: 62.4011 - type: cosine_ap value: 64.7501 - type: manhattan_accuracy value: 83.2986 - type: manhattan_accuracy_threshold value: 1476.7148 - type: manhattan_f1 value: 60.7459 - type: manhattan_f1_threshold value: 1607.9180000000001 - type: manhattan_precision value: 59.0581 - type: manhattan_recall value: 62.53300000000001 - type: manhattan_ap value: 64.76859999999999 - type: euclidean_accuracy value: 83.29260000000001 - type: euclidean_accuracy_threshold value: 65.9194 - type: euclidean_f1 value: 60.656600000000005 - type: euclidean_f1_threshold value: 72.8065 - type: euclidean_precision value: 59.007 - type: euclidean_recall value: 62.4011 - type: euclidean_ap value: 64.7501 - type: dot_accuracy value: 83.29260000000001 - type: dot_accuracy_threshold value: 78.2731 - type: dot_f1 value: 60.656600000000005 - type: dot_f1_threshold value: 73.4961 - type: dot_precision value: 
59.007 - type: dot_recall value: 62.4011 - type: dot_ap value: 64.7501 - type: max_accuracy value: 83.2986 - type: max_f1 value: 60.7459 - type: max_precision value: 59.0581 - type: max_recall value: 62.53300000000001 - type: max_ap value: 64.76859999999999 - type: main_score value: 64.76859999999999 task: type: PairClassification - dataset: config: default name: MTEB TwitterURLCorpus (default) revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf split: test type: mteb/twitterurlcorpus-pairclassification metrics: - type: similarity_accuracy value: 89.0247 - type: similarity_accuracy_threshold value: 69.271 - type: similarity_f1 value: 78.24419999999999 - type: similarity_f1_threshold value: 66.2183 - type: similarity_precision value: 76.616 - type: similarity_recall value: 79.943 - type: similarity_ap value: 85.9494 - type: cosine_accuracy value: 89.0247 - type: cosine_accuracy_threshold value: 69.271 - type: cosine_f1 value: 78.24419999999999 - type: cosine_f1_threshold value: 66.2183 - type: cosine_precision value: 76.616 - type: cosine_recall value: 79.943 - type: cosine_ap value: 85.9494 - type: manhattan_accuracy value: 89.0267 - type: manhattan_accuracy_threshold value: 1750.3544000000002 - type: manhattan_f1 value: 78.2188 - type: manhattan_f1_threshold value: 1837.7304 - type: manhattan_precision value: 75.1472 - type: manhattan_recall value: 81.5522 - type: manhattan_ap value: 85.9496 - type: euclidean_accuracy value: 89.0247 - type: euclidean_accuracy_threshold value: 78.3951 - type: euclidean_f1 value: 78.24419999999999 - type: euclidean_f1_threshold value: 82.197 - type: euclidean_precision value: 76.616 - type: euclidean_recall value: 79.943 - type: euclidean_ap value: 85.9494 - type: dot_accuracy value: 89.0247 - type: dot_accuracy_threshold value: 69.271 - type: dot_f1 value: 78.24419999999999 - type: dot_f1_threshold value: 66.2183 - type: dot_precision value: 76.616 - type: dot_recall value: 79.943 - type: dot_ap value: 85.9494 - type: max_accuracy value: 89.0267 - type: max_f1 value: 78.24419999999999 - type: max_precision value: 76.616 - type: max_recall value: 81.5522 - type: max_ap value: 85.9496 - type: main_score value: 85.9496 task: type: PairClassification ---

# Snowflake's Arctic-embed-m-v2.0

News | Models | Usage | Evaluation | Contact | FAQ | License | Acknowledgement

## News

- 12/11/2024: Release of the Technical Report
- 12/04/2024: Release of snowflake-arctic-embed-l-v2.0 and snowflake-arctic-embed-m-v2.0, our newest models designed with multilingual workloads in mind.

## Models

Snowflake arctic-embed-m-v2.0 is the newest addition to the suite of embedding models Snowflake has released, optimized for retrieval performance and inference efficiency. Arctic Embed 2.0 introduces a new standard for multilingual embedding models, delivering high-quality multilingual text retrieval without sacrificing performance in English. Released under the permissive Apache 2.0 license, Arctic Embed 2.0 is ideal for applications that demand reliable, enterprise-grade multilingual search and retrieval at scale.

Key Features:

1. Multilingual without compromise: Excels in English and non-English retrieval, outperforming leading open-source and proprietary models on benchmarks like MTEB Retrieval, CLEF, and MIRACL.
2. Inference efficiency: With only 113M non-embedding parameters, inference is fast and efficient at any scale.
3. Compression-friendly: Achieves high-quality retrieval with embeddings as small as 128 bytes/vector using Matryoshka Representation Learning (MRL) and quantization-aware embedding training. **Please note that, like our v1.5 model, the MRL dimension for this model is 256, and high-quality 128-byte compression is achieved via 4-bit quantization (e.g., using a fast-scan FAISS index or the example code published alongside our 1.5 model).** A small truncation sketch appears after the usage notes below.
4. Long context support: arctic-embed-m-v2.0 builds on GTE-multilingual-base, which supports a context window of up to 8192 tokens via RoPE.

### Quality Benchmarks

Unlike most other open-source models, Arctic-embed-m-v2.0 excels at both English retrieval (via MTEB Retrieval) and multilingual retrieval (via MIRACL and CLEF). You no longer need to maintain separate models to empower high-quality English and multilingual retrieval. All numbers mentioned below are the average NDCG@10 across the datasets in each benchmark.

| Model Name | # params | # non-emb params | # dimensions | BEIR (15) | MIRACL (4) | CLEF (Focused) | CLEF (Full) |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| **snowflake-arctic-m-v2.0** | 305M | 113M | 768 | **55.4** | 55.2 | **51.7** | **53.9** |
| snowflake-arctic-m | 109M | 86M | 768 | 54.9 | 24.9 | 34.4 | 29.1 |
| me5 base | 560M | 303M | 1024 | 51.4 | 54.0 | 43.0 | 34.6 |
| bge-m3 (BAAI) | 568M | 303M | 1024 | 48.8 | **56.8** | 40.8 | 41.3 |
| gte (Alibaba) | 305M | 113M | 768 | 51.1 | 52.3 | 47.7 | 53.1 |

Aside from high-quality retrieval, Arctic delivers embeddings that are easily compressible. Vector truncation via MRL decreases vector size 3x with only about 3% degradation in quality, and combining MRL-truncated vectors with int4 vector compression powers retrieval at 128 bytes per document.

| Model | Dimensions | BEIR (15) | Relative Performance | MIRACL (4) | Relative Performance | CLEF (5) | Relative Performance | CLEF (Full) | Relative Performance |
|---|:---:|:---:|:---:|:---:|:---:|:---:|---|---|---|
| snowflake-arctic-m-v2.0 | 768 | 55.4 | N/A | 55.2 | N/A | 51.7 | N/A | 53.9 | N/A |
| snowflake-arctic-m-v2.0 | 256 | 54.4 | -1.81% | 54.0 | -2.17% | 50.6 | -2.13% | 52.3 | -3.06% |

## Usage

### Using Sentence Transformers

See the sketch after the Transformers subsection below.

### Using Huggingface Transformers

You can use the transformers package to use Snowflake's arctic-embed model, as shown below. For optimal retrieval quality, use the CLS token to embed each text portion and prepend the query prefix to queries only (see the hedged sketch below).
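A minimal sketch of both approaches, assuming the standard sentence-transformers and transformers APIs; the `query: ` prefix and the `trust_remote_code=True` flag are assumptions carried over from similar GTE-based releases, not values taken from this card, so verify them against the official model card before use.

```python
from sentence_transformers import SentenceTransformer

# Assumption: trust_remote_code is needed because the underlying
# GTE-multilingual architecture ships custom modeling code.
model = SentenceTransformer(
    "Snowflake/snowflake-arctic-embed-m-v2.0", trust_remote_code=True
)

QUERY_PREFIX = "query: "  # illustrative assumption, not taken from this card

queries = [QUERY_PREFIX + "what is snowflake?"]
documents = ["Snowflake is a cloud data platform.",
             "Tacos are a Mexican dish."]

q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(documents, normalize_embeddings=True)
print(q_emb @ d_emb.T)  # cosine similarities (vectors are unit-normalized)
```

The plain transformers variant makes the CLS-token pooling explicit, as recommended above:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Snowflake/snowflake-arctic-embed-m-v2.0")
model = AutoModel.from_pretrained(
    "Snowflake/snowflake-arctic-embed-m-v2.0", trust_remote_code=True
).eval()

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=8192, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    cls = out.last_hidden_state[:, 0]  # CLS-token pooling
    return torch.nn.functional.normalize(cls, p=2, dim=1)

scores = embed(["query: what is snowflake?"]) @ embed(
    ["Snowflake is a cloud data platform."]).T
print(scores)
```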
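The 128-bytes-per-vector figure quoted earlier is simple arithmetic: 256 MRL dimensions at 4 bits each is 1024 bits, i.e. 128 bytes. Below is a sketch of the truncation half of that recipe; the int4 quantization itself is typically delegated to a fast-scan FAISS index, as the card notes.

```python
import torch
import torch.nn.functional as F

def truncate_mrl(embeddings: torch.Tensor, dim: int = 256) -> torch.Tensor:
    """Keep the first `dim` MRL dimensions, then re-normalize.

    Matryoshka-trained models concentrate information in the leading
    dimensions, so truncation costs only a few points of retrieval quality.
    """
    return F.normalize(embeddings[:, :dim], p=2, dim=1)

full = F.normalize(torch.randn(4, 768), p=2, dim=1)  # stand-in for real embeddings
small = truncate_mrl(full)                           # 768 -> 256 dims (3x smaller)
# 256 dims * 4 bits each = 1024 bits = 128 bytes per vector after int4 quantization.
print(small.shape)
```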
### Using Huggingface Transformers.js

If you haven't already, you can install the Transformers.js JavaScript library from NPM and then use the model for retrieval from JavaScript.

## Contact

Feel free to open an issue or pull request if you have any questions or suggestions about this project. You can also email Daniel Campos (daniel.campos@snowflake.com).

## License

Arctic is licensed under Apache-2.0. The released models can be used for commercial purposes free of charge.", + "model_explanation_gemini": "Generates sentence embeddings for similarity tasks and feature extraction across multiple languages. \n\n**Features:** \n- Pipeline: Sentence-similarity \n- Capabilities: Sentence-transformers, feature extraction \n- Supported languages: 70+ (e.g., en, es, fr, zh) \n- License: Apache-2.0 \n- Benchmark performance: Evaluated on MTEB tasks (classification, retrieval) \n\n**Comparison (implied from benchmarks):** \nPerforms competitively in classification (" +} \ No newline at end of file diff --git a/model_data_json/THUDM_glm-4-voice-tokenizer.json b/model_data_json/THUDM_glm-4-voice-tokenizer.json new file mode 100644 index 0000000000000000000000000000000000000000..48f4502600fc3fe76600e429d76b834f668d267b --- /dev/null +++ b/model_data_json/THUDM_glm-4-voice-tokenizer.json @@ -0,0 +1,11 @@ +{ + "model_id": "THUDM/glm-4-voice-tokenizer", + "downloads": 78346, + "tags": [ + "safetensors", + "whisper", + "region:us" + ], + "description": "# GLM-4-Voice-Tokenizer GLM-4-Voice is an end-to-end voice model launched by Zhipu AI. GLM-4-Voice can directly understand and generate Chinese and English speech, engage in real-time voice conversations, and change attributes such as emotion, intonation, speech rate, and dialect based on user instructions. This repo provides the speech tokenizer of GLM-4-Voice, which is trained by adding vector quantization to the encoder part of Whisper; it converts continuous speech input into discrete tokens, with each second of audio converted into 12.5 discrete tokens. For more information please refer to our repo GLM-4-Voice.", + "model_explanation_gemini": "\"GLM-4-Voice-Tokenizer converts continuous speech into discrete tokens (12.5 tokens per second) using Whisper's encoder with added vector quantization, enabling GLM-4-Voice's end-to-end multilingual speech understanding, generation, and real-time voice modulation (emotion, tone, speed, dialect).\" \n\n**Features**: \n1. Speech-to-token conversion (12.5 tokens/sec) \n2. 
Based on Whisper’s encoder + vector quantization" +} \ No newline at end of file diff --git a/model_data_json/TaylorAI_gte-tiny.json b/model_data_json/TaylorAI_gte-tiny.json new file mode 100644 index 0000000000000000000000000000000000000000..410c0d0cbe0cd09af3e9d308a1ffe4b72f3651a9 --- /dev/null +++ b/model_data_json/TaylorAI_gte-tiny.json @@ -0,0 +1,22 @@ +{ + "model_id": "TaylorAI/gte-tiny", + "downloads": 74970, + "tags": [ + "sentence-transformers", + "pytorch", + "onnx", + "safetensors", + "bert", + "feature-extraction", + "sentence-similarity", + "transformers", + "mteb", + "model-index", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- model-index: - name: gte_tiny results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 71.76119402985076 - type: ap value: 34.63659287952359 - type: f1 value: 65.88939512571113 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 86.61324999999998 - type: ap value: 81.7476302802319 - type: f1 value: 86.5863470912001 - task: type: Classification dataset: type: mteb/amazon_reviews_multi name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 42.61000000000001 - type: f1 value: 42.2217180000715 - task: type: Retrieval dataset: type: arguana name: MTEB ArguAna config: default split: test revision: None metrics: - type: map_at_1 value: 28.377999999999997 - type: map_at_10 value: 44.565 - type: map_at_100 value: 45.48 - type: map_at_1000 value: 45.487 - type: map_at_3 value: 39.841 - type: map_at_5 value: 42.284 - type: mrr_at_1 value: 29.445 - type: mrr_at_10 value: 44.956 - type: mrr_at_100 value: 45.877 - type: mrr_at_1000 value: 45.884 - type: mrr_at_3 value: 40.209 - type: mrr_at_5 value: 42.719 - type: ndcg_at_1 value: 28.377999999999997 - type: ndcg_at_10 value: 53.638 - type: ndcg_at_100 value: 57.354000000000006 - type: ndcg_at_1000 value: 57.513000000000005 - type: ndcg_at_3 value: 43.701 - type: ndcg_at_5 value: 48.114000000000004 - type: precision_at_1 value: 28.377999999999997 - type: precision_at_10 value: 8.272 - type: precision_at_100 value: 0.984 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 18.303 - type: precision_at_5 value: 13.129 - type: recall_at_1 value: 28.377999999999997 - type: recall_at_10 value: 82.717 - type: recall_at_100 value: 98.43499999999999 - type: recall_at_1000 value: 99.644 - type: recall_at_3 value: 54.908 - type: recall_at_5 value: 65.647 - task: type: Clustering dataset: type: mteb/arxiv-clustering-p2p name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 46.637318326729876 - task: type: Clustering dataset: type: mteb/arxiv-clustering-s2s name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 36.01134479855804 - task: type: Reranking dataset: type: mteb/askubuntudupquestions-reranking name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map 
value: 59.82917555338909 - type: mrr value: 74.7888361254012 - task: type: STS dataset: type: mteb/biosses-sts name: MTEB BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 87.1657730995964 - type: cos_sim_spearman value: 86.62787748941281 - type: euclidean_pearson value: 85.48127914481798 - type: euclidean_spearman value: 86.48148861167424 - type: manhattan_pearson value: 85.07496934780823 - type: manhattan_spearman value: 86.39473964708843 - task: type: Classification dataset: type: mteb/banking77 name: MTEB Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 81.73051948051948 - type: f1 value: 81.66368364988331 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-p2p name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: - type: v_measure value: 39.18623707448217 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-s2s name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 32.12697757150375 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 29.160000000000004 - type: map_at_10 value: 40.474 - type: map_at_100 value: 41.905 - type: map_at_1000 value: 42.041000000000004 - type: map_at_3 value: 37.147000000000006 - type: map_at_5 value: 38.873999999999995 - type: mrr_at_1 value: 36.91 - type: mrr_at_10 value: 46.495999999999995 - type: mrr_at_100 value: 47.288000000000004 - type: mrr_at_1000 value: 47.339999999999996 - type: mrr_at_3 value: 43.777 - type: mrr_at_5 value: 45.257999999999996 - type: ndcg_at_1 value: 36.91 - type: ndcg_at_10 value: 46.722 - type: ndcg_at_100 value: 51.969 - type: ndcg_at_1000 value: 54.232 - type: ndcg_at_3 value: 41.783 - type: ndcg_at_5 value: 43.797000000000004 - type: precision_at_1 value: 36.91 - type: precision_at_10 value: 9.013 - type: precision_at_100 value: 1.455 - type: precision_at_1000 value: 0.193 - type: precision_at_3 value: 20.124 - type: precision_at_5 value: 14.363000000000001 - type: recall_at_1 value: 29.160000000000004 - type: recall_at_10 value: 58.521 - type: recall_at_100 value: 80.323 - type: recall_at_1000 value: 95.13000000000001 - type: recall_at_3 value: 44.205 - type: recall_at_5 value: 49.97 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 27.750000000000004 - type: map_at_10 value: 36.39 - type: map_at_100 value: 37.5 - type: map_at_1000 value: 37.625 - type: map_at_3 value: 33.853 - type: map_at_5 value: 35.397 - type: mrr_at_1 value: 34.14 - type: mrr_at_10 value: 41.841 - type: mrr_at_100 value: 42.469 - type: mrr_at_1000 value: 42.521 - type: mrr_at_3 value: 39.724 - type: mrr_at_5 value: 40.955999999999996 - type: ndcg_at_1 value: 34.14 - type: ndcg_at_10 value: 41.409 - type: ndcg_at_100 value: 45.668 - type: ndcg_at_1000 value: 47.916 - type: ndcg_at_3 value: 37.836 - type: ndcg_at_5 value: 39.650999999999996 - type: precision_at_1 value: 34.14 - type: precision_at_10 value: 7.739 - type: precision_at_100 value: 1.2630000000000001 - type: precision_at_1000 value: 0.173 - type: precision_at_3 value: 18.217 - type: precision_at_5 value: 12.854 - type: 
recall_at_1 value: 27.750000000000004 - type: recall_at_10 value: 49.882 - type: recall_at_100 value: 68.556 - type: recall_at_1000 value: 83.186 - type: recall_at_3 value: 39.047 - type: recall_at_5 value: 44.458 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 36.879 - type: map_at_10 value: 48.878 - type: map_at_100 value: 49.918 - type: map_at_1000 value: 49.978 - type: map_at_3 value: 45.867999999999995 - type: map_at_5 value: 47.637 - type: mrr_at_1 value: 42.696 - type: mrr_at_10 value: 52.342 - type: mrr_at_100 value: 53.044000000000004 - type: mrr_at_1000 value: 53.077 - type: mrr_at_3 value: 50.01 - type: mrr_at_5 value: 51.437 - type: ndcg_at_1 value: 42.696 - type: ndcg_at_10 value: 54.469 - type: ndcg_at_100 value: 58.664 - type: ndcg_at_1000 value: 59.951 - type: ndcg_at_3 value: 49.419999999999995 - type: ndcg_at_5 value: 52.007000000000005 - type: precision_at_1 value: 42.696 - type: precision_at_10 value: 8.734 - type: precision_at_100 value: 1.1769999999999998 - type: precision_at_1000 value: 0.133 - type: precision_at_3 value: 22.027 - type: precision_at_5 value: 15.135000000000002 - type: recall_at_1 value: 36.879 - type: recall_at_10 value: 67.669 - type: recall_at_100 value: 85.822 - type: recall_at_1000 value: 95.092 - type: recall_at_3 value: 54.157999999999994 - type: recall_at_5 value: 60.436 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 22.942 - type: map_at_10 value: 31.741999999999997 - type: map_at_100 value: 32.721000000000004 - type: map_at_1000 value: 32.809 - type: map_at_3 value: 29.17 - type: map_at_5 value: 30.714000000000002 - type: mrr_at_1 value: 24.746000000000002 - type: mrr_at_10 value: 33.517 - type: mrr_at_100 value: 34.451 - type: mrr_at_1000 value: 34.522000000000006 - type: mrr_at_3 value: 31.148999999999997 - type: mrr_at_5 value: 32.606 - type: ndcg_at_1 value: 24.746000000000002 - type: ndcg_at_10 value: 36.553000000000004 - type: ndcg_at_100 value: 41.53 - type: ndcg_at_1000 value: 43.811 - type: ndcg_at_3 value: 31.674000000000003 - type: ndcg_at_5 value: 34.241 - type: precision_at_1 value: 24.746000000000002 - type: precision_at_10 value: 5.684 - type: precision_at_100 value: 0.859 - type: precision_at_1000 value: 0.109 - type: precision_at_3 value: 13.597000000000001 - type: precision_at_5 value: 9.672 - type: recall_at_1 value: 22.942 - type: recall_at_10 value: 49.58 - type: recall_at_100 value: 72.614 - type: recall_at_1000 value: 89.89200000000001 - type: recall_at_3 value: 36.552 - type: recall_at_5 value: 42.702 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 15.345 - type: map_at_10 value: 22.428 - type: map_at_100 value: 23.756 - type: map_at_1000 value: 23.872 - type: map_at_3 value: 20.212 - type: map_at_5 value: 21.291 - type: mrr_at_1 value: 19.279 - type: mrr_at_10 value: 27.1 - type: mrr_at_100 value: 28.211000000000002 - type: mrr_at_1000 value: 28.279 - type: mrr_at_3 value: 24.813 - type: mrr_at_5 value: 25.889 - type: ndcg_at_1 value: 19.279 - type: ndcg_at_10 value: 27.36 - type: ndcg_at_100 value: 33.499 - type: ndcg_at_1000 value: 36.452 - type: ndcg_at_3 value: 23.233999999999998 - type: ndcg_at_5 value: 24.806 - type: precision_at_1 value: 19.279 - 
type: precision_at_10 value: 5.149 - type: precision_at_100 value: 0.938 - type: precision_at_1000 value: 0.133 - type: precision_at_3 value: 11.360000000000001 - type: precision_at_5 value: 8.035 - type: recall_at_1 value: 15.345 - type: recall_at_10 value: 37.974999999999994 - type: recall_at_100 value: 64.472 - type: recall_at_1000 value: 85.97200000000001 - type: recall_at_3 value: 26.203 - type: recall_at_5 value: 30.485 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.362000000000002 - type: map_at_10 value: 36.406 - type: map_at_100 value: 37.726 - type: map_at_1000 value: 37.84 - type: map_at_3 value: 33.425 - type: map_at_5 value: 35.043 - type: mrr_at_1 value: 32.146 - type: mrr_at_10 value: 41.674 - type: mrr_at_100 value: 42.478 - type: mrr_at_1000 value: 42.524 - type: mrr_at_3 value: 38.948 - type: mrr_at_5 value: 40.415 - type: ndcg_at_1 value: 32.146 - type: ndcg_at_10 value: 42.374 - type: ndcg_at_100 value: 47.919 - type: ndcg_at_1000 value: 50.013 - type: ndcg_at_3 value: 37.29 - type: ndcg_at_5 value: 39.531 - type: precision_at_1 value: 32.146 - type: precision_at_10 value: 7.767 - type: precision_at_100 value: 1.236 - type: precision_at_1000 value: 0.16 - type: precision_at_3 value: 17.965999999999998 - type: precision_at_5 value: 12.742999999999999 - type: recall_at_1 value: 26.362000000000002 - type: recall_at_10 value: 54.98800000000001 - type: recall_at_100 value: 78.50200000000001 - type: recall_at_1000 value: 92.146 - type: recall_at_3 value: 40.486 - type: recall_at_5 value: 46.236 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.417 - type: map_at_10 value: 33.161 - type: map_at_100 value: 34.357 - type: map_at_1000 value: 34.473 - type: map_at_3 value: 30.245 - type: map_at_5 value: 31.541999999999998 - type: mrr_at_1 value: 29.909000000000002 - type: mrr_at_10 value: 38.211 - type: mrr_at_100 value: 39.056999999999995 - type: mrr_at_1000 value: 39.114 - type: mrr_at_3 value: 35.769 - type: mrr_at_5 value: 36.922 - type: ndcg_at_1 value: 29.909000000000002 - type: ndcg_at_10 value: 38.694 - type: ndcg_at_100 value: 44.057 - type: ndcg_at_1000 value: 46.6 - type: ndcg_at_3 value: 33.822 - type: ndcg_at_5 value: 35.454 - type: precision_at_1 value: 29.909000000000002 - type: precision_at_10 value: 7.180000000000001 - type: precision_at_100 value: 1.153 - type: precision_at_1000 value: 0.155 - type: precision_at_3 value: 16.134 - type: precision_at_5 value: 11.256 - type: recall_at_1 value: 24.417 - type: recall_at_10 value: 50.260000000000005 - type: recall_at_100 value: 73.55699999999999 - type: recall_at_1000 value: 91.216 - type: recall_at_3 value: 35.971 - type: recall_at_5 value: 40.793 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.266916666666663 - type: map_at_10 value: 32.75025 - type: map_at_100 value: 33.91341666666667 - type: map_at_1000 value: 34.031749999999995 - type: map_at_3 value: 30.166416666666674 - type: map_at_5 value: 31.577000000000005 - type: mrr_at_1 value: 28.828166666666664 - type: mrr_at_10 value: 36.80991666666667 - type: mrr_at_100 value: 37.67075 - type: mrr_at_1000 value: 37.733 - type: mrr_at_3 value: 34.513416666666664 - type: mrr_at_5 value: 35.788 - type: 
ndcg_at_1 value: 28.828166666666664 - type: ndcg_at_10 value: 37.796 - type: ndcg_at_100 value: 42.94783333333333 - type: ndcg_at_1000 value: 45.38908333333333 - type: ndcg_at_3 value: 33.374750000000006 - type: ndcg_at_5 value: 35.379666666666665 - type: precision_at_1 value: 28.828166666666664 - type: precision_at_10 value: 6.615749999999999 - type: precision_at_100 value: 1.0848333333333333 - type: precision_at_1000 value: 0.1484166666666667 - type: precision_at_3 value: 15.347833333333332 - type: precision_at_5 value: 10.848916666666666 - type: recall_at_1 value: 24.266916666666663 - type: recall_at_10 value: 48.73458333333333 - type: recall_at_100 value: 71.56341666666667 - type: recall_at_1000 value: 88.63091666666668 - type: recall_at_3 value: 36.31208333333333 - type: recall_at_5 value: 41.55633333333333 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 23.497 - type: map_at_10 value: 30.249 - type: map_at_100 value: 30.947000000000003 - type: map_at_1000 value: 31.049 - type: map_at_3 value: 28.188000000000002 - type: map_at_5 value: 29.332 - type: mrr_at_1 value: 26.687 - type: mrr_at_10 value: 33.182 - type: mrr_at_100 value: 33.794999999999995 - type: mrr_at_1000 value: 33.873 - type: mrr_at_3 value: 31.263 - type: mrr_at_5 value: 32.428000000000004 - type: ndcg_at_1 value: 26.687 - type: ndcg_at_10 value: 34.252 - type: ndcg_at_100 value: 38.083 - type: ndcg_at_1000 value: 40.682 - type: ndcg_at_3 value: 30.464999999999996 - type: ndcg_at_5 value: 32.282 - type: precision_at_1 value: 26.687 - type: precision_at_10 value: 5.2909999999999995 - type: precision_at_100 value: 0.788 - type: precision_at_1000 value: 0.109 - type: precision_at_3 value: 13.037 - type: precision_at_5 value: 9.049 - type: recall_at_1 value: 23.497 - type: recall_at_10 value: 43.813 - type: recall_at_100 value: 61.88399999999999 - type: recall_at_1000 value: 80.926 - type: recall_at_3 value: 33.332 - type: recall_at_5 value: 37.862 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 16.073 - type: map_at_10 value: 22.705000000000002 - type: map_at_100 value: 23.703 - type: map_at_1000 value: 23.833 - type: map_at_3 value: 20.593 - type: map_at_5 value: 21.7 - type: mrr_at_1 value: 19.683 - type: mrr_at_10 value: 26.39 - type: mrr_at_100 value: 27.264 - type: mrr_at_1000 value: 27.349 - type: mrr_at_3 value: 24.409 - type: mrr_at_5 value: 25.474000000000004 - type: ndcg_at_1 value: 19.683 - type: ndcg_at_10 value: 27.014 - type: ndcg_at_100 value: 31.948 - type: ndcg_at_1000 value: 35.125 - type: ndcg_at_3 value: 23.225 - type: ndcg_at_5 value: 24.866 - type: precision_at_1 value: 19.683 - type: precision_at_10 value: 4.948 - type: precision_at_100 value: 0.876 - type: precision_at_1000 value: 0.133 - type: precision_at_3 value: 10.943 - type: precision_at_5 value: 7.86 - type: recall_at_1 value: 16.073 - type: recall_at_10 value: 36.283 - type: recall_at_100 value: 58.745999999999995 - type: recall_at_1000 value: 81.711 - type: recall_at_3 value: 25.637 - type: recall_at_5 value: 29.919 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 25.776 - type: map_at_10 value: 33.317 - type: map_at_100 value: 34.437 - type: map_at_1000 value: 34.54 - type: map_at_3 
value: 30.706 - type: map_at_5 value: 32.202999999999996 - type: mrr_at_1 value: 30.224 - type: mrr_at_10 value: 37.34 - type: mrr_at_100 value: 38.268 - type: mrr_at_1000 value: 38.335 - type: mrr_at_3 value: 35.075 - type: mrr_at_5 value: 36.348 - type: ndcg_at_1 value: 30.224 - type: ndcg_at_10 value: 38.083 - type: ndcg_at_100 value: 43.413000000000004 - type: ndcg_at_1000 value: 45.856 - type: ndcg_at_3 value: 33.437 - type: ndcg_at_5 value: 35.661 - type: precision_at_1 value: 30.224 - type: precision_at_10 value: 6.1850000000000005 - type: precision_at_100 value: 1.0030000000000001 - type: precision_at_1000 value: 0.132 - type: precision_at_3 value: 14.646 - type: precision_at_5 value: 10.428999999999998 - type: recall_at_1 value: 25.776 - type: recall_at_10 value: 48.787000000000006 - type: recall_at_100 value: 72.04899999999999 - type: recall_at_1000 value: 89.339 - type: recall_at_3 value: 36.192 - type: recall_at_5 value: 41.665 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 23.156 - type: map_at_10 value: 30.886000000000003 - type: map_at_100 value: 32.551 - type: map_at_1000 value: 32.769 - type: map_at_3 value: 28.584 - type: map_at_5 value: 29.959999999999997 - type: mrr_at_1 value: 28.260999999999996 - type: mrr_at_10 value: 35.555 - type: mrr_at_100 value: 36.687 - type: mrr_at_1000 value: 36.742999999999995 - type: mrr_at_3 value: 33.531 - type: mrr_at_5 value: 34.717 - type: ndcg_at_1 value: 28.260999999999996 - type: ndcg_at_10 value: 36.036 - type: ndcg_at_100 value: 42.675000000000004 - type: ndcg_at_1000 value: 45.303 - type: ndcg_at_3 value: 32.449 - type: ndcg_at_5 value: 34.293 - type: precision_at_1 value: 28.260999999999996 - type: precision_at_10 value: 6.837999999999999 - type: precision_at_100 value: 1.4569999999999999 - type: precision_at_1000 value: 0.23500000000000001 - type: precision_at_3 value: 15.217 - type: precision_at_5 value: 11.028 - type: recall_at_1 value: 23.156 - type: recall_at_10 value: 45.251999999999995 - type: recall_at_100 value: 75.339 - type: recall_at_1000 value: 91.56 - type: recall_at_3 value: 34.701 - type: recall_at_5 value: 39.922999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 19.846 - type: map_at_10 value: 26.367 - type: map_at_100 value: 27.439999999999998 - type: map_at_1000 value: 27.552 - type: map_at_3 value: 24.006 - type: map_at_5 value: 25.230999999999998 - type: mrr_at_1 value: 21.257 - type: mrr_at_10 value: 28.071 - type: mrr_at_100 value: 29.037000000000003 - type: mrr_at_1000 value: 29.119 - type: mrr_at_3 value: 25.692999999999998 - type: mrr_at_5 value: 27.006000000000004 - type: ndcg_at_1 value: 21.257 - type: ndcg_at_10 value: 30.586000000000002 - type: ndcg_at_100 value: 35.949 - type: ndcg_at_1000 value: 38.728 - type: ndcg_at_3 value: 25.862000000000002 - type: ndcg_at_5 value: 27.967 - type: precision_at_1 value: 21.257 - type: precision_at_10 value: 4.861 - type: precision_at_100 value: 0.8130000000000001 - type: precision_at_1000 value: 0.116 - type: precision_at_3 value: 10.906 - type: precision_at_5 value: 7.763000000000001 - type: recall_at_1 value: 19.846 - type: recall_at_10 value: 41.805 - type: recall_at_100 value: 66.89699999999999 - type: recall_at_1000 value: 87.401 - type: recall_at_3 value: 29.261 - type: recall_at_5 value: 
34.227000000000004 - task: type: Retrieval dataset: type: climate-fever name: MTEB ClimateFEVER config: default split: test revision: None metrics: - type: map_at_1 value: 10.333 - type: map_at_10 value: 17.14 - type: map_at_100 value: 18.878 - type: map_at_1000 value: 19.067 - type: map_at_3 value: 14.123 - type: map_at_5 value: 15.699 - type: mrr_at_1 value: 23.192 - type: mrr_at_10 value: 33.553 - type: mrr_at_100 value: 34.553 - type: mrr_at_1000 value: 34.603 - type: mrr_at_3 value: 29.848000000000003 - type: mrr_at_5 value: 32.18 - type: ndcg_at_1 value: 23.192 - type: ndcg_at_10 value: 24.707 - type: ndcg_at_100 value: 31.701 - type: ndcg_at_1000 value: 35.260999999999996 - type: ndcg_at_3 value: 19.492 - type: ndcg_at_5 value: 21.543 - type: precision_at_1 value: 23.192 - type: precision_at_10 value: 7.824000000000001 - type: precision_at_100 value: 1.52 - type: precision_at_1000 value: 0.218 - type: precision_at_3 value: 14.180000000000001 - type: precision_at_5 value: 11.530999999999999 - type: recall_at_1 value: 10.333 - type: recall_at_10 value: 30.142999999999997 - type: recall_at_100 value: 54.298 - type: recall_at_1000 value: 74.337 - type: recall_at_3 value: 17.602999999999998 - type: recall_at_5 value: 22.938 - task: type: Retrieval dataset: type: dbpedia-entity name: MTEB DBPedia config: default split: test revision: None metrics: - type: map_at_1 value: 8.03 - type: map_at_10 value: 17.345 - type: map_at_100 value: 23.462 - type: map_at_1000 value: 24.77 - type: map_at_3 value: 12.714 - type: map_at_5 value: 14.722 - type: mrr_at_1 value: 61.0 - type: mrr_at_10 value: 69.245 - type: mrr_at_100 value: 69.715 - type: mrr_at_1000 value: 69.719 - type: mrr_at_3 value: 67.583 - type: mrr_at_5 value: 68.521 - type: ndcg_at_1 value: 47.625 - type: ndcg_at_10 value: 35.973 - type: ndcg_at_100 value: 39.875 - type: ndcg_at_1000 value: 46.922000000000004 - type: ndcg_at_3 value: 40.574 - type: ndcg_at_5 value: 38.18 - type: precision_at_1 value: 61.0 - type: precision_at_10 value: 29.049999999999997 - type: precision_at_100 value: 8.828 - type: precision_at_1000 value: 1.8290000000000002 - type: precision_at_3 value: 45.333 - type: precision_at_5 value: 37.9 - type: recall_at_1 value: 8.03 - type: recall_at_10 value: 22.334 - type: recall_at_100 value: 45.919 - type: recall_at_1000 value: 68.822 - type: recall_at_3 value: 14.038999999999998 - type: recall_at_5 value: 17.118 - task: type: Classification dataset: type: mteb/emotion name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 44.714999999999996 - type: f1 value: 39.83929362259356 - task: type: Retrieval dataset: type: fever name: MTEB FEVER config: default split: test revision: None metrics: - type: map_at_1 value: 52.242999999999995 - type: map_at_10 value: 64.087 - type: map_at_100 value: 64.549 - type: map_at_1000 value: 64.567 - type: map_at_3 value: 61.667 - type: map_at_5 value: 63.266 - type: mrr_at_1 value: 56.271 - type: mrr_at_10 value: 68.146 - type: mrr_at_100 value: 68.524 - type: mrr_at_1000 value: 68.53200000000001 - type: mrr_at_3 value: 65.869 - type: mrr_at_5 value: 67.37100000000001 - type: ndcg_at_1 value: 56.271 - type: ndcg_at_10 value: 70.109 - type: ndcg_at_100 value: 72.09 - type: ndcg_at_1000 value: 72.479 - type: ndcg_at_3 value: 65.559 - type: ndcg_at_5 value: 68.242 - type: precision_at_1 value: 56.271 - type: precision_at_10 value: 9.286999999999999 - type: precision_at_100 value: 1.039 - type: 
precision_at_1000 value: 0.109 - type: precision_at_3 value: 26.308 - type: precision_at_5 value: 17.291 - type: recall_at_1 value: 52.242999999999995 - type: recall_at_10 value: 84.71 - type: recall_at_100 value: 93.309 - type: recall_at_1000 value: 96.013 - type: recall_at_3 value: 72.554 - type: recall_at_5 value: 79.069 - task: type: Retrieval dataset: type: fiqa name: MTEB FiQA2018 config: default split: test revision: None metrics: - type: map_at_1 value: 14.346 - type: map_at_10 value: 24.552 - type: map_at_100 value: 26.161 - type: map_at_1000 value: 26.345000000000002 - type: map_at_3 value: 21.208 - type: map_at_5 value: 22.959 - type: mrr_at_1 value: 29.166999999999998 - type: mrr_at_10 value: 38.182 - type: mrr_at_100 value: 39.22 - type: mrr_at_1000 value: 39.263 - type: mrr_at_3 value: 35.983 - type: mrr_at_5 value: 37.14 - type: ndcg_at_1 value: 29.166999999999998 - type: ndcg_at_10 value: 31.421 - type: ndcg_at_100 value: 38.129999999999995 - type: ndcg_at_1000 value: 41.569 - type: ndcg_at_3 value: 28.172000000000004 - type: ndcg_at_5 value: 29.029 - type: precision_at_1 value: 29.166999999999998 - type: precision_at_10 value: 8.997 - type: precision_at_100 value: 1.5709999999999997 - type: precision_at_1000 value: 0.22 - type: precision_at_3 value: 19.187 - type: precision_at_5 value: 13.980999999999998 - type: recall_at_1 value: 14.346 - type: recall_at_10 value: 37.963 - type: recall_at_100 value: 63.43299999999999 - type: recall_at_1000 value: 84.057 - type: recall_at_3 value: 26.119999999999997 - type: recall_at_5 value: 30.988 - task: type: Retrieval dataset: type: hotpotqa name: MTEB HotpotQA config: default split: test revision: None metrics: - type: map_at_1 value: 33.059 - type: map_at_10 value: 46.421 - type: map_at_100 value: 47.323 - type: map_at_1000 value: 47.403 - type: map_at_3 value: 43.553999999999995 - type: map_at_5 value: 45.283 - type: mrr_at_1 value: 66.117 - type: mrr_at_10 value: 73.10900000000001 - type: mrr_at_100 value: 73.444 - type: mrr_at_1000 value: 73.46000000000001 - type: mrr_at_3 value: 71.70400000000001 - type: mrr_at_5 value: 72.58099999999999 - type: ndcg_at_1 value: 66.117 - type: ndcg_at_10 value: 55.696999999999996 - type: ndcg_at_100 value: 59.167 - type: ndcg_at_1000 value: 60.809000000000005 - type: ndcg_at_3 value: 51.243 - type: ndcg_at_5 value: 53.627 - type: precision_at_1 value: 66.117 - type: precision_at_10 value: 11.538 - type: precision_at_100 value: 1.429 - type: precision_at_1000 value: 0.165 - type: precision_at_3 value: 31.861 - type: precision_at_5 value: 20.997 - type: recall_at_1 value: 33.059 - type: recall_at_10 value: 57.691 - type: recall_at_100 value: 71.458 - type: recall_at_1000 value: 82.35 - type: recall_at_3 value: 47.792 - type: recall_at_5 value: 52.492000000000004 - task: type: Classification dataset: type: mteb/imdb name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 80.544 - type: ap value: 74.69592367984956 - type: f1 value: 80.51138138449883 - task: type: Retrieval dataset: type: msmarco name: MTEB MSMARCO config: default split: dev revision: None metrics: - type: map_at_1 value: 17.095 - type: map_at_10 value: 28.038999999999998 - type: map_at_100 value: 29.246 - type: map_at_1000 value: 29.311 - type: map_at_3 value: 24.253 - type: map_at_5 value: 26.442 - type: mrr_at_1 value: 17.535999999999998 - type: mrr_at_10 value: 28.53 - type: mrr_at_100 value: 29.697000000000003 - type: mrr_at_1000 value: 29.755 
- type: mrr_at_3 value: 24.779999999999998 - type: mrr_at_5 value: 26.942 - type: ndcg_at_1 value: 17.549999999999997 - type: ndcg_at_10 value: 34.514 - type: ndcg_at_100 value: 40.497 - type: ndcg_at_1000 value: 42.17 - type: ndcg_at_3 value: 26.764 - type: ndcg_at_5 value: 30.678 - type: precision_at_1 value: 17.549999999999997 - type: precision_at_10 value: 5.692 - type: precision_at_100 value: 0.8699999999999999 - type: precision_at_1000 value: 0.101 - type: precision_at_3 value: 11.562 - type: precision_at_5 value: 8.917 - type: recall_at_1 value: 17.095 - type: recall_at_10 value: 54.642 - type: recall_at_100 value: 82.652 - type: recall_at_1000 value: 95.555 - type: recall_at_3 value: 33.504 - type: recall_at_5 value: 42.925000000000004 - task: type: Classification dataset: type: mteb/mtop_domain name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 91.75558595531236 - type: f1 value: 91.25979279648296 - task: type: Classification dataset: type: mteb/mtop_intent name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 69.90424076607387 - type: f1 value: 52.067408707562244 - task: type: Classification dataset: type: mteb/amazon_massive_intent name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 70.13449899125757 - type: f1 value: 67.62456762910598 - task: type: Classification dataset: type: mteb/amazon_massive_scenario name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 74.862138533961 - type: f1 value: 74.66457222091381 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-p2p name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 34.10761942610792 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-s2s name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 31.673172170578408 - task: type: Reranking dataset: type: mteb/mind_small name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 32.058704977250315 - type: mrr value: 33.24327760839221 - task: type: Retrieval dataset: type: nfcorpus name: MTEB NFCorpus config: default split: test revision: None metrics: - type: map_at_1 value: 5.163 - type: map_at_10 value: 11.652999999999999 - type: map_at_100 value: 14.849 - type: map_at_1000 value: 16.253999999999998 - type: map_at_3 value: 8.616999999999999 - type: map_at_5 value: 10.100000000000001 - type: mrr_at_1 value: 44.272 - type: mrr_at_10 value: 52.25 - type: mrr_at_100 value: 52.761 - type: mrr_at_1000 value: 52.811 - type: mrr_at_3 value: 50.31 - type: mrr_at_5 value: 51.347 - type: ndcg_at_1 value: 42.105 - type: ndcg_at_10 value: 32.044 - type: ndcg_at_100 value: 29.763 - type: ndcg_at_1000 value: 38.585 - type: ndcg_at_3 value: 36.868 - type: ndcg_at_5 value: 35.154999999999994 - type: precision_at_1 value: 43.653 - type: precision_at_10 value: 23.622 - type: precision_at_100 value: 7.7490000000000006 - type: precision_at_1000 value: 2.054 - type: precision_at_3 value: 34.262 - type: precision_at_5 value: 30.154999999999998 
- type: recall_at_1 value: 5.163 - type: recall_at_10 value: 15.478 - type: recall_at_100 value: 30.424 - type: recall_at_1000 value: 62.67 - type: recall_at_3 value: 9.615 - type: recall_at_5 value: 12.369 - task: type: Retrieval dataset: type: nq name: MTEB NQ config: default split: test revision: None metrics: - type: map_at_1 value: 21.618000000000002 - type: map_at_10 value: 35.465 - type: map_at_100 value: 36.712 - type: map_at_1000 value: 36.757 - type: map_at_3 value: 31.189 - type: map_at_5 value: 33.537 - type: mrr_at_1 value: 24.305 - type: mrr_at_10 value: 37.653 - type: mrr_at_100 value: 38.662 - type: mrr_at_1000 value: 38.694 - type: mrr_at_3 value: 33.889 - type: mrr_at_5 value: 35.979 - type: ndcg_at_1 value: 24.305 - type: ndcg_at_10 value: 43.028 - type: ndcg_at_100 value: 48.653999999999996 - type: ndcg_at_1000 value: 49.733 - type: ndcg_at_3 value: 34.768 - type: ndcg_at_5 value: 38.753 - type: precision_at_1 value: 24.305 - type: precision_at_10 value: 7.59 - type: precision_at_100 value: 1.076 - type: precision_at_1000 value: 0.11800000000000001 - type: precision_at_3 value: 16.271 - type: precision_at_5 value: 12.068 - type: recall_at_1 value: 21.618000000000002 - type: recall_at_10 value: 63.977 - type: recall_at_100 value: 89.03999999999999 - type: recall_at_1000 value: 97.10600000000001 - type: recall_at_3 value: 42.422 - type: recall_at_5 value: 51.629000000000005 - task: type: Retrieval dataset: type: quora name: MTEB QuoraRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 69.405 - type: map_at_10 value: 83.05 - type: map_at_100 value: 83.684 - type: map_at_1000 value: 83.70400000000001 - type: map_at_3 value: 80.08800000000001 - type: map_at_5 value: 81.937 - type: mrr_at_1 value: 79.85 - type: mrr_at_10 value: 86.369 - type: mrr_at_100 value: 86.48599999999999 - type: mrr_at_1000 value: 86.48700000000001 - type: mrr_at_3 value: 85.315 - type: mrr_at_5 value: 86.044 - type: ndcg_at_1 value: 79.86999999999999 - type: ndcg_at_10 value: 87.04499999999999 - type: ndcg_at_100 value: 88.373 - type: ndcg_at_1000 value: 88.531 - type: ndcg_at_3 value: 84.04 - type: ndcg_at_5 value: 85.684 - type: precision_at_1 value: 79.86999999999999 - type: precision_at_10 value: 13.183 - type: precision_at_100 value: 1.51 - type: precision_at_1000 value: 0.156 - type: precision_at_3 value: 36.67 - type: precision_at_5 value: 24.12 - type: recall_at_1 value: 69.405 - type: recall_at_10 value: 94.634 - type: recall_at_100 value: 99.214 - type: recall_at_1000 value: 99.958 - type: recall_at_3 value: 85.992 - type: recall_at_5 value: 90.656 - task: type: Clustering dataset: type: mteb/reddit-clustering name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 50.191676323145465 - task: type: Clustering dataset: type: mteb/reddit-clustering-p2p name: MTEB RedditClusteringP2P config: default split: test revision: 282350215ef01743dc01b456c7f5241fa8937f16 metrics: - type: v_measure value: 56.4874020363744 - task: type: Retrieval dataset: type: scidocs name: MTEB SCIDOCS config: default split: test revision: None metrics: - type: map_at_1 value: 4.228 - type: map_at_10 value: 11.245 - type: map_at_100 value: 13.353000000000002 - type: map_at_1000 value: 13.665 - type: map_at_3 value: 7.779999999999999 - type: map_at_5 value: 9.405 - type: mrr_at_1 value: 20.9 - type: mrr_at_10 value: 31.657999999999998 - type: mrr_at_100 value: 32.769999999999996 - type: mrr_at_1000 value: 
32.833 - type: mrr_at_3 value: 28.333000000000002 - type: mrr_at_5 value: 30.043 - type: ndcg_at_1 value: 20.9 - type: ndcg_at_10 value: 19.073 - type: ndcg_at_100 value: 27.055 - type: ndcg_at_1000 value: 32.641 - type: ndcg_at_3 value: 17.483999999999998 - type: ndcg_at_5 value: 15.42 - type: precision_at_1 value: 20.9 - type: precision_at_10 value: 10.17 - type: precision_at_100 value: 2.162 - type: precision_at_1000 value: 0.35100000000000003 - type: precision_at_3 value: 16.467000000000002 - type: precision_at_5 value: 13.68 - type: recall_at_1 value: 4.228 - type: recall_at_10 value: 20.573 - type: recall_at_100 value: 43.887 - type: recall_at_1000 value: 71.22 - type: recall_at_3 value: 10.023 - type: recall_at_5 value: 13.873 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 82.77965135067481 - type: cos_sim_spearman value: 75.85121335808076 - type: euclidean_pearson value: 80.09115175262697 - type: euclidean_spearman value: 75.72249155647123 - type: manhattan_pearson value: 79.89723577351782 - type: manhattan_spearman value: 75.49855259442387 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 80.46084116030949 - type: cos_sim_spearman value: 72.57579204392951 - type: euclidean_pearson value: 76.39020830763684 - type: euclidean_spearman value: 72.3718627025895 - type: manhattan_pearson value: 76.6148833027359 - type: manhattan_spearman value: 72.57570008442319 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 80.43678068337017 - type: cos_sim_spearman value: 82.38941154076062 - type: euclidean_pearson value: 81.59260573633661 - type: euclidean_spearman value: 82.31144262574114 - type: manhattan_pearson value: 81.43266909137056 - type: manhattan_spearman value: 82.14704293004861 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 80.73713431763163 - type: cos_sim_spearman value: 77.97860512809388 - type: euclidean_pearson value: 80.35755041527027 - type: euclidean_spearman value: 78.021703511412 - type: manhattan_pearson value: 80.24440317109162 - type: manhattan_spearman value: 77.93165415697575 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 85.15111852351204 - type: cos_sim_spearman value: 86.54032447238258 - type: euclidean_pearson value: 86.14157021537433 - type: euclidean_spearman value: 86.67537291929713 - type: manhattan_pearson value: 86.081041854808 - type: manhattan_spearman value: 86.61561701560558 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 81.34532445104026 - type: cos_sim_spearman value: 83.31325001474116 - type: euclidean_pearson value: 82.81892375201032 - type: euclidean_spearman value: 83.4521695148055 - type: manhattan_pearson value: 82.72503790526163 - type: manhattan_spearman value: 83.37833652941349 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: 
MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 87.25463453839801 - type: cos_sim_spearman value: 88.27655263515948 - type: euclidean_pearson value: 88.0248334411439 - type: euclidean_spearman value: 88.18141448876868 - type: manhattan_pearson value: 87.8080451127279 - type: manhattan_spearman value: 88.01028114423058 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 63.57551045355218 - type: cos_sim_spearman value: 66.67614095126629 - type: euclidean_pearson value: 66.0787243112528 - type: euclidean_spearman value: 66.83660560636939 - type: manhattan_pearson value: 66.74684019662031 - type: manhattan_spearman value: 67.11761598074368 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 83.70881496766829 - type: cos_sim_spearman value: 84.37803542941634 - type: euclidean_pearson value: 84.84501245857096 - type: euclidean_spearman value: 84.47088079741476 - type: manhattan_pearson value: 84.77244090794765 - type: manhattan_spearman value: 84.43307343706205 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 81.53946254759089 - type: mrr value: 94.68259953554072 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 51.817 - type: map_at_10 value: 62.339999999999996 - type: map_at_100 value: 62.88 - type: map_at_1000 value: 62.909000000000006 - type: map_at_3 value: 59.004 - type: map_at_5 value: 60.906000000000006 - type: mrr_at_1 value: 54.333 - type: mrr_at_10 value: 63.649 - type: mrr_at_100 value: 64.01 - type: mrr_at_1000 value: 64.039 - type: mrr_at_3 value: 61.056 - type: mrr_at_5 value: 62.639 - type: ndcg_at_1 value: 54.333 - type: ndcg_at_10 value: 67.509 - type: ndcg_at_100 value: 69.69999999999999 - type: ndcg_at_1000 value: 70.613 - type: ndcg_at_3 value: 61.729 - type: ndcg_at_5 value: 64.696 - type: precision_at_1 value: 54.333 - type: precision_at_10 value: 9.2 - type: precision_at_100 value: 1.043 - type: precision_at_1000 value: 0.11199999999999999 - type: precision_at_3 value: 24.0 - type: precision_at_5 value: 16.2 - type: recall_at_1 value: 51.817 - type: recall_at_10 value: 82.056 - type: recall_at_100 value: 91.667 - type: recall_at_1000 value: 99.0 - type: recall_at_3 value: 66.717 - type: recall_at_5 value: 74.17200000000001 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.82475247524752 - type: cos_sim_ap value: 95.4781199603258 - type: cos_sim_f1 value: 91.16186693147964 - type: cos_sim_precision value: 90.53254437869822 - type: cos_sim_recall value: 91.8 - type: dot_accuracy value: 99.75049504950495 - type: dot_ap value: 93.05183539809457 - type: dot_f1 value: 87.31117824773412 - type: dot_precision value: 87.93103448275862 - type: dot_recall value: 86.7 - type: euclidean_accuracy value: 99.82475247524752 - type: euclidean_ap value: 95.38547978154382 - type: 
euclidean_f1 value: 91.16325511732403 - type: euclidean_precision value: 91.02691924227318 - type: euclidean_recall value: 91.3 - type: manhattan_accuracy value: 99.82574257425742 - type: manhattan_ap value: 95.47237521890308 - type: manhattan_f1 value: 91.27849355797821 - type: manhattan_precision value: 90.47151277013754 - type: manhattan_recall value: 92.10000000000001 - type: max_accuracy value: 99.82574257425742 - type: max_ap value: 95.4781199603258 - type: max_f1 value: 91.27849355797821 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 57.542169376331245 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 35.74399302634387 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 49.65076347632749 - type: mrr value: 50.418099057804945 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 29.73997756592847 - type: cos_sim_spearman value: 29.465208011593308 - type: dot_pearson value: 24.83735342474541 - type: dot_spearman value: 26.005180528584855 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.208 - type: map_at_10 value: 1.434 - type: map_at_100 value: 7.829 - type: map_at_1000 value: 19.807 - type: map_at_3 value: 0.549 - type: map_at_5 value: 0.8330000000000001 - type: mrr_at_1 value: 78.0 - type: mrr_at_10 value: 85.35199999999999 - type: mrr_at_100 value: 85.673 - type: mrr_at_1000 value: 85.673 - type: mrr_at_3 value: 84.667 - type: mrr_at_5 value: 85.06700000000001 - type: ndcg_at_1 value: 72.0 - type: ndcg_at_10 value: 59.214999999999996 - type: ndcg_at_100 value: 44.681 - type: ndcg_at_1000 value: 43.035000000000004 - type: ndcg_at_3 value: 66.53099999999999 - type: ndcg_at_5 value: 63.23 - type: precision_at_1 value: 78.0 - type: precision_at_10 value: 62.4 - type: precision_at_100 value: 45.76 - type: precision_at_1000 value: 19.05 - type: precision_at_3 value: 71.333 - type: precision_at_5 value: 67.2 - type: recall_at_1 value: 0.208 - type: recall_at_10 value: 1.6580000000000001 - type: recall_at_100 value: 11.324 - type: recall_at_1000 value: 41.537 - type: recall_at_3 value: 0.579 - type: recall_at_5 value: 0.8959999999999999 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 2.442 - type: map_at_10 value: 8.863 - type: map_at_100 value: 14.606 - type: map_at_1000 value: 16.258 - type: map_at_3 value: 4.396 - type: map_at_5 value: 6.199000000000001 - type: mrr_at_1 value: 30.612000000000002 - type: mrr_at_10 value: 43.492 - type: mrr_at_100 value: 44.557 - type: mrr_at_1000 value: 44.557 - type: mrr_at_3 value: 40.816 - type: mrr_at_5 value: 42.143 - type: ndcg_at_1 value: 25.509999999999998 - type: ndcg_at_10 value: 22.076 - type: ndcg_at_100 value: 34.098 - type: ndcg_at_1000 value: 46.265 - type: ndcg_at_3 value: 24.19 - type: 
ndcg_at_5 value: 23.474 - type: precision_at_1 value: 30.612000000000002 - type: precision_at_10 value: 19.796 - type: precision_at_100 value: 7.286 - type: precision_at_1000 value: 1.5310000000000001 - type: precision_at_3 value: 25.85 - type: precision_at_5 value: 24.490000000000002 - type: recall_at_1 value: 2.442 - type: recall_at_10 value: 15.012 - type: recall_at_100 value: 45.865 - type: recall_at_1000 value: 82.958 - type: recall_at_3 value: 5.731 - type: recall_at_5 value: 9.301 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 70.974 - type: ap value: 14.534996211286682 - type: f1 value: 54.785946183399005 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 58.56819468024901 - type: f1 value: 58.92391487111204 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 43.273202335218194 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 84.37742146986946 - type: cos_sim_ap value: 68.1684129575579 - type: cos_sim_f1 value: 64.93475108748189 - type: cos_sim_precision value: 59.89745876058849 - type: cos_sim_recall value: 70.89709762532982 - type: dot_accuracy value: 80.49710913750968 - type: dot_ap value: 54.699790073944186 - type: dot_f1 value: 54.45130013221684 - type: dot_precision value: 46.74612183125236 - type: dot_recall value: 65.19788918205805 - type: euclidean_accuracy value: 84.5085533766466 - type: euclidean_ap value: 68.38835695236224 - type: euclidean_f1 value: 65.3391121002694 - type: euclidean_precision value: 58.75289656625237 - type: euclidean_recall value: 73.58839050131925 - type: manhattan_accuracy value: 84.40126363473803 - type: manhattan_ap value: 68.09539181555348 - type: manhattan_f1 value: 64.99028182701653 - type: manhattan_precision value: 60.22062134173795 - type: manhattan_recall value: 70.58047493403694 - type: max_accuracy value: 84.5085533766466 - type: max_ap value: 68.38835695236224 - type: max_f1 value: 65.3391121002694 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 88.34167733923235 - type: cos_sim_ap value: 84.84136381147736 - type: cos_sim_f1 value: 77.01434980904001 - type: cos_sim_precision value: 74.27937915742794 - type: cos_sim_recall value: 79.95842315983985 - type: dot_accuracy value: 85.06422944075756 - type: dot_ap value: 76.49446747522325 - type: dot_f1 value: 71.11606520830432 - type: dot_precision value: 64.93638676844785 - type: dot_recall value: 78.59562673236834 - type: euclidean_accuracy value: 88.45810532852097 - type: euclidean_ap value: 84.91526721863501 - type: euclidean_f1 value: 77.04399001750662 - type: euclidean_precision value: 74.62298867162133 - type: euclidean_recall value: 79.62734832152756 - type: manhattan_accuracy 
value: 88.46004579500912 - type: manhattan_ap value: 84.81590026238194 - type: manhattan_f1 value: 76.97804626491822 - type: manhattan_precision value: 73.79237288135593 - type: manhattan_recall value: 80.45118570988605 - type: max_accuracy value: 88.46004579500912 - type: max_ap value: 84.91526721863501 - type: max_f1 value: 77.04399001750662 pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers - mteb --- # gte-tiny This is a sentence-transformers model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search. It is distilled from , with comparable (slightly worse) performance at around half the size. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings. ## Evaluation Results For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: ## Full Model Architecture ## Citing & Authors ", + "model_explanation_gemini": "\"TaylorAI_gte-tiny performs text classification, retrieval, clustering, reranking, and semantic textual similarity (STS) tasks, achieving competitive metrics across multiple datasets like MTEB Amazon reviews, banking queries, and scientific text clustering.\"\n\n**Model Features**: \n- **Tasks**: Classification, Retrieval, Clustering, Reranking, STS \n- **Datasets**: Amazon reviews (counterfactual/polarity), Banking77, ArXiv/BioRxiv clustering, BIOSSES (" +} \ No newline at end of file diff --git a/model_data_json/TheBloke_Llama-2-7B-Chat-GGUF.json b/model_data_json/TheBloke_Llama-2-7B-Chat-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..bef51c68de73ab2eb4f9c2b5ace4b4c77f3a9092 --- /dev/null +++ b/model_data_json/TheBloke_Llama-2-7B-Chat-GGUF.json @@ -0,0 +1,22 @@ +{ + "model_id": "TheBloke/Llama-2-7B-Chat-GGUF", + "downloads": 83135, + "tags": [ + "transformers", + "gguf", + "llama", + "facebook", + "meta", + "pytorch", + "llama-2", + "text-generation", + "en", + "arxiv:2307.09288", + "base_model:meta-llama/Llama-2-7b-chat-hf", + "base_model:quantized:meta-llama/Llama-2-7b-chat-hf", + "license:llama2", + "region:us" + ], + "description": "--- language: - en license: llama2 tags: - facebook - meta - pytorch - llama - llama-2 model_name: Llama 2 7B Chat arxiv: 2307.09288 base_model: meta-llama/Llama-2-7b-chat-hf inference: false model_creator: Meta Llama 2 model_type: llama pipeline_tag: text-generation prompt_template: '[INST] <<SYS>> You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don''t know the answer to a question, please don''t share false information. <</SYS>> {prompt}[/INST] ' quantized_by: TheBloke ---
\"TheBlokeAI\"

TheBloke's LLM work is generously supported by a grant from


# Llama 2 7B Chat - GGUF - Model creator: Meta Llama 2 - Original model: Llama 2 7B Chat ## Description This repo contains GGUF format model files for Meta Llama 2's Llama 2 7B Chat. ### About GGUF GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp. GGUF offers numerous advantages over GGML, such as better tokenisation and support for special tokens. It also supports metadata, and is designed to be extensible. Here is an incomplete list of clients and libraries that are known to support GGUF: * llama.cpp. The source project for GGUF. Offers a CLI and a server option. * text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration. * KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling. * LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. * LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection. * Faraday.dev, an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration. * ctransformers, a Python library with GPU accel, LangChain support, and an OpenAI-compatible AI server. * llama-cpp-python, a Python library with GPU accel, LangChain support, and an OpenAI-compatible API server. * candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use. ## Repositories available * AWQ model(s) for GPU inference. * GPTQ models for GPU inference, with multiple quantisation parameter options. * 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference * Meta Llama 2's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions ## Prompt template: Llama-2-Chat ## Compatibility These quantised GGUFv2 files are compatible with llama.cpp from August 27th onwards, as of commit d0cee0d36d5be95a0d9088b674dbb27354107221. They are also compatible with many third-party UIs and libraries - please see the list at the top of this README. ## Explanation of quantisation methods
Click to see details The new methods available are: * GGML_TYPE_Q2_K - \"type-1\" 2-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Block scales and mins are quantized with 4 bits. This ends up effectively using 2.5625 bits per weight (bpw) * GGML_TYPE_Q3_K - \"type-0\" 3-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Scales are quantized with 6 bits. This ends up using 3.4375 bpw. * GGML_TYPE_Q4_K - \"type-1\" 4-bit quantization in super-blocks containing 8 blocks, each block having 32 weights. Scales and mins are quantized with 6 bits. This ends up using 4.5 bpw. * GGML_TYPE_Q5_K - \"type-1\" 5-bit quantization. Same super-block structure as GGML_TYPE_Q4_K, resulting in 5.5 bpw. * GGML_TYPE_Q6_K - \"type-0\" 6-bit quantization. Super-blocks with 16 blocks, each block having 16 weights. Scales are quantized with 8 bits. This ends up using 6.5625 bpw. Refer to the Provided Files table below to see what files use which methods, and how.
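As a quick sanity check on one of these figures (a back-of-the-envelope recount, assuming the usual k-quant layout in which a \"type-1\" super-block also stores two fp16 values, an overall scale d and min dmin), the 4.5 bpw quoted for GGML_TYPE_Q4_K can be reproduced:

```
8 blocks x 32 weights x 4 bits          = 1024 bits
8 block scales + 8 block mins, 6 b each =   96 bits
super-block d + dmin, 2 x fp16          =   32 bits
------------------------------------------------
total: 1152 bits / 256 weights          = 4.5 bpw
```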
## Provided files | Name | Quant method | Bits | Size | Max RAM required | Use case | | ---- | ---- | ---- | ---- | ---- | ----- | | llama-2-7b-chat.Q2_K.gguf | Q2_K | 2 | 2.83 GB| 5.33 GB | smallest, significant quality loss - not recommended for most purposes | | llama-2-7b-chat.Q3_K_S.gguf | Q3_K_S | 3 | 2.95 GB| 5.45 GB | very small, high quality loss | | llama-2-7b-chat.Q3_K_M.gguf | Q3_K_M | 3 | 3.30 GB| 5.80 GB | very small, high quality loss | | llama-2-7b-chat.Q3_K_L.gguf | Q3_K_L | 3 | 3.60 GB| 6.10 GB | small, substantial quality loss | | llama-2-7b-chat.Q4_0.gguf | Q4_0 | 4 | 3.83 GB| 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M | | llama-2-7b-chat.Q4_K_S.gguf | Q4_K_S | 4 | 3.86 GB| 6.36 GB | small, greater quality loss | | llama-2-7b-chat.Q4_K_M.gguf | Q4_K_M | 4 | 4.08 GB| 6.58 GB | medium, balanced quality - recommended | | llama-2-7b-chat.Q5_0.gguf | Q5_0 | 5 | 4.65 GB| 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M | | llama-2-7b-chat.Q5_K_S.gguf | Q5_K_S | 5 | 4.65 GB| 7.15 GB | large, low quality loss - recommended | | llama-2-7b-chat.Q5_K_M.gguf | Q5_K_M | 5 | 4.78 GB| 7.28 GB | large, very low quality loss - recommended | | llama-2-7b-chat.Q6_K.gguf | Q6_K | 6 | 5.53 GB| 8.03 GB | very large, extremely low quality loss | | llama-2-7b-chat.Q8_0.gguf | Q8_0 | 8 | 7.16 GB| 9.66 GB | very large, extremely low quality loss - not recommended | **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead. ## How to download GGUF files **Note for manual downloaders:** You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file. The following clients/libraries will automatically download models for you, providing a list of available models to choose from: - LM Studio - LoLLMS Web UI - Faraday.dev ### In Under Download Model, you can enter the model repo: TheBloke/Llama-2-7b-Chat-GGUF and below it, a specific filename to download, such as: llama-2-7b-chat.q4_K_M.gguf. Then click Download. ### On the command line, including multiple files at once I recommend using the Python library: Then you can download any individual model file to the current directory, at high speed, with a command like this:
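The concrete command is not reproduced above. As a hedged sketch, assuming the huggingface_hub Python library and its bundled CLI (the tool named in the next section), a single-file download would look along these lines:

```bash
pip3 install huggingface_hub

# Download one quant file into the current directory
huggingface-cli download TheBloke/Llama-2-7b-Chat-GGUF \
  llama-2-7b-chat.Q4_K_M.gguf \
  --local-dir . --local-dir-use-symlinks False
```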
More advanced huggingface-cli download usage You can also download multiple files at once with a pattern: For more documentation on downloading with , please see: HF -> Hub Python Library -> Download files -> Download from the CLI. To accelerate downloads on fast connections (1Gbit/s or higher), install : And set environment variable to : Windows CLI users: Use before running the download command.
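Again as a sketch of the elided commands (same huggingface_hub assumption as above; the include pattern and filename are illustrative):

```bash
# Download several files at once with a glob pattern
huggingface-cli download TheBloke/Llama-2-7b-Chat-GGUF \
  --include "*Q4_K*.gguf" --local-dir .

# Optional accelerator for fast (>= 1 Gbit/s) connections
pip3 install hf_transfer
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download \
  TheBloke/Llama-2-7b-Chat-GGUF llama-2-7b-chat.Q4_K_M.gguf --local-dir .

# Windows CLI users: set the variable before running the download, e.g.
#   set HF_HUB_ENABLE_HF_TRANSFER=1
```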
## Example command Make sure you are using llama.cpp from commit d0cee0d36d5be95a0d9088b674dbb27354107221 or later. Change to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration. Change to the desired sequence length. For extended sequence models - e.g. 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. If you want to have a chat-style conversation, replace the argument with For other parameters and how to use them, please refer to the llama.cpp documentation. ## How to run in text-generation-webui Further instructions here: text-generation-webui/docs/llama.cpp.md. ## How to run from Python code You can use GGUF models from Python using the llama-cpp-python or ctransformers libraries. ### How to load this model from Python using ctransformers #### First install the package #### Simple example code to load one of these GGUF models (see the sketch further down this card) ## How to use with LangChain Here are guides on using llama-cpp-python or ctransformers with LangChain: * LangChain + llama-cpp-python * LangChain + ctransformers ## Discord For further support, and discussions on these models and AI in general, join us at: TheBloke AI's Discord server ## Thanks, and how to contribute Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org! I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine-tuning/training. If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects. Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits. * Patreon: * Ko-Fi: **Special thanks to**: Aemon Algiz. **Patreon special mentions**: Alicia Loh, Stephen Murray, K, Ajan Kanaga, RoA, Magnesian, Deo Leter, Olakabola, Eugene Pentland, zynix, Deep Realms, Raymond Fosdick, Elijah Stavena, Iucharbius, Erik Bjäreholt, Luis Javier Navarrete Lozano, Nicholas, theTransient, John Detwiler, alfie_i, knownsqashed, Mano Prime, Willem Michiel, Enrico Ros, LangChain4j, OG, Michael Dempsey, Pierre Kircher, Pedro Madruga, James Bentley, Thomas Belote, Luke @flexchar, Leonard Tan, Johann-Peter Hartmann, Illia Dulskyi, Fen Risland, Chadd, S_X, Jeff Scroggin, Ken Nordquist, Sean Connelly, Artur Olbinski, Swaroop Kallakuri, Jack West, Ai Maven, David Ziegler, Russ Johnson, transmissions 11, John Villwock, Alps Aficionado, Clay Pascal, Viktor Bowallius, Subspace Studios, Rainer Wilmers, Trenton Dambrowitz, vamX, Michael Levine, 준교 김, Brandon Frisco, Kalila, Trailburnt, Randy H, Talal Aujan, Nathan Dryer, Vadim, 阿明, ReadyPlayerEmma, Tiffany J. Kim, George Stoitzev, Spencer Kim, Jerry Meng, Gabriel Tamborski, Cory Kujawski, Jeffrey Morgan, Spiking Neurons AB, Edmond Seymore, Alexandros Triantafyllidis, Lone Striker, Cap'n Zoog, Nikolai Manek, danny, ya boyyy, Derek Yates, usrbinkat, Mandus, TL, Nathan LeClaire, subjectnull, Imad Khwaja, webtim, Raven Klaugh, Asp the Wyvern, Gabriel Puliatti, Caitlyn Gatomon, Joseph William Delisle, Jonathan Leane, Luke Pendergrass, SuperWojo, Sebastain Graf, Will Dee, Fred von Graf, Andrey, Dan Guido, Daniel P. Andersen, Nitin Borwankar, Elle, Vitor Caleffi, biorpg, jjj, NimbleBox.ai, Pieter, Matthew Berman, terasurfer, Michael Davis, Alex, Stanislav Ovsiannikov Thank you to all my generous patrons and donaters!
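Returning to the "How to run from Python code" section above, whose example code is not reproduced: here is a minimal sketch using the ctransformers API. The chosen quant file and the gpu_layers value are illustrative assumptions, not fixed requirements.

```python
# pip install ctransformers          (CPU only)
# pip install ctransformers[cuda]    (with CUDA GPU acceleration)
from ctransformers import AutoModelForCausalLM

# gpu_layers = number of layers to offload to the GPU;
# set it to 0 if you have no GPU acceleration.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7b-Chat-GGUF",
    model_file="llama-2-7b-chat.Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=50,
)

# The loaded model object is callable; pass a prompt in the
# Llama-2-Chat format shown in this card's prompt template.
print(llm("[INST] Write a haiku about llamas. [/INST]"))
```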
And thank you again to a16z for their generous grant. # Original model card: Meta Llama 2's Llama 2 7B Chat # **Llama 2** Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom. ## Model Details *Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and accept our License before requesting access here.* Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, are on par with some popular closed-source models like ChatGPT and PaLM. **Model Developers** Meta **Variations** Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. **Input** Models input text only. **Output** Models generate text only. **Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. ||Training Data|Params|Content Length|GQA|Tokens|LR| |---|---|---|---|---|---|---| |Llama 2|*A new mix of publicly available online data*|7B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|13B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|70B|4k|✔|2.0T|1.5 x 10^-4| *Llama 2 family of models.* Token counts refer to pretraining data only. All models are trained with a global batch-size of 4M tokens. Bigger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability. **Model Dates** Llama 2 was trained between January 2023 and July 2023. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Research Paper** \"Llama 2: Open Foundation and Fine-Tuned Chat Models\" ## Intended Use **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. To get the expected features and performance for the chat versions, a specific formatting needs to be followed, including the INST and <<SYS>> tags, BOS and EOS tokens, and the whitespaces and breaklines in between (we recommend calling strip() on inputs to avoid double-spaces). See our reference code in github for details. **Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2.
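For reference, the chat formatting described above follows the widely documented Llama-2-Chat convention; a sketch of a single turn (placeholder names are illustrative):

```
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
```

Subsequent turns append the model reply, an `</s>` token, and a fresh `<s>[INST] ... [/INST]` block.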
## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W). Estimated total emissions were 539 tCO2eq, 100% of which were offset by Meta’s sustainability program. ||Time (GPU hours)|Power Consumption (W)|Carbon Emitted (tCO2eq)| |---|---|---|---| |Llama 2 7B|184320|400|31.22| |Llama 2 13B|368640|400|62.44| |Llama 2 70B|1720320|400|291.42| |Total|3311616||539.00| **CO2 emissions during pretraining.** Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023. ## Evaluation Results In this section, we report the results for the Llama 1 and Llama 2 models on standard academic benchmarks. For all the evaluations, we use our internal evaluations library. |Model|Size|Code|Commonsense Reasoning|World Knowledge|Reading Comprehension|Math|MMLU|BBH|AGI Eval| |---|---|---|---|---|---|---|---|---|---| |Llama 1|7B|14.1|60.8|46.2|58.5|6.95|35.1|30.3|23.9| |Llama 1|13B|18.9|66.1|52.6|62.3|10.9|46.9|37.0|33.9| |Llama 1|33B|26.0|70.0|58.4|67.6|21.4|57.8|39.8|41.7| |Llama 1|65B|30.7|70.7|60.5|68.6|30.8|63.4|43.5|47.6| |Llama 2|7B|16.8|63.9|48.9|61.3|14.6|45.3|32.6|29.3| |Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1| |Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**| **Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *Math:* We report the average of the GSM8K (8-shot) and MATH (4-shot) benchmarks at top-1. |||TruthfulQA|Toxigen| |---|---|---|---| |Llama 1|7B|27.42|23.00| |Llama 1|13B|41.74|23.08| |Llama 1|33B|44.19|22.57| |Llama 1|65B|48.71|21.77| |Llama 2|7B|33.29|**21.25**| |Llama 2|13B|41.86|26.10| |Llama 2|70B|**50.18**|24.60| **Evaluation of pretrained LLMs on automatic safety benchmarks.** For TruthfulQA, we present the percentage of generations that are both truthful and informative (the higher the better). For ToxiGen, we present the percentage of toxic generations (the smaller the better).
|||TruthfulQA|Toxigen| |---|---|---|---| |Llama-2-Chat|7B|57.04|**0.00**| |Llama-2-Chat|13B|62.18|**0.00**| |Llama-2-Chat|70B|**64.14**|0.01| **Evaluation of fine-tuned LLMs on different safety datasets.** Same metric definitions as above. ## Ethical Considerations and Limitations Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model. Please see the Responsible Use Guide available at ## Reporting Issues Please report any software “bug,” or other problems with the models through one of the following means: - Reporting issues with the model: github.com/facebookresearch/llama - Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback - Reporting bugs and security concerns: facebook.com/whitehat/info ## Llama Model Index |Model|Llama2|Llama2-hf|Llama2-chat|Llama2-chat-hf| |---|---|---|---|---| |7B| Link | Link | Link | Link| |13B| Link | Link | Link | Link| |70B| Link | Link | Link | Link| ", + "model_explanation_gemini": "A 7B-parameter Llama 2-based chat model in GGUF format, optimized for helpful, safe, and honest text generation while avoiding harmful or unethical content.\n\n**Features:** \n- Fine-tuned for chat/assistant tasks \n- Enforces safety and ethical guidelines in responses \n- Quantized in GGUF format for CPU/GPU efficiency \n- Supports multiple inference clients (llama.cpp, KoboldCpp, etc.) \n\n**Comparison:** \nThis GGUF-quant" +} \ No newline at end of file diff --git a/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json b/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json new file mode 100644 index 0000000000000000000000000000000000000000..bf3c4e901676d6ceb2d7d1da53f0a83e1ad5ff9f --- /dev/null +++ b/model_data_json/TheBloke_Mixtral-8x7B-Instruct-v0.1-AWQ.json @@ -0,0 +1,26 @@ +{ + "model_id": "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ", + "downloads": 75286, + "tags": [ + "transformers", + "safetensors", + "mixtral", + "text-generation", + "conversational", + "fr", + "it", + "de", + "es", + "en", + "base_model:mistralai/Mixtral-8x7B-Instruct-v0.1", + "base_model:quantized:mistralai/Mixtral-8x7B-Instruct-v0.1", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "4-bit", + "awq", + "region:us" + ], + "description": "--- base_model: mistralai/Mixtral-8x7B-Instruct-v0.1 inference: false language: - fr - it - de - es - en license: apache-2.0 model_creator: Mistral AI_ model_name: Mixtral 8X7B Instruct v0.1 model_type: mixtral prompt_template: '[INST] {prompt} [/INST] ' quantized_by: TheBloke widget: - output: text: 'Arr, shiver me timbers! Ye have a llama on yer lawn, ye say? Well, that be a new one for me! Here''s what I''d suggest, arr: 1. Firstly, ensure yer safety. Llamas may look gentle, but they can be protective if they feel threatened. 2. Try to make the area less appealing to the llama. Remove any food sources or water that might be attracting it. 3. Contact local animal control or a wildlife rescue organization. They be the experts and can provide humane ways to remove the llama from yer property. 4. 
If ye have any experience with animals, you could try to gently herd the llama towards a nearby field or open space. But be careful, arr! Remember, arr, it be important to treat the llama with respect and care. It be a creature just trying to survive, like the rest of us.' text: '[INST] You are a pirate chatbot who always responds with Arr and pirate speak! There''s a llama on my lawn, how can I get rid of him? [/INST]' ---
TheBloke's LLM work is generously supported by a grant from andreessen horowitz (a16z)
# Mixtral 8X7B Instruct v0.1 - AWQ - Model creator: Mistral AI_ - Original model: Mixtral 8X7B Instruct v0.1 ## Description This repo contains AWQ model files for Mistral AI_'s Mixtral 8X7B Instruct v0.1. ### About AWQ AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality than the most commonly used GPTQ settings. AWQ models are currently supported on Linux and Windows, with NVidia GPUs only. macOS users: please use GGUF models instead. It is supported by: - Text Generation Webui - using Loader: AutoAWQ - vLLM - version 0.2.2 or later for support for all model types. - Hugging Face Text Generation Inference (TGI) - Transformers version 4.35.0 and later, from any code or client that supports Transformers - AutoAWQ - for use from Python code ## Repositories available * AWQ model(s) for GPU inference. * GPTQ models for GPU inference, with multiple quantisation parameter options. * 2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference * Mistral AI_'s original unquantised fp16 model in pytorch format, for GPU inference and for further conversions ## Prompt template: Mistral `[INST] {prompt} [/INST]` ## Provided files, and AWQ parameters I currently release 128g GEMM models only. The addition of group_size 32 models, and GEMV kernel models, is being actively considered. Models are released as sharded safetensors files. | Branch | Bits | GS | AWQ Dataset | Seq Len | Size | | ------ | ---- | -- | ----------- | ------- | ---- | | main | 4 | 128 | VMware Open Instruct | 8192 | 24.65 GB | ## How to easily download and use this model in text-generation-webui Please make sure you're using the latest version of text-generation-webui. It is strongly recommended to use the text-generation-webui one-click-installers unless you're sure you know how to make a manual install. 1. Click the **Model tab**. 2. Under **Download custom model or LoRA**, enter `TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ`. 3. Click **Download**. 4. The model will start downloading. Once it's finished it will say \"Done\". 5. In the top left, click the refresh icon next to **Model**. 6. In the **Model** dropdown, choose the model you just downloaded: `Mixtral-8x7B-Instruct-v0.1-AWQ`. 7. Select **Loader: AutoAWQ**. 8. Click Load, and the model will load and is now ready for use. 9. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right. 10. Once you're ready, click the **Text Generation** tab and enter a prompt to get started! ## Multi-user inference server: vLLM Documentation on installing and using vLLM can be found here. - Please ensure you are using vLLM version 0.2 or later. - When using vLLM as a server, pass the `--quantization awq` parameter. For example: - When using vLLM from Python code, again set the `quantization` parameter to `awq`. For example: ## Multi-user inference server: Hugging Face Text Generation Inference (TGI) Use TGI version 1.1.0 or later. The official Docker container is: Example Docker parameters: Example Python code for interfacing with TGI (requires huggingface-hub 0.17.0 or later): ## Inference from Python code using Transformers ### Install the necessary packages - Requires: Transformers 4.35.0 or later. - Requires: AutoAWQ 0.1.6 or later. Note that if you are using PyTorch 2.0.1, the above AutoAWQ command will automatically upgrade you to PyTorch 2.1.0.
If you are using CUDA 11.8 and wish to continue using PyTorch 2.0.1, instead run this command: If you have problems installing AutoAWQ using the pre-built wheels, install it from source instead: ### Transformers example code (requires Transformers 4.35.0 and later) ## Compatibility The files provided are tested to work with: - text-generation-webui using Loader: AutoAWQ. - vLLM version 0.2.0 and later. - Hugging Face Text Generation Inference (TGI) version 1.1.0 and later. - Transformers version 4.35.0 and later. - AutoAWQ version 0.1.1 and later. ## Discord For further support, and discussions on these models and AI in general, join us at: TheBloke AI's Discord server ## Thanks, and how to contribute Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org! I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training. If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects. Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits. * Patreon: * Ko-Fi: **Special thanks to**: Aemon Algiz. **Patreon special mentions**: Michael Levine, 阿明, Trailburnt, Nikolai Manek, John Detwiler, Randy H, Will Dee, Sebastain Graf, NimbleBox.ai, Eugene Pentland, Emad Mostaque, Ai Maven, Jim Angel, Jeff Scroggin, Michael Davis, Manuel Alberto Morcote, Stephen Murray, Robert, Justin Joy, Luke @flexchar, Brandon Frisco, Elijah Stavena, S_X, Dan Guido, Undi ., Komninos Chatzipapas, Shadi, theTransient, Lone Striker, Raven Klaugh, jjj, Cap'n Zoog, Michel-Marie MAUDET (LINAGORA), Matthew Berman, David, Fen Risland, Omer Bin Jawed, Luke Pendergrass, Kalila, OG, Erik Bjäreholt, Rooh Singh, Joseph William Delisle, Dan Lewis, TL, John Villwock, AzureBlack, Brad, Pedro Madruga, Caitlyn Gatomon, K, jinyuan sun, Mano Prime, Alex, Jeffrey Morgan, Alicia Loh, Illia Dulskyi, Chadd, transmissions 11, fincy, Rainer Wilmers, ReadyPlayerEmma, knownsqashed, Mandus, biorpg, Deo Leter, Brandon Phillips, SuperWojo, Sean Connelly, Iucharbius, Jack West, Harry Royden McLaughlin, Nicholas, terasurfer, Vitor Caleffi, Duane Dunston, Johann-Peter Hartmann, David Ziegler, Olakabola, Ken Nordquist, Trenton Dambrowitz, Tom X Nguyen, Vadim, Ajan Kanaga, Leonard Tan, Clay Pascal, Alexandros Triantafyllidis, JM33133, Xule, vamX, ya boyyy, subjectnull, Talal Aujan, Alps Aficionado, wassieverse, Ari Malik, James Bentley, Woland, Spencer Kim, Michael Dempsey, Fred von Graf, Elle, zynix, William Richards, Stanislav Ovsiannikov, Edmond Seymore, Jonathan Leane, Martin Kemka, usrbinkat, Enrico Ros Thank you to all my generous patrons and donaters! And thank you again to a16z for their generous grant. # Original model card: Mistral AI_'s Mixtral 8X7B Instruct v0.1 # Model Card for Mixtral-8x7B The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested. For full details of this model please read our release blog post. ## Warning This repo contains weights that are compatible with vLLM serving of the model as well as Hugging Face transformers library. It is based on the original Mixtral torrent release, but the file format and parameter names are different.
Please note that the model cannot (yet) be instantiated with HF. ## Instruction format This format must be strictly respected, otherwise the model will generate sub-optimal outputs. The template used to build a prompt for the Instruct model is defined as follows: Note that `<s>` and `</s>` are special tokens for beginning of string (BOS) and end of string (EOS) while [INST] and [/INST] are regular strings. As reference, here is the pseudo-code used to tokenize instructions during fine-tuning: In the pseudo-code above, note that the `encode` method should not add a BOS or EOS token automatically, but should add a prefix space. ## Run the model By default, transformers will load the model in full precision. Therefore you might be interested in further reducing the memory requirements to run the model through the optimizations we offer in the HF ecosystem: ### In half-precision Note: `float16` precision only works on GPU devices; a minimal sketch follows below.
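A minimal sketch of half-precision loading with the standard `transformers` API; the model id is taken from this card, while the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# torch_dtype=torch.float16 loads the weights in half precision (GPU only).
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "[INST] Explain what a Sparse Mixture of Experts is. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```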
### Lower precision (8-bit & 4-bit) using bitsandbytes
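A minimal sketch of 8-bit loading via the `transformers` quantization config (assumes the `bitsandbytes` package is installed; the model id is taken from this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

# 8-bit weights; swap in load_in_4bit=True for 4-bit quantization instead.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)
```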
### Load the model with Flash Attention 2
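A minimal sketch of enabling Flash Attention 2 at load time (requires the `flash-attn` package and a supported GPU; the model id is taken from this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                  # FA2 requires fp16/bf16 weights
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```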
## Limitations The Mixtral-8x7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. # The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.", + "model_explanation_gemini": "A quantized version of Mistral AI's Mixtral-8x7B-Instruct-v0.1 model optimized for efficient inference using AWQ (4-bit quantization), supporting multilingual text generation with a pirate-themed example demonstrating its conversational capabilities. \n\n**Features:** \n- Base model: `mistralai/Mixtral-8x7B-Instruct-v0.1` \n- Quantization: 4-bit AWQ for faster inference with comparable accuracy to GPTQ \n-" +} \ No newline at end of file diff --git a/model_data_json/UBC-NLP_MARBERT.json b/model_data_json/UBC-NLP_MARBERT.json new file mode 100644 index 0000000000000000000000000000000000000000..27f3a663c3a30fa8355fb66c296fb1f5fa39c166 --- /dev/null +++ b/model_data_json/UBC-NLP_MARBERT.json @@ -0,0 +1,22 @@ +{ + "model_id": "UBC-NLP/MARBERT", + "downloads": 80475, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "bert", + "fill-mask", + "Arabic BERT", + "MSA", + "Twitter", + "Masked Langauge Model", + "ar", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - ar tags: - Arabic BERT - MSA - Twitter - Masked Langauge Model widget: - text: \"اللغة العربية هي لغة [MASK].\" --- **MARBERT** is one of three models described in our **ACL 2021 paper** **\"ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic\"**. MARBERT is a large-scale pre-trained masked language model focused on both Dialectal Arabic (DA) and MSA. Arabic has multiple varieties. To train MARBERT, we randomly sample 1B Arabic tweets from a large in-house dataset of about 6B tweets. We only include tweets with at least 3 Arabic words, based on character string matching, regardless of whether the tweet contains non-Arabic strings or not. That is, we do not remove non-Arabic so long as the tweet meets the 3 Arabic word criterion. The dataset makes up **128GB of text** (**15.6B tokens**). We use the same network architecture as ARBERT (BERT-base), but without the next sentence prediction (NSP) objective since tweets are short. See our repo for modifying BERT code to remove NSP. For more information about MARBERT, please visit our own GitHub repo. # BibTex If you use our models (ARBERT, MARBERT, or MARBERTv2) for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated): ## Acknowledgments We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, Canadian Foundation for Innovation, ComputeCanada and UBC ARC-Sockeye.
We also thank the Google TensorFlow Research Cloud (TFRC) program for providing us with free TPU access.", "model_explanation_gemini": "\"MARBERT is a large-scale Arabic BERT model pre-trained on 1B Arabic tweets for both Dialectal Arabic and Modern Standard Arabic, optimized for masked language modeling without next sentence prediction.\"\n\nModel Features: \n- Language: Arabic (Dialectal and Modern Standard) \n- Training Data: 1B Arabic tweets (128GB, 15.6B tokens) \n- Architecture: BERT-base (no NSP objective) \n- Use Case: Masked language modeling" +} \ No newline at end of file diff --git a/model_data_json/ZhenYe234_hubert_base_general_audio.json b/model_data_json/ZhenYe234_hubert_base_general_audio.json new file mode 100644 index 0000000000000000000000000000000000000000..a6fac9270cfa025c8c2b896117318f124c9a9266 --- /dev/null +++ b/model_data_json/ZhenYe234_hubert_base_general_audio.json @@ -0,0 +1,13 @@ +{ + "model_id": "ZhenYe234/hubert_base_general_audio", + "downloads": 68883, + "tags": [ + "safetensors", + "hubert", + "arxiv:2408.17175", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 --- A HuBERT model trained on 200k hours of general audio.", + "model_explanation_gemini": "A Hubert model trained on 200k hours of general audio for broad audio processing tasks. \n\n**Features:** \n- Pre-trained on 200k hours of general audio data. \n- Based on the Hubert architecture. \n\n**Comparison:** (Not applicable—no comparative details provided in the description.)" +} \ No newline at end of file diff --git a/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json b/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json new file mode 100644 index 0000000000000000000000000000000000000000..3b6080436cb1320579d54d960e2f390b326155a8 --- /dev/null +++ b/model_data_json/aglazkova_bart_finetuned_keyphrase_extraction.json @@ -0,0 +1,22 @@ +{ + "model_id": "aglazkova/bart_finetuned_keyphrase_extraction", + "downloads": 83042, + "tags": [ + "transformers", + "pytorch", + "bart", + "text2text-generation", + "en", + "dataset:midas/krapivin", + "dataset:midas/inspec", + "dataset:midas/kptimes", + "dataset:midas/duc2001", + "arxiv:1910.13461", + "arxiv:2312.10700", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- datasets: - midas/krapivin - midas/inspec - midas/kptimes - midas/duc2001 language: - en widget: - text: \"Relevance has traditionally been linked with feature subset selection, but formalization of this link has not been attempted. In this paper, we propose two axioms for feature subset selection, a sufficiency axiom and a necessity axiom, based on which this link is formalized: The expected feature subset is the one which maximizes relevance. Finding the expected feature subset turns out to be NP-hard. We then devise a heuristic algorithm to find the expected subset which has a polynomial time complexity. The experimental results show that the algorithm finds a good enough subset of features which, when presented to C4.5, results in better prediction accuracy.\" - text: \"In this paper, we investigate cross-domain limitations of keyphrase generation using the models for abstractive text summarization. We present an evaluation of BART fine-tuned for keyphrase generation across three types of texts, namely scientific texts from computer science and biomedical domains and news texts.
We explore the role of transfer learning between different domains to improve the model performance on small text corpora.\" --- # BART fine-tuned for keyphrase generation This is the
BART model fine-tuned for the keyphrase generation task on fragments of the following corpora: * Krapivin * Inspec * KPTimes * DUC-2001 * PubMed * NamedKeys. #### Training Hyperparameters The following hyperparameters were used during training: * learning_rate: 4e-5 * train_batch_size: 8 * optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08 * num_epochs: 6 **BibTeX:**", + "model_explanation_gemini": "\"Finetuned BART model for generating keyphrases from scientific and news texts, trained on multiple datasets including Krapivin, Inspec, and KPTimes.\"\n\nModel Features: \n- Task: Keyphrase generation \n- Base Model: BART \n- Training Datasets: Krapivin, Inspec, KPTimes, DUC-2001, PubMed, NamedKeys \n- Supported Language: English \n- Training Hyperparameters: \n - Learning rate: 4e" +} \ No newline at end of file diff --git a/model_data_json/aipicasso_emi.json b/model_data_json/aipicasso_emi.json new file mode 100644 index 0000000000000000000000000000000000000000..86935d52a35f895473829d8efad985ec3002f89a --- /dev/null +++ b/model_data_json/aipicasso_emi.json @@ -0,0 +1,18 @@ +{ + "model_id": "aipicasso/emi", + "downloads": 274385, + "tags": [ + "diffusers", + "safetensors", + "stable-diffusion", + "text-to-image", + "arxiv:2307.01952", + "arxiv:2212.03860", + "license:openrail++", + "autotrain_compatible", + "diffusers:StableDiffusionXLPipeline", + "region:us" + ], + "description": "--- extra_gated_prompt: To download this model from this page, you need to provide information registered with Hugging Face. The information provided will be used to guide you on how to utilize the image-generation AI. license: openrail++ tags: - stable-diffusion - text-to-image inference: false library_name: diffusers --- # Emi Model Card !eyecatch.jpg Original (PNG) # Introduction Emi (Ethereal master of illustration) is an image-generation AI specialized in AI art, developed by AI Picasso using the cutting-edge H100 hardware and the image generator Stable Diffusion XL 1.0. A distinguishing feature of this model is that it was not trained on images reposted without permission, such as those found on Danbooru. # License Unlike our previous models, the license is the CreativeML Open RAIL++-M License, so **commercial use is allowed**. We decided this for the following reasons: - As image-generation AI has become widespread, more people follow good manners so as not to harm the creative industry - Now that other image-generation AIs allow commercial use, a non-commercial license has lost much of its practical effect # Usage You can try the demo here. If you want to use the model in earnest, you can download it here. If generation does not go well with the standard version, please use the stable version. # Simple examples !example_1.jpg !example_2.png !example_3.jpg # Improving the model's output - If you definitely want anime-style illustrations, put anime artwork, anime style at the beginning of the prompt. - Adding the word transparent to the prompt produces a more recent art style. - Drawing a full body sometimes fails; in that case, try the stable version. - Usable prompts are the same as for Waifu Diffusion. The model can also be used like Stable Diffusion. - We recommend using Textual Inversion for negative prompts. - Because hands are unstable, we recommend merging with photorealistic models such as DreamShaper XL1.0. - Refining prompts with ChatGPT can lead you to works beyond your usual range. - Using the FreeU node in the latest ComfyUI, or the Web UI extension, with the following parameters may further improve output. The following image is an example using FreeU. - b1 = 1.1, b2 = 1.2, s1 = 0.6, s2 = 0.4 report !example_4.png # Legal matters This model was created in Japan, so Japanese law applies. We maintain that training this model is legal under Article 30-4 of the Copyright Act. We also maintain that distributing this model does not constitute a principal or accessory offense under the Copyright Act or Article 175 of the Penal Code. For details, please see attorney Kakinuma's opinion. However, as stated in the license, please handle this model's outputs in accordance with applicable laws and regulations. # Contact support@aipicasso.app The following is a translation of the standard model card. ## Model details - **Model type:** diffusion-model-based text-to-image generation model - **Language:** Japanese - **License:** CreativeML Open RAIL++-M License - **Model description:** This model can generate appropriate images in response to prompts. The algorithms are the Latent Diffusion Model with OpenCLIP-ViT/G and CLIP-L. - **Notes:** - **References:** ## Example uses of the model It is used in the same way as Stable Diffusion XL 1.0. There are many methods, but we provide three patterns:
- ComfyUI - Fooocus - Diffusers ### For ComfyUI or Fooocus As with Stable Diffusion XL 1.0, use the model file in safetensors format. For detailed installation instructions, please refer to this article. ### For Diffusers Use 🤗's Diffusers library. First, run the following script to install the library. Then run the next script to generate images. For more complex operations, refer to the source code of the demo. #### Intended uses - Assisting with the drawing of illustrations, manga, and anime - Both commercial and non-commercial use are fine - Communication with creators when commissioning work - Commercial provision of image-generation services - Please handle the generated outputs with care - Self-expression - Using this AI to express your own individuality - Research and development - Using the model on Discord - Prompt engineering - Fine-tuning (also known as additional training) - DreamBooth, etc. - Merging with other models - Examining the performance of this model with metrics such as FID - Verifying that this model is independent of models other than Stable Diffusion using checksums, hash functions, and the like - Education - Graduation projects by art-college and vocational-school students - Graduation theses and coursework by university students - Teachers conveying the current state of image-generation AI - Uses described in the Hugging Face Community - Please ask questions in Japanese or English #### Out-of-scope uses - Presenting things as fact - Things that would trouble teachers - Anything else that harms the creative industry # Prohibited and malicious uses - Do not use the model for money laundering - Do not publish digital forgeries (Digital Forgery; this may violate the Copyright Act) - Do not run Image-to-Image on other people's works without permission (this may violate the Copyright Act) - Do not distribute obscene material (this may violate Article 175 of the Penal Code) - Conduct that ignores the generally accepted manners of the industry - Do not present things not based on fact as if they were fact (the offense of obstruction of business by force may apply) - Fake news ## Limitations and bias of the model ### Limitations of the model - Much about diffusion models and large language models is still unknown, and their limitations have not been established. ### Bias - Much about diffusion models and large language models is still unknown, and their biases have not been established. ## Training **Training data** - About 2,000 images collected manually from a dataset similar to Stable Diffusion's, with unauthorized reposted images from Danbooru removed - About 500,000 images collected automatically from a dataset similar to Stable Diffusion's, with unauthorized reposted images from Danbooru removed **Training process** - **Hardware:** H100 ## Evaluation results We are seeking third-party evaluation. ## Environmental impact - **Hardware type:** H100 - **Hours used:** 500 - **Training location:** Japan ## References", + "model_explanation_gemini": "Generates anime-style illustrations using Stable Diffusion XL 1.0, avoiding unauthorized Danbooru images, and allows commercial use under the Open RAIL++-M license.\n\n**Features:** \n- Text-to-image diffusion model (Stable Diffusion XL-based) \n- Specialized for anime artwork generation \n- Commercial use permitted \n- Optimized prompts (e.g., \"anime style\", \"transparent\") \n- Compatible with Waifu Diffusion/Stable Diffusion prompts \n-" +} \ No newline at end of file diff --git a/model_data_json/allegro_herbert-base-cased.json b/model_data_json/allegro_herbert-base-cased.json new file mode 100644 index 0000000000000000000000000000000000000000..b5f859cf31483c303ef637e25ce30ebbb2d56389 --- /dev/null +++ b/model_data_json/allegro_herbert-base-cased.json @@ -0,0 +1,19 @@ +{ + "model_id": "allegro/herbert-base-cased", + "downloads": 78442, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "bert", + "feature-extraction", + "herbert", + "pl", + "license:cc-by-4.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: pl tags: - herbert license: cc-by-4.0 --- # HerBERT **HerBERT** is a BERT-based Language Model trained on Polish corpora using Masked Language Modelling (MLM) and Sentence Structural Objective (SSO) with dynamic masking of whole words. For more details, please refer to: HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish. Model training and experiments were conducted with transformers in version 2.9. ## Corpus HerBERT was trained on six different corpora available for the Polish language: | Corpus | Tokens | Documents | | :------ | ------: | ------: | | CCNet Middle | 3243M | 7.9M | | CCNet Head | 2641M | 7.0M | | National Corpus of Polish | 1357M | 3.9M | | Open Subtitles | 1056M | 1.1M | | Wikipedia | 260M | 1.4M | | Wolne Lektury | 41M | 5.5k | ## Tokenizer The training dataset was tokenized into subwords using a character-level byte-pair encoding.
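As a quick illustration of how this checkpoint is typically loaded, here is a minimal sketch using the standard `transformers` API (the model id is taken from this card; the Polish example sentence is ours):

```python
from transformers import AutoTokenizer, AutoModel

# Model id from this card; HerBERT ships as a standard BERT checkpoint.
tokenizer = AutoTokenizer.from_pretrained("allegro/herbert-base-cased")
model = AutoModel.from_pretrained("allegro/herbert-base-cased")

inputs = tokenizer("Historia Polski jest fascynująca.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```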
## Usage Example code: ## License CC BY 4.0 ## Citation If you use this model, please cite the following paper: ## Authors The model was trained by **Machine Learning Research Team at Allegro** and **Linguistic Engineering Group at Institute of Computer Science, Polish Academy of Sciences**. You can contact us at: klejbenchmark@allegro.pl", + "model_explanation_gemini": "\"HerBERT is a BERT-based Polish language model trained with MLM and SSO objectives on diverse Polish corpora for natural language processing tasks.\"\n\nModel Features: \n- BERT-based architecture \n- Trained on Polish corpora (CCNet, National Corpus of Polish, Open Subtitles, Wikipedia, Wolne Lektury) \n- Uses Masked Language Modeling (MLM) and Sentence Structural Objective (SSO) \n- Implements dynamic whole-word masking \n- Byte-pair" +} \ No newline at end of file diff --git a/model_data_json/answerdotai_ModernBERT-large.json b/model_data_json/answerdotai_ModernBERT-large.json new file mode 100644 index 0000000000000000000000000000000000000000..8b040e94b6b28036ce2ede3f434c96ee5c3270e5 --- /dev/null +++ b/model_data_json/answerdotai_ModernBERT-large.json @@ -0,0 +1,21 @@ +{ + "model_id": "answerdotai/ModernBERT-large", + "downloads": 80863, + "tags": [ + "transformers", + "pytorch", + "onnx", + "safetensors", + "modernbert", + "fill-mask", + "masked-lm", + "long-context", + "en", + "arxiv:2412.13663", + "license:apache-2.0", + "autotrain_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 language: - en tags: - fill-mask - masked-lm - long-context - modernbert pipeline_tag: fill-mask inference: false --- # ModernBERT ## Table of Contents 1. Model Summary 2. Usage 3. Evaluation 4. Limitations 5. Training 6. License 7. Citation ## Model Summary ModernBERT is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data with a native context length of up to 8,192 tokens. ModernBERT leverages recent architectural improvements such as: - **Rotary Positional Embeddings (RoPE)** for long-context support. - **Local-Global Alternating Attention** for efficiency on long inputs. - **Unpadding and Flash Attention** for efficient inference. ModernBERT’s native long context length makes it ideal for tasks that require processing long documents, such as retrieval, classification, and semantic search within large corpora. The model was trained on a large corpus of text and code, making it suitable for a wide range of downstream tasks, including code retrieval and hybrid (text + code) semantic search. It is available in the following sizes: - ModernBERT-base - 22 layers, 149 million parameters - ModernBERT-large - 28 layers, 395 million parameters For more information about ModernBERT, we recommend our release blog post for a high-level overview, and our arXiv pre-print for in-depth information. *ModernBERT is a collaboration between Answer.AI, LightOn, and friends.* ## Usage You can use these models directly with the library starting from v4.48.0: Since ModernBERT is a Masked Language Model (MLM), you can use the pipeline or load it via . To use ModernBERT for downstream tasks like classification, retrieval, or QA, fine-tune it following standard BERT fine-tuning recipes. **⚠️ If your GPU supports it, we recommend using ModernBERT with Flash Attention 2 to reach the highest efficiency. 
To do so, install Flash Attention as follows, then use the model as normal:** Using : Using a pipeline: **Note:** ModernBERT does not use token type IDs, unlike some earlier BERT models. Most downstream usage is identical to standard BERT models on the Hugging Face Hub, except you can omit the parameter. ## Evaluation We evaluate ModernBERT across a range of tasks, including natural language understanding (GLUE), general retrieval (BEIR), long-context retrieval (MLDR), and code retrieval (CodeSearchNet and StackQA). **Key highlights:** - On GLUE, ModernBERT-base surpasses other similarly-sized encoder models, and ModernBERT-large is second only to Deberta-v3-large. - For general retrieval tasks, ModernBERT performs well on BEIR in both single-vector (DPR-style) and multi-vector (ColBERT-style) settings. - Thanks to the inclusion of code data in its training mixture, ModernBERT as a backbone also achieves new state-of-the-art code retrieval results on CodeSearchNet and StackQA. ### Base Models | Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code | |-------------|--------------|--------------|--------------|---------------|---------------|------|------|------| | | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA | | BERT | 38.9 | 23.9 | 32.2 | 49.0 | 28.1 | 84.7 | 41.2 | 59.5 | | RoBERTa | 37.7 | 22.9 | 32.8 | 48.7 | 28.2 | 86.4 | 44.3 | 59.6 | | DeBERTaV3 | 20.2 | 5.4 | 13.4 | 47.1 | 21.9 | 88.1 | 17.5 | 18.6 | | NomicBERT | 41.0 | 26.7 | 30.3 | 49.9 | 61.3 | 84.0 | 41.6 | 61.4 | | GTE-en-MLM | 41.4 | **34.3** |**44.4** | 48.2 | 69.3 | 85.6 | 44.9 | 71.4 | | ModernBERT | **41.6** | 27.4 | 44.0 | **51.3** | **80.2** | **88.4** | **56.4** |**73.6**| --- ### Large Models | Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code | |-------------|--------------|--------------|--------------|---------------|---------------|------|------|------| | | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA | | BERT | 38.9 | 23.3 | 31.7 | 49.5 | 28.5 | 85.2 | 41.6 | 60.8 | | RoBERTa | 41.4 | 22.6 | 36.1 | 49.8 | 28.8 | 88.9 | 47.3 | 68.1 | | DeBERTaV3 | 25.6 | 7.1 | 19.2 | 46.7 | 23.0 | **91.4**| 21.2 | 19.7 | | GTE-en-MLM | 42.5 | **36.4** | **48.9** | 50.7 | 71.3 | 87.6 | 40.5 | 66.9 | | ModernBERT | **44.0** | 34.3 | 48.6 | **52.4** | **80.4** | 90.4 |**59.5** |**83.9**| *Table 1: Results for all models across an overview of all tasks. CSN refers to CodeSearchNet and SQA to StackQA. MLDRID refers to in-domain (fine-tuned on the training set) evaluation, and MLDR_OOD to out-of-domain.* ModernBERT’s strong results, coupled with its efficient runtime on long-context inputs, demonstrate that encoder-only models can be significantly improved through modern architectural choices and extensive pretraining on diversified data sources. ## Limitations ModernBERT’s training data is primarily English and code, so performance may be lower for other languages. While it can handle long sequences efficiently, using the full 8,192 tokens window may be slower than short-context inference. Like any large language model, ModernBERT may produce representations that reflect biases present in its training data. Verify critical or sensitive outputs before relying on them. ## Training - Architecture: Encoder-only, Pre-Norm Transformer with GeGLU activations. - Sequence Length: Pre-trained up to 1,024 tokens, then extended to 8,192 tokens. - Data: 2 trillion tokens of English text and code. 
- Optimizer: StableAdamW with trapezoidal LR scheduling and 1-sqrt decay. - Hardware: Trained on 8x H100 GPUs. See the paper for more details. ## License We release the ModernBERT model architectures, model weights, and training codebase under the Apache 2.0 license. ## Citation If you use ModernBERT in your work, please cite:", + "model_explanation_gemini": "\"ModernBERT-large is a BERT-style model optimized for long-context tasks (up to 8,192 tokens) with architectural improvements like RoPE and Flash Attention, excelling in retrieval, NLU, and code-related tasks.\"\n\n**Features:** \n- **Architecture:** Rotary Positional Embeddings (RoPE), Local-Global Alternating Attention, Unpadding, Flash Attention support. \n- **Context Length:** Native 8,192-token handling. \n- **Training" +} \ No newline at end of file diff --git a/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json b/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json new file mode 100644 index 0000000000000000000000000000000000000000..b7fdd955356b266e26c1f6faee9547ddf26d2e29 --- /dev/null +++ b/model_data_json/ashawkey_mvdream-sd2.1-diffusers.json @@ -0,0 +1,14 @@ +{ + "model_id": "ashawkey/mvdream-sd2.1-diffusers", + "downloads": 78372, + "tags": [ + "diffusers", + "safetensors", + "arxiv:2308.16512", + "license:openrail", + "diffusers:MVDreamPipeline", + "region:us" + ], + "description": "--- license: openrail --- # MVDream-diffusers Model Card This is a port of MVDream into diffusers. For usage, please check: ## Citation ## Misuse, Malicious Use, and Out-of-Scope Use The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.", + "model_explanation_gemini": "Port of MVDream-diffusers for generating multi-view images with Stable Diffusion 2.1.\n\nFeatures: \n- Multi-view image generation \n- Based on Stable Diffusion 2.1 \n- Diffusers integration \n\nComparison: Not applicable (no comparative details provided in the description)." +} \ No newline at end of file diff --git a/model_data_json/aubmindlab_bert-base-arabert.json b/model_data_json/aubmindlab_bert-base-arabert.json new file mode 100644 index 0000000000000000000000000000000000000000..ba689345b8f6993e87f9ecbb400ce5bbf8716dde --- /dev/null +++ b/model_data_json/aubmindlab_bert-base-arabert.json @@ -0,0 +1,20 @@ +{ + "model_id": "aubmindlab/bert-base-arabert", + "downloads": 74124, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "safetensors", + "bert", + "fill-mask", + "ar", + "arxiv:2003.00104", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: ar datasets: - wikipedia - Osian - 1.5B-Arabic-Corpus - oscar-arabic-unshuffled - Assafir(private) widget: - text: \" عاصم +ة لبنان هي [MASK] .\" --- # !!! A newer version of this model is available !!! AraBERTv2 # AraBERT v1 & v2 : Pre-training BERT for Arabic Language Understanding **AraBERT** is an Arabic pretrained language model based on Google's BERT architecture. AraBERT uses the same BERT-Base config. More details are available in the AraBERT Paper and in the AraBERT Meetup. There are two versions of the model, AraBERTv0.1 and AraBERTv1, with the difference being that AraBERTv1 uses pre-segmented text where prefixes and suffixes were split using the Farasa Segmenter.
We evaluate AraBERT models on different downstream tasks and compare them to mBERT and other state-of-the-art models (*to the extent of our knowledge*). The tasks were Sentiment Analysis on 6 different datasets (HARD, ASTD-Balanced, ArsenTD-Lev, LABR), Named Entity Recognition with the ANERcorp, and Arabic Question Answering on Arabic-SQuAD and ARCD. # AraBERTv2 ## What's New! AraBERT now comes in 4 new variants to replace the old v1 versions: More details are in the AraBERT folder, in the README, and in the AraBERT Paper. Model | HuggingFace Model Name | Size (MB/Params)| Pre-Segmentation | DataSet (Sentences/Size/nWords) | ---|:---:|:---:|:---:|:---: AraBERTv0.2-base | bert-base-arabertv02 | 543MB / 136M | No | 200M / 77GB / 8.6B | AraBERTv0.2-large| bert-large-arabertv02 | 1.38G / 371M | No | 200M / 77GB / 8.6B | AraBERTv2-base| bert-base-arabertv2 | 543MB / 136M | Yes | 200M / 77GB / 8.6B | AraBERTv2-large| bert-large-arabertv2 | 1.38G / 371M | Yes | 200M / 77GB / 8.6B | AraBERTv0.1-base| bert-base-arabertv01 | 543MB / 136M | No | 77M / 23GB / 2.7B | AraBERTv1-base| bert-base-arabert | 543MB / 136M | Yes | 77M / 23GB / 2.7B | All models are available in the model page under the aubmindlab name. Checkpoints are available in PyTorch, TF2 and TF1 formats. ## Better Pre-Processing and New Vocab We identified an issue with AraBERTv1's wordpiece vocabulary. The issue came from punctuations and numbers that were still attached to words when the wordpiece vocab was learned. We now insert a space between numbers and characters and around punctuation characters. The new vocabulary was learnt using the `BertWordPieceTokenizer` from the `tokenizers` library, and should now support the Fast tokenizer implementation from the `transformers` library. **P.S.**: All the old BERT codes should work with the new BERT, just change the model name and check the new preprocessing function. **Please read the section on how to use the preprocessing function** ## Bigger Dataset and More Compute We used ~3.5 times more data, and trained for longer. For Dataset Sources see the Dataset Section. Model | Hardware | num of examples with seq len (128 / 512) |128 (Batch Size/ Num of Steps) | 512 (Batch Size/ Num of Steps) | Total Steps | Total Time (in Days) | ---|:---:|:---:|:---:|:---:|:---:|:---: AraBERTv0.2-base | TPUv3-8 | 420M / 207M |2560 / 1M | 384/ 2M | 3M | - AraBERTv0.2-large | TPUv3-128 | 420M / 207M | 13440 / 250K | 2056 / 300K | 550K | - AraBERTv2-base | TPUv3-8 | 520M / 245M |13440 / 250K | 2056 / 300K | 550K | - AraBERTv2-large | TPUv3-128 | 520M / 245M | 13440 / 250K | 2056 / 300K | 550K | - AraBERT-base (v1/v0.1) | TPUv2-8 | - |512 / 900K | 128 / 300K| 1.2M | 4 days # Dataset The pretraining data used for the new AraBERT model is also used for Arabic **GPT2 and ELECTRA**. The dataset consists of 77GB or 200,095,961 lines or 8,655,948,860 words or 82,232,988,358 chars (before applying Farasa Segmentation). For the new dataset we added the unshuffled OSCAR corpus, after thoroughly filtering it, to the previous dataset used in AraBERTv1 but without the websites that we previously crawled: - OSCAR unshuffled and filtered. - Arabic Wikipedia dump from 2020/09/01 - The 1.5B words Arabic Corpus - The OSIAN Corpus - Assafir news articles. A huge thank you to Assafir for giving us the data. # Preprocessing It is recommended to apply our preprocessing function before training/testing on any dataset. **Install farasapy to segment text for AraBERT v1 & v2** ## Accepted_models # TensorFlow 1.x models The TF1.x models are available in the HuggingFace models repo.
You can download them as follows: - via git-lfs: clone all the models in a repo, where MODEL_NAME is any model under the aubmindlab name - via wget: - Go to the tf1_model.tar.gz file on huggingface.co/models/aubmindlab/MODEL_NAME. - copy the - then run (ex: for : ) # If you used this model please cite us as: Google Scholar has our Bibtex wrong (missing name), use this instead # Acknowledgments Thanks to TensorFlow Research Cloud (TFRC) for the free access to Cloud TPUs, we couldn't have done it without this program, and to the AUB MIND Lab Members for the continuous support. Also thanks to Yakshof and Assafir for data and storage access. Another thanks to Habib Rahal for putting a face to AraBERT. ## Contacts **Wissam Antoun**: Linkedin | Twitter | Github | | **Fady Baly**: Linkedin | Twitter | Github | | ", + "model_explanation_gemini": "Pretrained BERT-based model for Arabic language understanding, optimized for tasks like sentiment analysis, named entity recognition, and question answering.\n\nFeatures: \n- Based on BERT-Base architecture \n- Trained on large Arabic datasets (Wikipedia, OSCAR, 1.5B-Arabic-Corpus, etc.) \n- Supports pre-segmented text (Farasa Segmenter in v1/v2) \n- Improved vocabulary handling (v2) \n- Available in base/large" +} \ No newline at end of file diff --git a/model_data_json/autogluon_chronos-bolt-tiny.json b/model_data_json/autogluon_chronos-bolt-tiny.json new file mode 100644 index 0000000000000000000000000000000000000000..ef01190bd84009e8357280cc38f25fb4ab7ede61 --- /dev/null +++ b/model_data_json/autogluon_chronos-bolt-tiny.json @@ -0,0 +1,21 @@ +{ + "model_id": "autogluon/chronos-bolt-tiny", + "downloads": 69867, + "tags": [ + "safetensors", + "t5", + "time series", + "forecasting", + "pretrained models", + "foundation models", + "time series foundation models", + "time-series", + "time-series-forecasting", + "arxiv:1910.10683", + "arxiv:2403.07815", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 pipeline_tag: time-series-forecasting tags: - time series - forecasting - pretrained models - foundation models - time series foundation models - time-series --- # Chronos-Bolt⚡ (Tiny) 🚀 **Update Feb 14, 2025**: Chronos-Bolt models are now available on Amazon SageMaker JumpStart! Check out the tutorial notebook to learn how to deploy Chronos endpoints for production use in a few lines of code. Chronos-Bolt is a family of pretrained time series forecasting models which can be used for zero-shot forecasting. It is based on the T5 encoder-decoder architecture and has been trained on nearly 100 billion time series observations. It chunks the historical time series context into patches of multiple observations, which are then input into the encoder. The decoder then uses these representations to directly generate quantile forecasts across multiple future steps—a method known as direct multi-step forecasting. Chronos-Bolt models are **more accurate**, up to **250 times faster** and **20 times more memory-efficient** than the original Chronos models of the same size. ## Performance The following plot compares the inference time of Chronos-Bolt against the original Chronos models for forecasting 1024 time series with a context length of 512 observations and a prediction horizon of 64 steps.
Chronos-Bolt models are not only significantly faster but also more accurate than the original Chronos models. The following plot reports the probabilistic and point forecasting performance of Chronos-Bolt in terms of the Weighted Quantile Loss (WQL) and the Mean Absolute Scaled Error (MASE), respectively, aggregated over 27 datasets (see the Chronos paper for details on this benchmark). Remarkably, despite having no prior exposure to these datasets during training, the zero-shot Chronos-Bolt models outperform commonly used statistical models and deep learning models that have been trained on these datasets (highlighted by *). Furthermore, they also perform better than other FMs, denoted by a +, which indicates that these models were pretrained on certain datasets in our benchmark and are not entirely zero-shot. Notably, Chronos-Bolt (Base) also surpasses the original Chronos (Large) model in terms of the forecasting accuracy while being over 600 times faster.
Chronos-Bolt models are available in the following sizes.
| Model | Parameters | Based on | | ---------------------------------------------------------------------- | ---------- | ---------------------------------------------------------------------- | | **chronos-bolt-tiny** | 9M | t5-efficient-tiny | | **chronos-bolt-mini** | 21M | t5-efficient-mini | | **chronos-bolt-small** | 48M | t5-efficient-small | | **chronos-bolt-base** | 205M | t5-efficient-base |
## Usage ### Zero-shot inference with Chronos-Bolt in AutoGluon Install the required dependencies. Forecast with the Chronos-Bolt model. For more advanced features such as **fine-tuning** and **forecasting with covariates**, check out this tutorial. ### Deploying a Chronos-Bolt endpoint to SageMaker First, update the SageMaker SDK to make sure that all the latest models are available. Deploy an inference endpoint to SageMaker. Now you can send time series data to the endpoint in JSON format. Chronos-Bolt models can be deployed to both CPU and GPU instances. These models also support **forecasting with covariates**. For more details about the endpoint API, check out the example notebook. ## Citation If you find Chronos or Chronos-Bolt models useful for your research, please consider citing the associated paper: ## License This project is licensed under the Apache-2.0 License.", + "model_explanation_gemini": "\"Pretrained time series forecasting model based on T5 architecture, offering zero-shot forecasting with improved accuracy, speed, and memory efficiency compared to original Chronos models.\"\n\nModel Features: \n- Zero-shot time series forecasting capability \n- T5 encoder-decoder architecture \n- Processes historical data in patched observations \n- Direct multi-step quantile forecasting \n- Pretrained on ~100B time series observations \n- Available in tiny/mini/small/base sizes \n\nComparison: \nChronos-Bolt" +} \ No newline at end of file diff --git a/model_data_json/bartowski_Llama-3.2-3B-Instruct-GGUF.json b/model_data_json/bartowski_Llama-3.2-3B-Instruct-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..109972324f6bb28f8bf9f5532963ef987909a1b9 --- /dev/null +++ b/model_data_json/bartowski_Llama-3.2-3B-Instruct-GGUF.json @@ -0,0 +1,28 @@ +{ + "model_id": "bartowski/Llama-3.2-3B-Instruct-GGUF", + "downloads": 74797, + "tags": [ + "gguf", + "facebook", + "meta", + "llama", + "llama-3", + "text-generation", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "base_model:meta-llama/Llama-3.2-3B-Instruct", + "base_model:quantized:meta-llama/Llama-3.2-3B-Instruct", + "license:llama3.2", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- base_model: meta-llama/Llama-3.2-3B-Instruct language: - en - de - fr - it - pt - hi - es - th license: llama3.2 pipeline_tag: text-generation tags: - facebook - meta - llama - llama-3 quantized_by: bartowski extra_gated_prompt: \"### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT\\n\\nLlama 3.2 Version\\ \\ Release Date: September 25, 2024\\n\\n“Agreement” means the terms and conditions\\ \\ for use, reproduction, distribution and modification of the Llama Materials set\\ \\ forth herein.\\n\\n“Documentation” means the specifications, manuals and documentation\\ \\ accompanying Llama 3.2 distributed by Meta at \\n“Licensee” or “you” means you, or your employer or any other person or entity\\ \\ (if you are entering into this Agreement on such person or entity’s behalf),\\ \\ of the age required under applicable laws, rules or regulations to provide legal\\ \\ consent and that has legal authority to bind your employer or such other person\\ \\ or entity if you are entering in this Agreement on their behalf.\\n\\n“Llama 3.2”\\ \\ means the foundational large language models and software and algorithms, including\\ \\ machine-learning model code, trained model weights, inference-enabling code, training-enabling\\ \\ code, fine-tuning enabling code and other elements of the foregoing 
distributed\\ \\ by Meta at Materials” means,\\ \\ collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion\\ \\ thereof) made available under this Agreement.\\n\\n“Meta” or “we” means Meta Platforms\\ \\ Ireland Limited (if you are located in or, if you are an entity, your principal\\ \\ place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if\\ \\ you are located outside of the EEA or Switzerland). \\n\\nBy clicking “I Accept”\\ \\ below or by using or distributing any portion or element of the Llama Materials,\\ \\ you agree to be bound by this Agreement.\\n\\n1. License Rights and Redistribution.\\n\\ a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable\\ \\ and royalty-free limited license under Meta’s intellectual property or other rights\\ \\ owned by Meta embodied in the Llama Materials to use, reproduce, distribute,\\ \\ copy, create derivative works of, and make modifications to the Llama Materials.\\ \\ \\nb. Redistribution and Use. \\ni. If you distribute or make available the Llama\\ \\ Materials (or any derivative works thereof), or a product or service (including\\ \\ another AI model) that contains any of them, you shall (A) provide a copy of this\\ \\ Agreement with any such Llama Materials; and (B) prominently display “Built with\\ \\ Llama” on a related website, user interface, blogpost, about page, or product\\ \\ documentation. If you use the Llama Materials or any outputs or results of the\\ \\ Llama Materials to create, train, fine tune, or otherwise improve an AI model,\\ \\ which is distributed or made available, you shall also include “Llama” at the\\ \\ beginning of any such AI model name.\\nii. If you receive Llama Materials, or any\\ \\ derivative works thereof, from a Licensee as part of an integrated end user product,\\ \\ then Section 2 of this Agreement will not apply to you. \\niii. You must retain\\ \\ in all copies of the Llama Materials that you distribute the following attribution\\ \\ notice within a “Notice” text file distributed as a part of such copies: “Llama\\ \\ 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms,\\ \\ Inc. All Rights Reserved.”\\niv. Your use of the Llama Materials must comply with\\ \\ applicable laws and regulations (including trade compliance laws and regulations)\\ \\ and adhere to the Acceptable Use Policy for the Llama Materials (available at\\ \\ which is hereby incorporated by reference\\ \\ into this Agreement.\\n \\n2. Additional Commercial Terms. If, on the Llama 3.2\\ \\ version release date, the monthly active users of the products or services made\\ \\ available by or for Licensee, or Licensee’s affiliates, is greater than 700 million\\ \\ monthly active users in the preceding calendar month, you must request a license\\ \\ from Meta, which Meta may grant to you in its sole discretion, and you are not\\ \\ authorized to exercise any of the rights under this Agreement unless or until\\ \\ Meta otherwise expressly grants you such rights.\\n3. Disclaimer of Warranty. UNLESS\\ \\ REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM\\ \\ ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS\\ \\ ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION,\\ \\ ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR\\ \\ PURPOSE. 
YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING\\ \\ OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR\\ \\ USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.\\n4. Limitation of Liability.\\ \\ IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY,\\ \\ WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING\\ \\ OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL,\\ \\ INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE\\ \\ BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.\\n5. Intellectual Property.\\n\\ a. No trademark licenses are granted under this Agreement, and in connection with\\ \\ the Llama Materials, neither Meta nor Licensee may use any name or mark owned\\ \\ by or associated with the other or any of its affiliates, except as required\\ \\ for reasonable and customary use in describing and redistributing the Llama Materials\\ \\ or as set forth in this Section 5(a). Meta hereby grants you a license to use\\ \\ “Llama” (the “Mark”) solely as required to comply with the last sentence of Section\\ \\ 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at\\ \\ All goodwill arising\\ \\ out of your use of the Mark will inure to the benefit of Meta.\\nb. Subject to\\ \\ Meta’s ownership of Llama Materials and derivatives made by or for Meta, with\\ \\ respect to any derivative works and modifications of the Llama Materials that\\ \\ are made by you, as between you and Meta, you are and will be the owner of such\\ \\ derivative works and modifications.\\nc. If you institute litigation or other proceedings\\ \\ against Meta or any entity (including a cross-claim or counterclaim in a lawsuit)\\ \\ alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion\\ \\ of any of the foregoing, constitutes infringement of intellectual property or\\ \\ other rights owned or licensable by you, then any licenses granted to you under\\ \\ this Agreement shall terminate as of the date such litigation or claim is filed\\ \\ or instituted. You will indemnify and hold harmless Meta from and against any\\ \\ claim by any third party arising out of or related to your use or distribution\\ \\ of the Llama Materials.\\n6. Term and Termination. The term of this Agreement will\\ \\ commence upon your acceptance of this Agreement or access to the Llama Materials\\ \\ and will continue in full force and effect until terminated in accordance with\\ \\ the terms and conditions herein. Meta may terminate this Agreement if you are\\ \\ in breach of any term or condition of this Agreement. Upon termination of this\\ \\ Agreement, you shall delete and cease use of the Llama Materials. Sections 3,\\ \\ 4 and 7 shall survive the termination of this Agreement. \\n7. Governing Law and\\ \\ Jurisdiction. This Agreement will be governed and construed under the laws of\\ \\ the State of California without regard to choice of law principles, and the UN\\ \\ Convention on Contracts for the International Sale of Goods does not apply to\\ \\ this Agreement. The courts of California shall have exclusive jurisdiction of\\ \\ any dispute arising out of this Agreement. \\n### Llama 3.2 Acceptable Use Policy\\n\\ Meta is committed to promoting safe and fair use of its tools and features, including\\ \\ Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy\\ \\ (“**Policy**”). 
The most recent copy of this policy can be found at #### Prohibited Uses\\nWe want everyone to use Llama 3.2 safely and responsibly.\\ \\ You agree you will not use, or allow others to use, Llama 3.2 to:\\n1. Violate\\ \\ the law or others’ rights, including to:\\n 1. Engage in, promote, generate,\\ \\ contribute to, encourage, plan, incite, or further illegal or unlawful activity\\ \\ or content, such as:\\n 1. Violence or terrorism\\n 2. Exploitation\\ \\ or harm to children, including the solicitation, creation, acquisition, or dissemination\\ \\ of child exploitative content or failure to report Child Sexual Abuse Material\\n\\ \\ 3. Human trafficking, exploitation, and sexual violence\\n 4. The\\ \\ illegal distribution of information or materials to minors, including obscene\\ \\ materials, or failure to employ legally required age-gating in connection with\\ \\ such information or materials.\\n 5. Sexual solicitation\\n 6. Any\\ \\ other criminal activity\\n 1. Engage in, promote, incite, or facilitate the\\ \\ harassment, abuse, threatening, or bullying of individuals or groups of individuals\\n\\ \\ 2. Engage in, promote, incite, or facilitate discrimination or other unlawful\\ \\ or harmful conduct in the provision of employment, employment benefits, credit,\\ \\ housing, other economic benefits, or other essential goods and services\\n 3.\\ \\ Engage in the unauthorized or unlicensed practice of any profession including,\\ \\ but not limited to, financial, legal, medical/health, or related professional\\ \\ practices\\n 4. Collect, process, disclose, generate, or infer private or sensitive\\ \\ information about individuals, including information about individuals’ identity,\\ \\ health, or demographic information, unless you have obtained the right to do so\\ \\ in accordance with applicable law\\n 5. Engage in or facilitate any action or\\ \\ generate any content that infringes, misappropriates, or otherwise violates any\\ \\ third-party rights, including the outputs or results of any products or services\\ \\ using the Llama Materials\\n 6. Create, generate, or facilitate the creation\\ \\ of malicious code, malware, computer viruses or do anything else that could disable,\\ \\ overburden, interfere with or impair the proper working, integrity, operation\\ \\ or appearance of a website or computer system\\n 7. Engage in any action, or\\ \\ facilitate any action, to intentionally circumvent or remove usage restrictions\\ \\ or other safety measures, or to enable functionality disabled by Meta \\n2. Engage\\ \\ in, promote, incite, facilitate, or assist in the planning or development of activities\\ \\ that present a risk of death or bodily harm to individuals, including use of Llama\\ \\ 3.2 related to the following:\\n 8. Military, warfare, nuclear industries or\\ \\ applications, espionage, use for materials or activities that are subject to the\\ \\ International Traffic Arms Regulations (ITAR) maintained by the United States\\ \\ Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989\\ \\ or the Chemical Weapons Convention Implementation Act of 1997\\n 9. Guns and\\ \\ illegal weapons (including weapon development)\\n 10. Illegal drugs and regulated/controlled\\ \\ substances\\n 11. Operation of critical infrastructure, transportation technologies,\\ \\ or heavy machinery\\n 12. Self-harm or harm to others, including suicide, cutting,\\ \\ and eating disorders\\n 13. 
Any content intended to incite or promote violence,\\ \\ abuse, or any infliction of bodily harm to an individual\\n3. Intentionally deceive\\ \\ or mislead others, including use of Llama 3.2 related to the following:\\n 14.\\ \\ Generating, promoting, or furthering fraud or the creation or promotion of disinformation\\n\\ \\ 15. Generating, promoting, or furthering defamatory content, including the\\ \\ creation of defamatory statements, images, or other content\\n 16. Generating,\\ \\ promoting, or further distributing spam\\n 17. Impersonating another individual\\ \\ without consent, authorization, or legal right\\n 18. Representing that the\\ \\ use of Llama 3.2 or outputs are human-generated\\n 19. Generating or facilitating\\ \\ false online engagement, including fake reviews and other means of fake online\\ \\ engagement \\n4. Fail to appropriately disclose to end users any known dangers\\ \\ of your AI system 5. Interact with third party tools, models, or software designed\\ \\ to generate unlawful content or engage in unlawful or harmful conduct and/or represent\\ \\ that the outputs of such tools, models, or software are associated with Meta or\\ \\ Llama 3.2\\n\\nWith respect to any multimodal models included in Llama 3.2, the\\ \\ rights granted under Section 1(a) of the Llama 3.2 Community License Agreement\\ \\ are not being granted to you if you are an individual domiciled in, or a company\\ \\ with a principal place of business in, the European Union. This restriction does\\ \\ not apply to end users of a product or service that incorporates any such multimodal\\ \\ models.\\n\\nPlease report any violation of this Policy, software “bug,” or other\\ \\ problems that could lead to a violation of this Policy through one of the following\\ \\ means:\\n\\n* Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback\\n\\ * Reporting bugs and security concerns: facebook.com/whitehat/info\\n\\ * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama\\ \\ 3.2: LlamaUseReport@meta.com\" extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location ? By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy : checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Llamacpp imatrix Quantizations of Llama-3.2-3B-Instruct Using ...a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare, publicly display, publicly perform, sublicense, and distribute the Complementary Material, the Model, and Derivatives of the Model. Note that these files did not come from HuggingFace, but instead from modelscope. As such, some files that were present in the original repository may not be present. File integrity has been verified via checksum. # Original Model Card Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask. 
The **Stable-Diffusion-Inpainting** was initialized with the weights of the Stable-Diffusion-v-1-2. First 595k steps regular training, then 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+” and 10% dropping of the text-conditioning to improve classifier-free guidance sampling. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% mask everything. - **Language(s):** English - **License:** The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license on which our license is based. - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (CLIP ViT-L/14) as suggested in the Imagen paper. - **Resources for more information:** Paper. - **Cite as:** @InProceedings{Rombach_2022_CVPR, author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\\\"orn}, title = {High-Resolution Image Synthesis With Latent Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {10684-10695} } # Uses ## Direct Use The model is intended for research purposes only. Possible research areas and tasks include - Safe deployment of models which have the potential to generate harmful content. - Probing and understanding the limitations and biases of generative models. - Generation of artworks and use in design and other artistic processes. - Applications in educational or creative tools. - Research on generative models. Excluded uses are described below. ### Misuse, Malicious Use, and Out-of-Scope Use _Note: This section is taken from the DALLE-MINI model card, but applies in the same way to Stable Diffusion v1_. The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes. #### Out-of-Scope Use The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model. #### Misuse and Malicious Use Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to: - Generating demeaning, dehumanizing, or otherwise harmful representations of people or their environments, cultures, religions, etc. - Intentionally promoting or propagating discriminatory content or harmful stereotypes. - Impersonating individuals without their consent. - Sexual content without consent of the people who might see it. - Mis- and disinformation - Representations of egregious violence and gore - Sharing of copyrighted or licensed material in violation of its terms of use. - Sharing content that is an alteration of copyrighted or licensed material in violation of its terms of use.
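The inpainting behavior described above (masked image plus mask fed through extra UNet channels) maps onto the Diffusers inpainting pipeline; a minimal sketch — the repository id, prompt, and file names are illustrative assumptions, not taken from this card:

```python
# Minimal Diffusers sketch of text-guided inpainting; assumes the checkpoint
# is published as a diffusers-format repo (the id below is an assumption).
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # hypothetical repo id
    torch_dtype=torch.float16,
).to("cuda")

image = load_image("photo.png")  # RGB source image (hypothetical file)
mask = load_image("mask.png")    # white pixels mark the region to repaint

result = pipe(
    prompt="Face of a yellow cat, high resolution, sitting on a park bench",
    image=image,
    mask_image=mask,
).images[0]
result.save("inpainted.png")
```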
## Limitations and Bias ### Limitations - The model does not achieve perfect photorealism - The model cannot render legible text - The model does not perform well on more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere” - Faces and people in general may not be generated properly. - The model was trained mainly with English captions and will not work as well in other languages. - The autoencoding part of the model is lossy - The model was trained on a large-scale dataset LAION-5B which contains adult material and is not fit for product use without additional safety mechanisms and considerations. - No additional measures were used to deduplicate the dataset. As a result, we observe some degree of memorization for images that are duplicated in the training data. The training data can be searched at to possibly assist in the detection of memorized images. ### Bias While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases. Stable Diffusion v1 was trained on subsets of LAION-2B(en), which consists of images that are primarily limited to English descriptions. Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for. This affects the overall output of the model, as white and western cultures are often set as the default. Further, the ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts. ## Training **Training Data** The model developers used the following dataset for training the model: - LAION-2B (en) and subsets thereof (see next section) **Training Procedure** Stable Diffusion v1 is a latent diffusion model which combines an autoencoder with a diffusion model that is trained in the latent space of the autoencoder. During training, - Images are encoded through an encoder, which turns images into latent representations. The autoencoder uses a relative downsampling factor of 8 and maps images of shape H x W x 3 to latents of shape H/f x W/f x 4 (e.g., a 512 x 512 x 3 image becomes a 64 x 64 x 4 latent) - Text prompts are encoded through a ViT-L/14 text-encoder. - The non-pooled output of the text encoder is fed into the UNet backbone of the latent diffusion model via cross-attention. - The loss is a reconstruction objective between the noise that was added to the latent and the prediction made by the UNet. We currently provide six checkpoints, sd-v1-1.ckpt, sd-v1-2.ckpt and sd-v1-3.ckpt, sd-v1-4.ckpt, sd-v1-5.ckpt and sd-v1-5-inpainting.ckpt which were trained as follows, - sd-v1-1.ckpt: 237k steps at resolution 256x256 on laion2B-en. 194k steps at resolution 512x512 on laion-high-resolution (170M examples from LAION-5B with resolution >= 1024x1024). - sd-v1-2.ckpt: Resumed from sd-v1-1.ckpt. 515k steps at resolution 512x512 on \"laion-improved-aesthetics\" (a subset of laion2B-en, filtered to images with an original size >= 512x512, estimated aesthetics score > 5.0, and an estimated watermark probability < 0.5. The watermark estimate is from the LAION-5B metadata, the aesthetics score is estimated using an improved aesthetics estimator). - sd-v1-3.ckpt: Resumed from sd-v1-2.ckpt. 195k steps at resolution 512x512 on \"laion-improved-aesthetics\" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling. - sd-v1-4.ckpt: Resumed from stable-diffusion-v1-2. 225,000 steps at resolution 512x512 on \"laion-aesthetics v2 5+\" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling. - sd-v1-5.ckpt: Resumed from sd-v1-2.ckpt. 595k steps at resolution 512x512 on \"laion-aesthetics v2 5+\" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling.
- sd-v1-5-inpainting.ckpt: Resumed from sd-v1-2.ckpt. 595k steps at resolution 512x512 on \"laion-aesthetics v2 5+\" and 10% dropping of the text-conditioning to improve classifier-free guidance sampling. Then 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+” and 10% dropping of the text-conditioning. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% mask everything. - **Hardware:** 32 x 8 x A100 GPUs - **Optimizer:** AdamW - **Gradient Accumulations**: 2 - **Batch:** 32 x 8 x 2 x 4 = 2048 - **Learning rate:** warmup to 0.0001 for 10,000 steps and then kept constant ## Evaluation Results Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0) and 50 PLMS sampling steps show the relative improvements of the checkpoints: !pareto Evaluated using 50 PLMS steps and 10000 random prompts from the COCO2017 validation set, evaluated at 512x512 resolution. Not optimized for FID scores. ## Inpainting Evaluation To assess the performance of the inpainting model, we used the same evaluation protocol as in our LDM paper. Since the Stable Diffusion Inpainting Model accepts a text input, we simply used a fixed prompt. | Model | FID | LPIPS | |-----------------------------|------|------------------| | Stable Diffusion Inpainting | 1.00 | 0.141 (+- 0.082) | | Latent Diffusion Inpainting | 1.50 | 0.137 (+- 0.080) | | CoModGAN | 1.82 | 0.15 | | LaMa | 2.21 | 0.134 (+- 0.080) | ## Environmental Impact **Stable Diffusion v1** **Estimated Emissions** Based on that information, we estimate the following CO2 emissions using the Machine Learning Impact calculator presented in Lacoste et al. (2019). The hardware, runtime, cloud provider, and compute region were utilized to estimate the carbon impact. - **Hardware Type:** A100 PCIe 40GB - **Hours used:** 150000 - **Cloud Provider:** AWS - **Compute Region:** US-east - **Carbon Emitted (Power consumption x Time x Carbon produced based on location of power grid):** 11250 kg CO2 eq.
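The stated emissions follow the formula in the last bullet; one consistent set of assumptions that reproduces the 11,250 kg figure (the per-GPU power and grid intensity below are assumptions, not values from the card):

```python
gpu_hours = 150_000      # "Hours used", from the card
power_kw = 0.25          # assumption: 250 W board power of an A100 PCIe 40GB
kg_co2_per_kwh = 0.3     # assumption: carbon intensity applied for AWS US-east

emissions_kg = gpu_hours * power_kw * kg_co2_per_kwh
print(emissions_kg)      # 11250.0
```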
## Citation *This model card was written by: Robin Rombach and Patrick Esser and is based on the DALL-E Mini model card.*", + "model_explanation_gemini": "Generates and modifies images via text prompts with inpainting capabilities, using masked inputs to edit specific areas while maintaining photo-realism.\n\n**Model Features:** \n- Text-to-image generation with inpainting (mask-based editing) \n- Trained on laion-aesthetics v2 5+ dataset at 512x512 resolution \n- Modified UNet architecture with 5 extra input channels (4 for masked image, 1 for mask) \n- Supports classifier-free guidance sampling \n- Open" +} \ No newline at end of file diff --git a/model_data_json/black-forest-labs_FLUX.1-Fill-dev.json b/model_data_json/black-forest-labs_FLUX.1-Fill-dev.json new file mode 100644 index 0000000000000000000000000000000000000000..6ab0d5922ec6700f090aa5d37d6945fdce9c1386 --- /dev/null +++ b/model_data_json/black-forest-labs_FLUX.1-Fill-dev.json @@ -0,0 +1,17 @@ +{ + "model_id": "black-forest-labs/FLUX.1-Fill-dev", + "downloads": 324576, + "tags": [ + "diffusers", + "safetensors", + "image-generation", + "flux", + "diffusion-single-file", + "en", + "license:other", + "diffusers:FluxFillPipeline", + "region:us" + ], + "description": "--- language: - en license: other license_name: flux-1-dev-non-commercial-license license_link: LICENSE.md extra_gated_prompt: By clicking \"Agree\", you agree to the FluxDev Non-Commercial License Agreement and acknowledge the Acceptable Use Policy. tags: - image-generation - flux - diffusion-single-file --- !image/jpeg is a 12 billion parameter rectified flow transformer capable of filling areas in existing images based on a text description. For more information, please read our blog post. # Key Features 1. Cutting-edge output quality, second only to our state-of-the-art model . 2. Blends impressive prompt following with completing the structure of your source image. 3. Trained using guidance distillation, making more efficient. 4. Open weights to drive new scientific research, and empower artists to develop innovative workflows. 5. Generated outputs can be used for personal, scientific, and commercial purposes as described in the Non-Commercial License. # Usage We provide a reference implementation of , as well as sampling code, in a dedicated github repository. Developers and creatives looking to build on top of are encouraged to use this as a starting point. ## API Endpoints The FLUX.1 models are also available in our API bfl.ml !image/png ## Diffusers To use with the 🧨 diffusers python library, first install or upgrade diffusers Then you can use to run the model To learn more check out the diffusers documentation --- # Limitations - This model is not intended or able to provide factual information. - As a statistical model this checkpoint might amplify existing societal biases. - The model may fail to generate output that matches the prompts. - Prompt following is heavily influenced by the prompting-style. - There may be slight-color shifts in areas that are not filled in - Filling in complex textures may produce lines at the edges of the filled-area. # Out-of-Scope Use The model and its derivatives may not be used - In any way that violates any applicable national, federal, state, local or international law or regulation. - For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content. 
- To generate or disseminate verifiably false information and/or content with the purpose of harming others. - To generate or disseminate personal identifiable information that can be used to harm an individual. - To harass, abuse, threat", + "model_explanation_gemini": "A 12 billion parameter rectified flow transformer that fills areas in existing images based on text descriptions while maintaining structural coherence. \n\n**Features:** \n1. High-quality image inpainting with text guidance \n2. Strong prompt adherence with source image structure preservation \n3. Guidance distillation for improved efficiency \n4. Open weights for research and artistic applications \n5. Non-commercial license for generated outputs \n\n**Comparison:** \nPositioned as second-best in output quality among the developer's model lineup, with" +} \ No newline at end of file diff --git a/model_data_json/black-forest-labs_FLUX.1-Redux-dev.json b/model_data_json/black-forest-labs_FLUX.1-Redux-dev.json new file mode 100644 index 0000000000000000000000000000000000000000..e1284248e87d7c6317baa11d022441a5fb6d0077 --- /dev/null +++ b/model_data_json/black-forest-labs_FLUX.1-Redux-dev.json @@ -0,0 +1,17 @@ +{ + "model_id": "black-forest-labs/FLUX.1-Redux-dev", + "downloads": 204436, + "tags": [ + "diffusers", + "safetensors", + "image-generation", + "flux", + "diffusion-single-file", + "en", + "license:other", + "diffusers:FluxPriorReduxPipeline", + "region:us" + ], + "description": "--- language: - en license: other license_name: flux-1-dev-non-commercial-license license_link: LICENSE.md extra_gated_prompt: By clicking \"Agree\", you agree to the FluxDev Non-Commercial License Agreement and acknowledge the Acceptable Use Policy. tags: - image-generation - flux - diffusion-single-file --- !image/png FLUX.1 Redux [dev] is an adapter for all FLUX.1 base models for image variation generation. Given an input image, FLUX.1 Redux can reproduce the image with slight variation, allowing to refine a given image. It naturally integrates into more complex workflows unlocking image restyling. Restyling via text is also available through our API by providing an image plus a language prompt. For more information, please read our blog post. # Usage We provide a reference implementation of , as well as sampling code, in a dedicated github repository. ## API Endpoints is available in our API bfl.ml. In addition to the adapter, the API endpoint allows users to modify an image given a textual description. The feature is supported in our latest model FLUX1.1 [pro] Ultra, allowing for combining input images and text prompts to create high-quality 4-megapixel outputs with flexible aspect ratios. !image/png ## Diffusers To use with the 🧨 diffusers python library, first install or upgrade diffusers Then you can use along with to generate images from images. To learn more check out the diffusers documentation --- # Limitations - This model is not intended or able to provide factual information. - As a statistical model this checkpoint might amplify existing societal biases. - The model may fail to generate output that matches the prompts. - Outputs are heavily influenced by the input image. # Out-of-Scope Use The model and its derivatives may not be used - In any way that violates any applicable national, federal, state, local or international law or regulation. 
- For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content. - To generate or disseminate verifiably false information and/or content with the purpose of harming others. - To generate or disseminate personal identifiable information that can be used to harm an individual. - To harass, abuse, threaten, stalk, or bully individuals or groups of individuals. - To create non-consensual nudity or illegal pornographic content. - For fully automated decision making that adversely impacts an individual's legal rights or otherwise creates or modifies a binding, enforceable obligation. - Generating or facilitating large-scale disinformation campaigns. # License This model falls under the Non-Commercial License.", + "model_explanation_gemini": "Generates slightly varied versions of input images for refinement or restyling, with optional text-based modifications via API.\n\n**Features:** \n- Image variation generation from input images \n- Integration into complex workflows for image restyling \n- Text-based image modification via API (requires FLUX1.1 [pro] Ultra) \n- Compatible with 🧨 diffusers library \n\n**Comparison:** \nUnlike base FLUX.1 models, this adapter specializes in image variation rather than direct generation," +} \ No newline at end of file diff --git a/model_data_json/black-forest-labs_FLUX.1-dev.json b/model_data_json/black-forest-labs_FLUX.1-dev.json new file mode 100644 index 0000000000000000000000000000000000000000..29cd5d89e9840ff9e24ce50f0e6c866f89b4e448 --- /dev/null +++ b/model_data_json/black-forest-labs_FLUX.1-dev.json @@ -0,0 +1,18 @@ +{ + "model_id": "black-forest-labs/FLUX.1-dev", + "downloads": 2666194, + "tags": [ + "diffusers", + "safetensors", + "text-to-image", + "image-generation", + "flux", + "en", + "license:other", + "endpoints_compatible", + "diffusers:FluxPipeline", + "region:us" + ], + "description": "--- language: - en license: other license_name: flux-1-dev-non-commercial-license license_link: LICENSE.md extra_gated_prompt: By clicking \"Agree\", you agree to the FluxDev Non-Commercial License Agreement and acknowledge the Acceptable Use Policy. tags: - text-to-image - image-generation - flux --- ![FLUX.1 [dev] Grid](./dev_grid.jpg) is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post. # Key Features 1. Cutting-edge output quality, second only to our state-of-the-art model . 2. Competitive prompt following, matching the performance of closed source alternatives . 3. Trained using guidance distillation, making more efficient. 4. Open weights to drive new scientific research, and empower artists to develop innovative workflows. 5. Generated outputs can be used for personal, scientific, and commercial purposes as described in the Non-Commercial License. # Usage We provide a reference implementation of , as well as sampling code, in a dedicated github repository. Developers and creatives looking to build on top of are encouraged to use this as a starting point. ## API Endpoints The FLUX.1 models are also available via API from the following sources - bfl.ml (currently ) - replicate.com - fal.ai - mystic.ai ## ComfyUI is also available in Comfy UI for local inference with a node-based workflow. 
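Stepping back to the two FLUX.1 adapters introduced above — Fill [dev] (mask-based editing) and Redux [dev] (image variation) — a minimal Diffusers sketch; the pipeline class names come from each file's `diffusers:` tag, while guidance values, step counts, and file names are assumptions:

```python
import torch
from diffusers import FluxFillPipeline, FluxPipeline, FluxPriorReduxPipeline
from diffusers.utils import load_image

# FLUX.1-Fill-dev: fill the masked region of an image from a text prompt
fill = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
)
fill.enable_model_cpu_offload()  # optional VRAM saving
filled = fill(
    prompt="a white paper cup",
    image=load_image("scene.png"),      # hypothetical input
    mask_image=load_image("mask.png"),  # white = area to fill
    guidance_scale=30.0,                # assumption: Fill is commonly run with high guidance
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
filled.save("flux-fill.png")

# FLUX.1-Redux-dev: image variation; the prior produces embeddings that a
# FLUX.1 base pipeline consumes in place of text-encoder output
prior = FluxPriorReduxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Redux-dev", torch_dtype=torch.bfloat16
)
base = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder=None, text_encoder_2=None,  # unused when the prior supplies embeddings
    torch_dtype=torch.bfloat16,
)
base.enable_model_cpu_offload()
prior_out = prior(load_image("reference.png"))
variation = base(
    guidance_scale=2.5,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(0),
    **prior_out,
).images[0]
variation.save("flux-redux.png")
```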
## Diffusers To use with the 🧨 diffusers python library, first install or upgrade diffusers Then you can use to run the model To learn more check out the diffusers documentation --- # Limitations - This model is not intended or able to provide factual information. - As a statistical model this checkpoint might amplify existing societal biases. - The model may fail to generate output that matches the prompts. - Prompt following is heavily influenced by the prompting-style. # Out-of-Scope Use The model and its derivatives may not be used - In any way that violates any applicable national, federal, state, local or international law or regulation. - For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content. - To generate or disseminate verifiably false information and/or content with the purpose of harming others. - To generate or disseminate personal identifiable information that can be used to harm an individual. - To harass, abuse, threaten, stalk, or bully individuals or groups of individuals. - To create non-consensual nudity or illegal pornographic content. - For fully automated decision making that adversely impacts an individual's legal rights or otherwise creates or modifies a binding, enforceable obligation. - Generating or facilitating large-scale disinformation campaigns. # License This model falls under the Non-Commercial License.", + "model_explanation_gemini": "Generates high-quality images from text descriptions using a 12B parameter rectified flow transformer, with competitive prompt adherence and open weights for non-commercial use. \n\n**Features**: \n1. Text-to-image generation \n2. 12B parameter rectified flow transformer \n3. Guidance distillation for efficiency \n4. Open weights for research/artistic workflows \n5. Outputs usable under Non-Commercial License \n\n**Comparison**: \nRanks slightly below the developer's state-of-the-art model" +} \ No newline at end of file diff --git a/model_data_json/black-forest-labs_FLUX.1-schnell.json b/model_data_json/black-forest-labs_FLUX.1-schnell.json new file mode 100644 index 0000000000000000000000000000000000000000..ba9354a8ecd5859a57ececfe1f2dfb44a9f2d33a --- /dev/null +++ b/model_data_json/black-forest-labs_FLUX.1-schnell.json @@ -0,0 +1,18 @@ +{ + "model_id": "black-forest-labs/FLUX.1-schnell", + "downloads": 495889, + "tags": [ + "diffusers", + "safetensors", + "text-to-image", + "image-generation", + "flux", + "en", + "license:apache-2.0", + "endpoints_compatible", + "diffusers:FluxPipeline", + "region:us" + ], + "description": "--- language: - en license: apache-2.0 tags: - text-to-image - image-generation - flux --- ![FLUX.1 [schnell] Grid](./schnell_grid.jpeg) is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post. # Key Features 1. Cutting-edge output quality and competitive prompt following, matching the performance of closed source alternatives. 2. Trained using latent adversarial diffusion distillation, can generate high-quality images in only 1 to 4 steps. 3. Released under the licence, the model can be used for personal, scientific, and commercial purposes. # Usage We provide a reference implementation of , as well as sampling code, in a dedicated github repository. Developers and creatives looking to build on top of are encouraged to use this as a starting point. 
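The Diffusers section of the FLUX.1 [dev] card above (and of the FLUX.1 [schnell] card below) names the 🧨 diffusers integration without showing code; a minimal text-to-image sketch — `FluxPipeline` comes from the files' `diffusers:FluxPipeline` tag, and the prompts and generation parameters are illustrative assumptions:

```python
import torch
from diffusers import FluxPipeline

# FLUX.1 [dev]: guidance-distilled; moderate guidance and ~50 steps
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # optional VRAM saving
image = pipe(
    "a photo of a forest with mist swirling around the tree trunks",
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("flux-dev.png")

# FLUX.1 [schnell]: timestep-distilled; 1-4 steps with guidance disabled
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
image = pipe(
    "a tiny astronaut hatching from an egg on the moon",
    guidance_scale=0.0,
    num_inference_steps=4,
    max_sequence_length=256,  # schnell's text context is capped lower than dev's
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("flux-schnell.png")
```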
## API Endpoints The FLUX.1 models are also available via API from the following sources - bfl.ml (currently ) - replicate.com - fal.ai - mystic.ai ## ComfyUI is also available in Comfy UI for local inference with a node-based workflow. ## Diffusers To use with the 🧨 diffusers python library, first install or upgrade diffusers Then you can use to run the model To learn more check out the diffusers documentation --- # Limitations - This model is not intended or able to provide factual information. - As a statistical model this checkpoint might amplify existing societal biases. - The model may fail to generate output that matches the prompts. - Prompt following is heavily influenced by the prompting-style. # Out-of-Scope Use The model and its derivatives may not be used - In any way that violates any applicable national, federal, state, local or international law or regulation. - For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content. - To generate or disseminate verifiably false information and/or content with the purpose of harming others. - To generate or disseminate personal identifiable information that can be used to harm an individual. - To harass, abuse, threaten, stalk, or bully individuals or groups of individuals. - To create non-consensual nudity or illegal pornographic content. - For fully automated decision making that adversely impacts an individual's legal rights or otherwise creates or modifies a binding, enforceable obligation. - Generating or facilitating large-scale disinformation campaigns.", + "model_explanation_gemini": "Generates high-quality images from text descriptions in 1–4 steps using a 12B parameter rectified flow transformer, with competitive performance against closed-source alternatives. \n\n**Features:** \n1. Text-to-image generation \n2. High output quality and prompt adherence \n3. Fast inference (1–4 steps via latent adversarial diffusion distillation) \n4. Open weights (Apache 2.0 license for personal/commercial use) \n5. Integration options: Diffusers, ComfyUI," +} \ No newline at end of file diff --git a/model_data_json/briaai_RMBG-2.0.json b/model_data_json/briaai_RMBG-2.0.json new file mode 100644 index 0000000000000000000000000000000000000000..5759663b5ca880852b92d43facc2c7b5bc9fa26e --- /dev/null +++ b/model_data_json/briaai_RMBG-2.0.json @@ -0,0 +1,23 @@ +{ + "model_id": "briaai/RMBG-2.0", + "downloads": 356509, + "tags": [ + "transformers", + "pytorch", + "onnx", + "safetensors", + "image-segmentation", + "remove background", + "background", + "background-removal", + "Pytorch", + "vision", + "legal liability", + "transformers.js", + "custom_code", + "license:other", + "region:us" + ], + "description": "--- license: other license_name: bria-rmbg-2.0 license_link: pipeline_tag: image-segmentation tags: - remove background - background - background-removal - Pytorch - vision - legal liability - transformers - transformers.js extra_gated_description: >- Bria AI Model weights are open source for non commercial use only, per the provided license. 
extra_gated_heading: Fill in this form to immediately access the model for non commercial use extra_gated_fields: Name: text Email: text Company/Org name: text Company Website URL: text Discord user: text I agree to BRIA’s Privacy policy, Terms & conditions, and acknowledge Non commercial use to be Personal use / Academy / Non profit (direct or indirect): checkbox --- # BRIA Background Removal v2.0 Model Card RMBG v2.0 is our new state-of-the-art background removal model, which significantly improves on RMBG v1.4. The model is designed to effectively separate foreground from background in a range of categories and image types. This model has been trained on a carefully selected dataset, which includes: general stock images, e-commerce, gaming, and advertising content, making it suitable for commercial use cases powering enterprise content creation at scale. Its accuracy, efficiency, and versatility currently rival leading source-available models. It is ideal where content safety, legally licensed datasets, and bias mitigation are paramount. Developed by BRIA AI, RMBG v2.0 is available as a source-available model for non-commercial use. ### Get Access Bria RMBG2.0 is available everywhere you build, either as source-code and weights, ComfyUI nodes or API endpoints. - **Purchase:** for a commercial license simply click Here. - **API Endpoint**: Bria.ai, fal.ai - **ComfyUI**: Use it in workflows For more information, please visit our website. Join our Discord community for more information, tutorials, tools, and to connect with other users! CLICK HERE FOR A DEMO !examples ## Model Details ### Model Description - **Developed by:** BRIA AI - **Model type:** Background Removal - **License:** Creative Commons Attribution–Non-Commercial (CC BY-NC 4.0) - The model is released under a CC BY-NC 4.0 license for non-commercial use. - Commercial use is subject to a commercial agreement with BRIA. Available here **Purchase:** to purchase a commercial license simply click Here. - **Model Description:** BRIA RMBG-2.0 is a dichotomous image segmentation model trained exclusively on a professional-grade dataset. - **BRIA:** Resources for more information: BRIA AI ## Training data Bria-RMBG model was trained with over 15,000 high-quality, high-resolution, manually labeled (pixel-wise accuracy), fully licensed images. Our benchmark included balanced gender, balanced ethnicity, and people with different types of disabilities. For clarity, we provide our data distribution according to different categories, demonstrating our model’s versatility.
### Distribution of images: | Category | Distribution | | -----------------------------------| -----------------------------------:| | Objects only | 45.11% | | People with objects/animals | 25.24% | | People only | 17.35% | | people/objects/animals with text | 8.52% | | Text only | 2.52% | | Animals only | 1.89% | | Category | Distribution | | -----------------------------------| -----------------------------------------:| | Photorealistic | 87.70% | | Non-Photorealistic | 12.30% | | Category | Distribution | | -----------------------------------| -----------------------------------:| | Non Solid Background | 52.05% | | Solid Background | 47.95% | | Category | Distribution | | -----------------------------------| -----------------------------------:| | Single main foreground object | 51.42% | | Multiple objects in the foreground | 48.58% | ## Qualitative Evaluation Open source models comparison !diagram !examples ### Architecture RMBG-2.0 is developed on the BiRefNet architecture enhanced with our proprietary dataset and training scheme. This training data significantly improves the model’s accuracy and effectiveness for the background-removal task.
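The `#### Requirements` and `### Usage` sections that follow are empty in this snapshot; a minimal loading sketch — `trust_remote_code` follows the file's `custom_code` tag, while the input size and normalization constants are assumptions based on the BiRefNet family:

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageSegmentation

model = AutoModelForImageSegmentation.from_pretrained(
    "briaai/RMBG-2.0", trust_remote_code=True
)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((1024, 1024)),   # assumption: BiRefNet-style input size
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("product.jpg").convert("RGB")  # hypothetical input
with torch.no_grad():
    preds = model(preprocess(image).unsqueeze(0))[-1].sigmoid().cpu()

# Use the predicted foreground matte as an alpha channel
mask = transforms.ToPILImage()(preds[0].squeeze()).resize(image.size)
image.putalpha(mask)
image.save("product_no_bg.png")
```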
If you use this model in your research, please cite: #### Requirements ### Usage ", + "model_explanation_gemini": "\"BRIA AI's RMBG-2.0 is a background removal model that separates foreground from background in various image types, trained on a licensed dataset for accuracy and versatility, available for non-commercial use.\"\n\nModel Features: \n- State-of-the-art background removal (improved over v1.4) \n- Trained on 15,000+ licensed images (e-commerce, gaming, advertising, etc.) \n- Balanced dataset (gender, ethnicity, disabilities) \n- Handles" +} \ No newline at end of file diff --git a/model_data_json/ckiplab_bert-base-chinese.json b/model_data_json/ckiplab_bert-base-chinese.json new file mode 100644 index 0000000000000000000000000000000000000000..76ac3ab22d7069d37005f6ac93355ac1c544a98a --- /dev/null +++ b/model_data_json/ckiplab_bert-base-chinese.json @@ -0,0 +1,19 @@ +{ + "model_id": "ckiplab/bert-base-chinese", + "downloads": 80090, + "tags": [ + "transformers", + "pytorch", + "jax", + "bert", + "fill-mask", + "lm-head", + "zh", + "license:gpl-3.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - zh thumbnail: tags: - pytorch - lm-head - bert - zh license: gpl-3.0 --- # CKIP BERT Base Chinese This project provides traditional Chinese transformers models (including ALBERT, BERT, GPT2) and NLP tools (including word segmentation, part-of-speech tagging, named entity recognition). 這個專案提供了繁體中文的 transformers 模型(包含 ALBERT、BERT、GPT2)及自然語言處理工具(包含斷詞、詞性標記、實體辨識)。 ## Homepage - ## Contributers - Mu Yang at CKIP (Author & Maintainer) ## Usage Please use BertTokenizerFast as tokenizer instead of AutoTokenizer. 請使用 BertTokenizerFast 而非 AutoTokenizer。 For full usage and more information, please refer to 有關完整使用方法及其他資訊,請參見 。", + "model_explanation_gemini": "\"Provides traditional Chinese BERT-based models and NLP tools for tasks like word segmentation, POS tagging, and named entity recognition.\"\n\nModel Features: \n- BERT-based architecture \n- Traditional Chinese language support \n- Includes NLP tools (word segmentation, POS tagging, NER) \n- Requires BertTokenizerFast for tokenization \n\nComparison: \nThe description mentions this model is part of a broader project offering traditional Chinese transformers (ALBERT, BERT, GPT2), implying BERT is optimized for tasks" +} \ No newline at end of file diff --git a/model_data_json/classla_multilingual-IPTC-news-topic-classifier.json b/model_data_json/classla_multilingual-IPTC-news-topic-classifier.json new file mode 100644 index 0000000000000000000000000000000000000000..e507fa3ebc3e860a266a7232f5e3b87842dec572 --- /dev/null +++ b/model_data_json/classla_multilingual-IPTC-news-topic-classifier.json @@ -0,0 +1,116 @@ +{ + "model_id": "classla/multilingual-IPTC-news-topic-classifier", + "downloads": 73204, + "tags": [ + "safetensors", + "xlm-roberta", + "text-classification", + "IPTC", + "news", + "news topic", + "IPTC topic", + "IPTC NewsCode", + "topic categorization", + "multilingual", + "af", + "am", + "ar", + "as", + "az", + "be", + "bg", + "bn", + "br", + "bs", + "ca", + "cs", + "cy", + "da", + "de", + "el", + "en", + "eo", + "es", + "et", + "eu", + "fa", + "fi", + "fr", + "fy", + "ga", + "gd", + "gl", + "gu", + "ha", + "he", + "hi", + "hr", + "hu", + "hy", + "id", + "is", + "it", + "ja", + "jv", + "ka", + "kk", + "km", + "kn", + "ko", + "ku", + "ky", + "la", + "lo", + "lt", + "lv", + "mg", + "mk", + "ml", + "mn", + "mr", + "ms", + "my", + "ne", + "nl", + "no", + "om", + "or", + "pa", + 
"pl", + "ps", + "pt", + "ro", + "ru", + "sa", + "sd", + "si", + "sk", + "sl", + "so", + "sq", + "sr", + "su", + "sv", + "sw", + "ta", + "te", + "th", + "tl", + "tr", + "ug", + "uk", + "ur", + "uz", + "vi", + "xh", + "yi", + "zh", + "base_model:FacebookAI/xlm-roberta-large", + "base_model:finetune:FacebookAI/xlm-roberta-large", + "doi:10.57967/hf/4709", + "license:cc-by-sa-4.0", + "region:us" + ], + "description": "--- license: cc-by-sa-4.0 language: - multilingual - af - am - ar - as - az - be - bg - bn - br - bs - ca - cs - cy - da - de - el - en - eo - es - et - eu - fa - fi - fr - fy - ga - gd - gl - gu - ha - he - hi - hr - hu - hy - id - is - it - ja - jv - ka - kk - km - kn - ko - ku - ky - la - lo - lt - lv - mg - mk - ml - mn - mr - ms - my - ne - nl - 'no' - om - or - pa - pl - ps - pt - ro - ru - sa - sd - si - sk - sl - so - sq - sr - su - sv - sw - ta - te - th - tl - tr - ug - uk - ur - uz - vi - xh - yi - zh tags: - text-classification - IPTC - news - news topic - IPTC topic - IPTC NewsCode - topic categorization widget: - text: >- Moment dog sparks house fire after chewing power bank An indoor monitoring camera shows the moment a dog unintentionally caused a house fire after chewing on a portable lithium-ion battery power bank. example_title: English - text: >- Ministarstvo unutarnjih poslova posljednjih mjeseci radilo je na izradi Nacrta prijedloga Zakona o strancima. Naime, važeći Zakon o strancima usklađen je s 22 direktive, preporuke, odluke i rezolucije, te s obzirom da je riječ o velikom broju odredaba potrebno ih je jasnije propisati, a sve u cilju poboljšanja transparentnosti i preglednosti. example_title: Croatian - text: >- V okviru letošnjega praznovanja spominskega dneva občine Trebnje Baragov dan je v soboto, 28. junija 2014, na obvezni god Marijinega Srca v župnijski cerkvi v Trebnjem daroval mašo za domovino apostolski nuncij v Republiki Sloveniji Njegova ekselenca Nadškof msgr. Juliusz Janusz. example_title: Slovenian base_model: - FacebookAI/xlm-roberta-large --- # Multilingual IPTC Media Topic Classifier News topic classification model based on []( and fine-tuned on a news corpus in 4 languages (Croatian, Slovenian, Catalan and Greek), annotated with the top-level IPTC Media Topic NewsCodes labels. The development and evaluation of the model is described in the paper LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification (Kuzman and Ljubešić, 2025). The model can be used for classification into topic labels from the IPTC NewsCodes schema and can be applied to any news text in a language, supported by the . Based on a manually-annotated test set (in Croatian, Slovenian, Catalan and Greek), the model achieves macro-F1 score of 0.746, micro-F1 score of 0.734, and accuracy of 0.734, and outperforms the GPT-4o model (version ) used in a zero-shot setting. If we use only labels that are predicted with a confidence score equal or higher than 0.90, the model achieves micro-F1 and macro-F1 of 0.80. ## Intended use and limitations For reliable results, the classifier should be applied to documents of sufficient length (the rule of thumb is at least 75 words). Use example: ## IPTC Media Topic categories The classifier uses the top-level of the IPTC Media Topic NewsCodes schema, consisting of 17 labels. 
### List of labels ### Description of labels The descriptions of the labels are based on the descriptions provided in the IPTC Media Topic NewsCodes schema and enriched with information which specific subtopics belong to the top-level topics, based on the IPTC Media Topic label hierarchy. | Label | Description | |:------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | disaster, accident and emergency incident | Man-made or natural events resulting in injuries, death or damage, e.g., explosions, transport accidents, famine, drowning, natural disasters, emergency planning and response. | | human interest | News about life and behavior of royalty and celebrities, news about obtaining awards, ceremonies (graduation, wedding, funeral, celebration of launching something), birthdays and anniversaries, and news about silly or stupid human errors. | | politics | News about local, regional, national and international exercise of power, including news about election, fundamental rights, government, non-governmental organisations, political crises, non-violent international relations, public employees, government policies. | | education | All aspects of furthering knowledge, formally or informally, including news about schools, curricula, grading, remote learning, teachers and students. | | crime, law and justice | News about committed crime and illegal activities, the system of courts, law and law enforcement (e.g., judges, lawyers, trials, punishments of offenders). | | economy, business and finance | News about companies, products and services, any kind of industries, national economy, international trading, banks, (crypto)currency, business and trade societies, economic trends and indicators (inflation, employment statistics, GDP, mortgages, ...), international economic institutions, utilities (electricity, heating, waste management, water supply). | | conflict, war and peace | News about terrorism, wars, wars victims, cyber warfare, civil unrest (demonstrations, riots, rebellions), peace talks and other peace activities. | | arts, culture, entertainment and media | News about cinema, dance, fashion, hairstyle, jewellery, festivals, literature, music, theatre, TV shows, painting, photography, woodworking, art exhibitions, libraries and museums, language, cultural heritage, news media, radio and television, social media, influencers, and disinformation. | | labour | News about employment, employment legislation, employees and employers, commuting, parental leave, volunteering, wages, social security, labour market, retirement, unemployment, unions. | | weather | News about weather forecasts, weather phenomena and weather warning. | | religion | News about religions, cults, religious conflicts, relations between religion and government, churches, religious holidays and festivals, religious leaders and rituals, and religious texts. 
| | society | News about social interactions (e.g., networking), demographic analyses, population census, discrimination, efforts for inclusion and equity, emigration and immigration, communities of people and minorities (LGBTQ, older people, children, indigenous people, etc.), homelessness, poverty, societal problems (addictions, bullying), ethical issues (suicide, euthanasia, sexual behavior) and social services and charity, relationships (dating, divorce, marriage), family (family planning, adoption, abortion, contraception, pregnancy, parenting). | | health | News about diseases, injuries, mental health problems, health treatments, diets, vaccines, drugs, government health care, hospitals, medical staff, health insurance. | | environment | News about climate change, energy saving, sustainability, pollution, population growth, natural resources, forests, mountains, bodies of water, ecosystem, animals, flowers and plants. | | lifestyle and leisure | News about hobbies, clubs and societies, games, lottery, enthusiasm about food or drinks, car/motorcycle lovers, public holidays, leisure venues (amusement parks, cafes, bars, restaurants, etc.), exercise and fitness, outdoor recreational activities (e.g., fishing, hunting), travel and tourism, mental well-being, parties, maintaining and decorating house and garden. | | science and technology | News about natural sciences and social sciences, mathematics, technology and engineering, scientific institutions, scientific research, scientific publications and innovation. | | sport | News about sports that can be executed in competitions, e.g., basketball, football, swimming, athletics, chess, dog racing, diving, golf, gymnastics, martial arts, climbing, etc.; sport achievements, sport events, sport organisation, sport venues (stadiums, gymnasiums, ...), referees, coaches, sport clubs, drug use in sport. | ## Training data The model was fine-tuned on the training split of the EMMediaTopic 1.0 dataset consisting of 15,000 news in four languages (Croatian, Slovenian, Catalan and Greek). The news texts were extracted from the MaCoCu-Genre web corpora based on the \"News\" genre label, predicted with the X-GENRE classifier. The training dataset was automatically annotated with the IPTC Media Topic labels by the GPT-4o model (yielding 0.72 micro-F1 and 0.73 macro-F1 on the test dataset). The code for the development and evaluation of the model is available on this GitHub repository. Label distribution in the training dataset: | labels | count | proportion | |:------------------------------------------|--------:|-------------:| | sport | 2300 | 0.153333 | | arts, culture, entertainment and media | 2117 | 0.141133 | | politics | 2018 | 0.134533 | | economy, business and finance | 1670 | 0.111333 | | human interest | 1152 | 0.0768 | | education | 990 | 0.066 | | crime, law and justice | 884 | 0.0589333 | | health | 675 | 0.045 | | disaster, accident and emergency incident | 610 | 0.0406667 | | society | 481 | 0.0320667 | | environment | 472 | 0.0314667 | | lifestyle and leisure | 346 | 0.0230667 | | science and technology | 340 | 0.0226667 | | conflict, war and peace | 311 | 0.0207333 | | labour | 288 | 0.0192 | | religion | 258 | 0.0172 | | weather | 88 | 0.00586667 | ## Performance The model was evaluated on a manually-annotated test set in four languages (Croatian, Slovenian, Catalan and Greek), consisting of 1,129 instances. The test set contains similar amounts of texts from the four languages and is more or less balanced across labels. 
The model was shown to achieve micro-F1 score of 0.734, and macro-F1 score of 0.746. The results for the entire test set and per language: | | Micro-F1 | Macro-F1 | Accuracy | No. of instances | |:---|-----------:|-----------:|-----------:|-----------:| | All (combined) | 0.734278 | 0.745864 | 0.734278 | 1129 | | Croatian | 0.728522 | 0.733725 | 0.728522 | 291 | | Catalan | 0.715356 | 0.722304 | 0.715356 | 267 | | Slovenian | 0.758865 | 0.764784 | 0.758865 | 282 | | Greek | 0.733564 | 0.747129 | 0.733564 | 289 | Performance per label: | | precision | recall | f1-score | support | |:------------------------------------------|------------:|---------:|-----------:|------------:| | arts, culture, entertainment and media | 0.602151 | 0.875 | 0.713376 | 64 | | conflict, war and peace | 0.611111 | 0.916667 | 0.733333 | 36 | | crime, law and justice | 0.861538 | 0.811594 | 0.835821 | 69 | | disaster, accident and emergency incident | 0.691176 | 0.886792 | 0.77686 | 53 | | economy, business and finance | 0.779221 | 0.508475 | 0.615385 | 118 | | education | 0.847458 | 0.735294 | 0.787402 | 68 | | environment | 0.589041 | 0.754386 | 0.661538 | 57 | | health | 0.79661 | 0.79661 | 0.79661 | 59 | | human interest | 0.552239 | 0.672727 | 0.606557 | 55 | | labour | 0.855072 | 0.830986 | 0.842857 | 71 | | lifestyle and leisure | 0.773585 | 0.476744 | 0.589928 | 86 | | politics | 0.568182 | 0.735294 | 0.641026 | 68 | | religion | 0.842105 | 0.941176 | 0.888889 | 51 | | science and technology | 0.637681 | 0.8 | 0.709677 | 55 | | society | 0.918033 | 0.5 | 0.647399 | 112 | | sport | 0.824324 | 0.968254 | 0.890511 | 63 | | weather | 0.953488 | 0.931818 | 0.942529 | 44 | For downstream tasks, **we advise you to use only labels that were predicted with confidence score higher or equal to 0.90 which further improves the performance**. When we remove instances predicted with lower confidence (229 instances - 20%), the model yields micro-F1 of 0.798 and macro-F1 of 0.80. | | Micro-F1 | Macro-F1 | Accuracy | |:---|-----------:|-----------:|-----------:| | All (combined) | 0.797777 | 0.802403 | 0.797777 | | Croatian | 0.773504 | 0.772084 | 0.773504 | | Catalan | 0.811224 | 0.806885 | 0.811224 | | Slovenian | 0.805085 | 0.804491 | 0.805085 | | Greek | 0.803419 | 0.809598 | 0.803419 | ## Fine-tuning hyperparameters Fine-tuning was performed with . Beforehand, a brief hyperparameter optimization was performed and the presumed optimal hyperparameters are: ## Citation If you use the model, please cite this paper: ## Funding This work was supported by the Slovenian Research and Innovation Agency research project Embeddings-based techniques for Media Monitoring Applications (L2-50070, co-funded by the Kliping d.o.o. 
agency).", + "model_explanation_gemini": "Classifies news texts into IPTC Media Topic categories across multiple languages, fine-tuned on Croatian, Slovenian, Catalan, and Greek news corpora.\n\nFeatures: \n- Multilingual support (100+ languages, with evaluation in 4) \n- IPTC NewsCodes topic classification (17 top-level labels) \n- Fine-tuned on xlm-roberta-large \n- Requires minimum 75-word input for reliability \n- Achieves 0.746 macro-F1 on test set \n-" +} \ No newline at end of file diff --git a/model_data_json/cointegrated_rubert-tiny-sentiment-balanced.json b/model_data_json/cointegrated_rubert-tiny-sentiment-balanced.json new file mode 100644 index 0000000000000000000000000000000000000000..e4bdf3581859fb3094f3c08b26bdf1cde6ee9ab6 --- /dev/null +++ b/model_data_json/cointegrated_rubert-tiny-sentiment-balanced.json @@ -0,0 +1,21 @@ +{ + "model_id": "cointegrated/rubert-tiny-sentiment-balanced", + "downloads": 75361, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "bert", + "text-classification", + "russian", + "classification", + "sentiment", + "multiclass", + "ru", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: [\"ru\"] tags: - russian - classification - sentiment - multiclass widget: - text: \"Какая гадость эта ваша заливная рыба!\" --- This is the cointegrated/rubert-tiny model fine-tuned for classification of sentiment for short Russian texts. The problem is formulated as multiclass classification: vs vs . ## Usage The function below estimates the sentiment of the given text: ## Training We trained the model on the datasets collected by Smetanin. We have converted all training data into a 3-class format and have up- and downsampled the training data to balance both the sources and the classes. The training code is available as a Colab notebook. 
The metrics on the balanced test set are the following: | Source | Macro F1 | | ----------- | ----------- | | SentiRuEval2016_banks | 0.83 | | SentiRuEval2016_tele | 0.74 | | kaggle_news | 0.66 | | linis | 0.50 | | mokoron | 0.98 | | rureviews | 0.72 | | rusentiment | 0.67 |", + "model_explanation_gemini": "Classifies short Russian texts into positive, neutral, or negative sentiment categories using a fine-tuned BERT model.\n\nFeatures: \n- Language: Russian \n- Task: Multiclass sentiment classification (positive/neutral/negative) \n- Model: Fine-tuned rubert-tiny \n- Training: Balanced datasets from multiple sources \n- Performance: Varied F1 scores (0.50–0.98) across test sets \n\nComparison: \nOutperforms some datasets (e.g., m" +} \ No newline at end of file diff --git a/model_data_json/cross-encoder_ms-marco-TinyBERT-L2.json b/model_data_json/cross-encoder_ms-marco-TinyBERT-L2.json new file mode 100644 index 0000000000000000000000000000000000000000..e3be59e5e22241dd48c57917ad4a34a3b278cbdf --- /dev/null +++ b/model_data_json/cross-encoder_ms-marco-TinyBERT-L2.json @@ -0,0 +1,24 @@ +{ + "model_id": "cross-encoder/ms-marco-TinyBERT-L2", + "downloads": 69190, + "tags": [ + "sentence-transformers", + "pytorch", + "jax", + "onnx", + "safetensors", + "openvino", + "bert", + "text-classification", + "transformers", + "text-ranking", + "en", + "dataset:sentence-transformers/msmarco", + "base_model:nreimers/BERT-Tiny_L-2_H-128_A-2", + "base_model:quantized:nreimers/BERT-Tiny_L-2_H-128_A-2", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 datasets: - sentence-transformers/msmarco language: - en base_model: - nreimers/BERT-Tiny_L-2_H-128_A-2 pipeline_tag: text-ranking library_name: sentence-transformers tags: - transformers new_version: cross-encoder/ms-marco-TinyBERT-L2-v2 --- # Cross-Encoder for MS Marco This model was trained on the MS Marco Passage Ranking task. The model can be used for Information Retrieval: Given a query, encode the query with all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in decreasing order. See SBERT.net Retrieve & Re-rank for more details. The training code is available here: SBERT.net Training MS Marco ## Usage with SentenceTransformers The usage is easy when you have SentenceTransformers installed. Then you can use the pre-trained models like this: ## Usage with Transformers ## Performance In the following table, we provide various pre-trained Cross-Encoders together with their performance on the TREC Deep Learning 2019 and the MS Marco Passage Reranking dataset.
| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec | | ------------- |:-------------| -----| --- | | **Version 2 models** | | | | cross-encoder/ms-marco-TinyBERT-L2-v2 | 69.84 | 32.56 | 9000 | cross-encoder/ms-marco-MiniLM-L2-v2 | 71.01 | 34.85 | 4100 | cross-encoder/ms-marco-MiniLM-L4-v2 | 73.04 | 37.70 | 2500 | cross-encoder/ms-marco-MiniLM-L6-v2 | 74.30 | 39.01 | 1800 | cross-encoder/ms-marco-MiniLM-L12-v2 | 74.31 | 39.02 | 960 | **Version 1 models** | | | | cross-encoder/ms-marco-TinyBERT-L2 | 67.43 | 30.15 | 9000 | cross-encoder/ms-marco-TinyBERT-L4 | 68.09 | 34.50 | 2900 | cross-encoder/ms-marco-TinyBERT-L6 | 69.57 | 36.13 | 680 | cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 | **Other models** | | | | nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 | nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 | nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100 | Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340 | amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330 | sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720 Note: Runtime was computed on a V100 GPU.", + "model_explanation_gemini": "A cross-encoder model trained for MS Marco passage ranking, used to rerank retrieved passages by relevance to a query for information retrieval. \n\n**Features:** \n- Trained on MS Marco Passage Ranking task \n- Designed for query-passage reranking \n- Compatible with SentenceTransformers and Transformers libraries \n- High-speed processing (9,000 docs/sec) \n\n**Comparison:** \nOutperforms older version (v1) and other TinyBERT models in NDCG@10 and" +} \ No newline at end of file diff --git a/model_data_json/cross-encoder_nli-deberta-v3-base.json b/model_data_json/cross-encoder_nli-deberta-v3-base.json new file mode 100644 index 0000000000000000000000000000000000000000..f0d027d463a1c31e88c7bc5f83b4cdc6bb1d9815 --- /dev/null +++ b/model_data_json/cross-encoder_nli-deberta-v3-base.json @@ -0,0 +1,23 @@ +{ + "model_id": "cross-encoder/nli-deberta-v3-base", + "downloads": 78924, + "tags": [ + "sentence-transformers", + "pytorch", + "onnx", + "safetensors", + "deberta-v2", + "text-classification", + "transformers", + "zero-shot-classification", + "en", + "dataset:nyu-mll/multi_nli", + "dataset:stanfordnlp/snli", + "base_model:microsoft/deberta-v3-base", + "base_model:quantized:microsoft/deberta-v3-base", + "license:apache-2.0", + "region:us" + ], + "description": "--- language: en pipeline_tag: zero-shot-classification tags: - transformers datasets: - nyu-mll/multi_nli - stanfordnlp/snli metrics: - accuracy license: apache-2.0 base_model: - microsoft/deberta-v3-base library_name: sentence-transformers --- # Cross-Encoder for Natural Language Inference This model was trained using SentenceTransformers Cross-Encoder class. This model is based on microsoft/deberta-v3-base. ## Training Data The model was trained on the SNLI and MultiNLI datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral. ## Performance - Accuracy on SNLI-test dataset: 92.38 - Accuracy on MNLI mismatched set: 90.04 For further evaluation results, see SBERT.net - Pretrained Cross-Encoder.
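As a hedged sketch of the zero-shot usage described in the Usage section that follows (candidate labels are illustrative):

```python
from transformers import pipeline

# Minimal sketch: zero-shot classification via the NLI cross-encoder.
classifier = pipeline("zero-shot-classification", model="cross-encoder/nli-deberta-v3-base")
result = classifier(
    "Apple just announced the newest iPhone X",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```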
## Usage Pre-trained models can be used like this: ## Usage with Transformers AutoModel You can also use the model directly with the Transformers library (without the SentenceTransformers library): ## Zero-Shot Classification This model can also be used for zero-shot-classification:", + "model_explanation_gemini": "Performs natural language inference (NLI) by scoring sentence pairs for contradiction, entailment, or neutral relationships, and supports zero-shot classification.\n\nFeatures: \n- Trained on SNLI and MultiNLI datasets \n- Outputs scores for three labels: contradiction, entailment, neutral \n- Achieves 92.38% accuracy on SNLI-test, 90.04% on MNLI mismatched set \n- Compatible with Transformers library and SentenceTransformers \n- Supports zero" +} \ No newline at end of file diff --git a/model_data_json/dangvantuan_vietnamese-document-embedding.json b/model_data_json/dangvantuan_vietnamese-document-embedding.json new file mode 100644 index 0000000000000000000000000000000000000000..2b966414e3806e0ad1431f10cff9625a52d21b93 --- /dev/null +++ b/model_data_json/dangvantuan_vietnamese-document-embedding.json @@ -0,0 +1,27 @@ +{ + "model_id": "dangvantuan/vietnamese-document-embedding", + "downloads": 75605, + "tags": [ + "sentence-transformers", + "safetensors", + "Vietnamese", + "feature-extraction", + "sentence-similarity", + "transformers", + "phobert", + "vietnamese", + "sentence-embedding", + "custom_code", + "vi", + "arxiv:1908.10084", + "arxiv:2407.19669", + "arxiv:2308.03281", + "arxiv:2402.14776", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: sentence-transformers pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers - phobert - vietnamese - sentence-embedding license: apache-2.0 language: - vi metrics: - pearsonr - spearmanr --- ## Model Description: **vietnamese-document-embedding** is the Document Embedding Model for Vietnamese language with context length up to 8096 tokens. This model is a specialized long text-embedding trained specifically for the Vietnamese language, which is built upon gte-multilingual and trained using the Multi-Negative Ranking Loss, Matryoshka2dLoss and SimilarityLoss. ## Full Model Architecture ## Training and Fine-tuning process The model underwent a rigorous four-stage training and fine-tuning process, each tailored to enhance its ability to generate precise and contextually relevant sentence embeddings for the Vietnamese language. Below is an outline of these stages: #### Stage 1: Training NLI on dataset XNLI: - Dataset: XNLI-vn - Method: Training using Multi-Negative Ranking Loss and Matryoshka2dLoss. This stage focused on improving the model's ability to discern and rank nuanced differences in sentence semantics. ### Stage 2: Fine-tuning for Semantic Textual Similarity on STS Benchmark - Dataset: STSB-vn - Method: Fine-tuning specifically for the semantic textual similarity benchmark using Siamese BERT-Networks configured with the 'sentence-transformers' library. This stage honed the model's precision in capturing semantic similarity across various types of Vietnamese texts. ## Usage: Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Evaluation The model can be evaluated as follows on the Vietnamese data of stsb.
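The usage snippet above was stripped from this dump; a minimal embedding sketch with sentence-transformers. Passing `trust_remote_code=True` is an assumption based on the repository's `custom_code` tag, and the input sentences are illustrative:

```python
from sentence_transformers import SentenceTransformer

# Minimal sketch: encode Vietnamese documents into dense vectors.
# trust_remote_code=True is assumed because the repo ships custom code.
model = SentenceTransformer("dangvantuan/vietnamese-document-embedding", trust_remote_code=True)
embeddings = model.encode([
    "Hà Nội là thủ đô của Việt Nam.",
    "Mô hình này sinh embedding cho văn bản tiếng Việt dài.",
])
print(embeddings.shape)  # (2, hidden_dim)
```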
### Metrics for all datasets of Semantic Textual Similarity on STS Benchmark **Spearman score** | Model | [STSB] | [STS12]| [STS13] | [STS14] | [STS15] | [STS16] | [SICK] | Mean | |-----------------------------------------------------------|---------|----------|----------|----------|----------|----------|---------|--------| | dangvantuan/vietnamese-embedding |84.84| 79.04| 85.30| 81.38| 87.06| 79.95| 79.58| 82.45| | dangvantuan/vietnamese-embedding-LongContext |85.25| 75.77| 83.82| 81.69| 88.48| 81.5| 78.2| 82.10| ## Citation @article{reimers2019sentence, title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks}, author={Nils Reimers, Iryna Gurevych}, journal={arXiv preprint arXiv:1908.10084}, year={2019} } @article{zhang2024mgte, title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval}, author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others}, journal={arXiv preprint arXiv:2407.19669}, year={2024} } @article{li2023towards, title={Towards general text embeddings with multi-stage contrastive learning}, author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan}, journal={arXiv preprint arXiv:2308.03281}, year={2023} } @article{li20242d, title={2d matryoshka sentence embeddings}, author={Li, Xianming and Li, Zongxi and Li, Jing and Xie, Haoran and Li, Qing}, journal={arXiv preprint arXiv:2402.14776}, year={2024} }", + "model_explanation_gemini": "Generates Vietnamese document embeddings with long-context support (up to 8096 tokens) for semantic similarity tasks, optimized through multi-stage training on NLI and STS benchmarks.\n\n**Features:** \n- Language: Vietnamese \n- Context length: 8096 tokens \n- Architecture: Based on gte-multilingual \n- Training: Multi-Negative Ranking Loss, Matryoshka2dLoss, SimilarityLoss \n- Stages: NLI training (XNLI-vn), STS fine" +} \ No newline at end of file diff --git a/model_data_json/datalab-to_surya_layout0.json b/model_data_json/datalab-to_surya_layout0.json new file mode 100644 index 0000000000000000000000000000000000000000..969969cf1d6e2b85d9d929e0e8989ac9c068c501 --- /dev/null +++ b/model_data_json/datalab-to_surya_layout0.json @@ -0,0 +1,14 @@ +{ + "model_id": "datalab-to/surya_layout0", + "downloads": 81100, + "tags": [ + "transformers", + "safetensors", + "vision-encoder-decoder", + "license:cc-by-nc-sa-4.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: cc-by-nc-sa-4.0 --- Layout model for surya", + "model_explanation_gemini": "Identifies and analyzes document layouts for the 'surya' system. \n\nFeatures: \n- Layout analysis for documents \n\nComparison: Not applicable (no comparative details provided in the description)."
+} \ No newline at end of file diff --git a/model_data_json/dbmdz_bert-base-italian-xxl-cased.json b/model_data_json/dbmdz_bert-base-italian-xxl-cased.json new file mode 100644 index 0000000000000000000000000000000000000000..49c3e1fbc93b6212d624cc163b0031bbf1ad8ab7 --- /dev/null +++ b/model_data_json/dbmdz_bert-base-italian-xxl-cased.json @@ -0,0 +1,21 @@ +{ + "model_id": "dbmdz/bert-base-italian-xxl-cased", + "downloads": 70821, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "safetensors", + "bert", + "fill-mask", + "it", + "dataset:wikipedia", + "license:mit", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: it license: mit datasets: - wikipedia --- # 🤗 + 📚 dbmdz BERT and ELECTRA models In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State Library open sources Italian BERT and ELECTRA models 🎉 # Italian BERT The source data for the Italian BERT model consists of a recent Wikipedia dump and various texts from the OPUS corpora collection. The final training corpus has a size of 13GB and 2,050,057,573 tokens. For sentence splitting, we use NLTK (faster compared to spacy). Our cased and uncased models are trained with an initial sequence length of 512 subwords for ~2-3M steps. For the XXL Italian models, we use the same training data from OPUS and extend it with data from the Italian part of the OSCAR corpus. Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens. Note: Unfortunately, a wrong vocab size was used when training the XXL models. This explains the mismatch of the \"real\" vocab size of 31102, compared to the vocab size specified in . However, the model is working and all evaluations were done under those circumstances. See this issue for more information. The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch size of 128. We pretty much followed the ELECTRA training procedure as used for BERTurk. ## Model weights Currently only PyTorch-Transformers compatible weights are available. If you need access to TensorFlow checkpoints, please raise an issue! | Model | Downloads | ---------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( ## Results For results on downstream tasks like NER or PoS tagging, please refer to this repository. ## Usage With Transformers >= 2.3 our Italian BERT models can be loaded like: To load the (recommended) Italian XXL BERT models, just use: To load the Italian XXL ELECTRA model (discriminator), just use: # Huggingface model hub All models are available on the Huggingface model hub. # Contact (Bugs, Feedback, Contribution and more) For questions about our BERT/ELECTRA models just open an issue here 🤗 # Acknowledgments Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
Thanks for providing access to the TFRC ❤️ Thanks to the generous support from the Hugging Face team, it is possible to download both cased and uncased models from their S3 storage 🤗", + "model_explanation_gemini": "A cased Italian BERT model trained on Wikipedia and OPUS corpora, extended with OSCAR data for the XXL version, designed for NLP tasks like NER and PoS tagging.\n\nFeatures: \n- Language: Italian \n- Training data: Wikipedia, OPUS, OSCAR (XXL version) \n- Corpus size: 13GB (base), 81GB (XXL) \n- Tokens: 2B (base), 13B (XXL)" +} \ No newline at end of file diff --git a/model_data_json/dbmdz_bert-base-italian-xxl-uncased.json b/model_data_json/dbmdz_bert-base-italian-xxl-uncased.json new file mode 100644 index 0000000000000000000000000000000000000000..5909ba3d1bc40254e5dc54712d114160d74f1959 --- /dev/null +++ b/model_data_json/dbmdz_bert-base-italian-xxl-uncased.json @@ -0,0 +1,21 @@ +{ + "model_id": "dbmdz/bert-base-italian-xxl-uncased", + "downloads": 80990, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "safetensors", + "bert", + "fill-mask", + "it", + "dataset:wikipedia", + "license:mit", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: it license: mit datasets: - wikipedia --- # 🤗 + 📚 dbmdz BERT and ELECTRA models In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State Library open sources Italian BERT and ELECTRA models 🎉 # Italian BERT The source data for the Italian BERT model consists of a recent Wikipedia dump and various texts from the OPUS corpora collection. The final training corpus has a size of 13GB and 2,050,057,573 tokens. For sentence splitting, we use NLTK (faster compared to spacy). Our cased and uncased models are trained with an initial sequence length of 512 subwords for ~2-3M steps. For the XXL Italian models, we use the same training data from OPUS and extend it with data from the Italian part of the OSCAR corpus. Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens. Note: Unfortunately, a wrong vocab size was used when training the XXL models. This explains the mismatch of the \"real\" vocab size of 31102, compared to the vocab size specified in . However, the model is working and all evaluations were done under those circumstances. See this issue for more information. The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch size of 128. We pretty much followed the ELECTRA training procedure as used for BERTurk. ## Model weights Currently only PyTorch-Transformers compatible weights are available. If you need access to TensorFlow checkpoints, please raise an issue! | Model | Downloads | ---------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( | | []( • []( • []( ## Results For results on downstream tasks like NER or PoS tagging, please refer to this repository. ## Usage With Transformers >= 2.3 our Italian BERT models can be loaded like: To load the (recommended) Italian XXL BERT models, just use: To load the Italian XXL ELECTRA model (discriminator), just use: # Huggingface model hub All models are available on the Huggingface model hub.
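The loading snippets referenced above were stripped from this dump; a minimal sketch with `transformers` for the XXL uncased checkpoint named in this card (the fill-mask example sentence is illustrative):

```python
from transformers import AutoModel, AutoTokenizer, pipeline

# Minimal sketch: load the recommended Italian XXL BERT checkpoint.
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-italian-xxl-uncased")
model = AutoModel.from_pretrained("dbmdz/bert-base-italian-xxl-uncased")

# The same checkpoint can back a fill-mask pipeline.
fill = pipeline("fill-mask", model="dbmdz/bert-base-italian-xxl-uncased")
print(fill("Roma è la [MASK] d'Italia.")[0]["token_str"])
```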
# Contact (Bugs, Feedback, Contribution and more) For questions about our BERT/ELECTRA models just open an issue here 🤗 # Acknowledgments Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC). Thanks for providing access to the TFRC ❤️ Thanks to the generous support from the Hugging Face team, it is possible to download both cased and uncased models from their S3 storage 🤗", + "model_explanation_gemini": "\"An uncased Italian BERT model trained on Wikipedia and OPUS corpora (extended with OSCAR data for the XXL version) for natural language processing tasks.\"\n\nModel Features:\n- Language: Italian (it)\n- License: MIT\n- Training data: Wikipedia dump + OPUS corpora (13GB base, 81GB XXL)\n- Token count: 2.05B (base), 13.14B (XXL)\n- Sequence length: 512 sub" +} \ No newline at end of file diff --git a/model_data_json/deepset_gelectra-large.json b/model_data_json/deepset_gelectra-large.json new file mode 100644 index 0000000000000000000000000000000000000000..a9a25999d07e56b9501d94d974bb7d011bff9d6f --- /dev/null +++ b/model_data_json/deepset_gelectra-large.json @@ -0,0 +1,23 @@ +{ + "model_id": "deepset/gelectra-large", + "downloads": 11594, + "tags": [ + "transformers", + "pytorch", + "tf", + "safetensors", + "electra", + "pretraining", + "de", + "dataset:wikipedia", + "dataset:OPUS", + "dataset:OpenLegalData", + "dataset:oscar", + "arxiv:2010.10906", + "license:mit", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: de license: mit datasets: - wikipedia - OPUS - OpenLegalData - oscar --- # German ELECTRA large Released in October 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka \"bert-base-german-cased\") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our paper, we outline the steps taken to train our model and show that this is the state-of-the-art German language model. ## Overview **Paper:** here **Architecture:** ELECTRA large (discriminator) **Language:** German ## Performance See also: deepset/gbert-base deepset/gbert-large deepset/gelectra-base deepset/gelectra-large deepset/gelectra-base-generator deepset/gelectra-large-generator ## Authors Branden Chan: Stefan Schweter: Timo Möller: ## About us
\"\"
\"\"
deepset is the company behind the production-ready open-source AI framework Haystack. Some of our other work: - Distilled roberta-base-squad2 (aka \"tinyroberta-squad2\") - German BERT, GermanQuAD and GermanDPR, German embedding model - deepset Cloud, deepset Studio ## Get in touch and join the Haystack community

For more info on Haystack, visit our

Twitter | LinkedIn | Discord | GitHub Discussions | Website | YouTube By the way: we're hiring!", + "model_explanation_gemini": "A German ELECTRA large discriminator model achieving state-of-the-art performance for German language tasks. \n\n**Features:** \n- Language: German \n- Architecture: ELECTRA large (discriminator) \n- Training data: Wikipedia, OPUS, OpenLegalData, OSCAR \n- License: MIT \n\n**Comparison:** \nOutperforms other German models like `deepset/gbert-base`, `deepset/gbert-large`, and smaller ELECTRA variants (`gelectra-base`), as" +} \ No newline at end of file diff --git a/model_data_json/dima806_man_woman_face_image_detection.json b/model_data_json/dima806_man_woman_face_image_detection.json new file mode 100644 index 0000000000000000000000000000000000000000..e4b5e5eddd1636c084cbc55565613e6e9cfe2a22 --- /dev/null +++ b/model_data_json/dima806_man_woman_face_image_detection.json @@ -0,0 +1,19 @@ +{ + "model_id": "dima806/man_woman_face_image_detection", + "downloads": 74322, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "vit", + "image-classification", + "base_model:google/vit-base-patch16-224-in21k", + "base_model:finetune:google/vit-base-patch16-224-in21k", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 metrics: - accuracy - f1 base_model: - google/vit-base-patch16-224-in21k --- Predicts, with about 98.7% accuracy, whether a face image shows a man or a woman. See for more details.", + "model_explanation_gemini": "Detects whether a face in an image belongs to a man or woman with 98.7% accuracy. \n\n**Features:** \n- High accuracy (98.7%) \n- Uses Google's ViT-base-patch16-224-in21k as base model \n- Evaluated on accuracy and F1 metrics \n\n**Comparison:** (Not explicitly provided in the description, so omitted.)" +} \ No newline at end of file diff --git a/model_data_json/ds4sd_SmolDocling-256M-preview.json b/model_data_json/ds4sd_SmolDocling-256M-preview.json new file mode 100644 index 0000000000000000000000000000000000000000..c9c29b6c317838d282a35775fcde48e30be77419 --- /dev/null +++ b/model_data_json/ds4sd_SmolDocling-256M-preview.json @@ -0,0 +1,22 @@ +{ + "model_id": "ds4sd/SmolDocling-256M-preview", + "downloads": 80269, + "tags": [ + "transformers", + "onnx", + "safetensors", + "idefics3", + "image-text-to-text", + "conversational", + "en", + "arxiv:2503.11576", + "arxiv:2305.03393", + "base_model:HuggingFaceTB/SmolVLM-256M-Instruct", + "base_model:quantized:HuggingFaceTB/SmolVLM-256M-Instruct", + "license:cdla-permissive-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- base_model: - HuggingFaceTB/SmolVLM-256M-Instruct language: - en library_name: transformers license: cdla-permissive-2.0 pipeline_tag: image-text-to-text ---
\"SmolDocling\"

SmolDocling-256M-preview

SmolDocling is a multimodal Image-Text-to-Text model designed for efficient document conversion. It retains Docling's most popular features while ensuring full compatibility with Docling through seamless support for DoclingDocuments.

This model was presented in the paper SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion. ### 🚀 Features: - 🏷️ **DocTags for Efficient Tokenization** – Introduces DocTags, an efficient and minimal representation for documents that is fully compatible with **DoclingDocuments**. - 🔍 **OCR (Optical Character Recognition)** – Extracts text accurately from images. - 📐 **Layout and Localization** – Preserves document structure and document element **bounding boxes**. - 💻 **Code Recognition** – Detects and formats code blocks including indentation. - 🔢 **Formula Recognition** – Identifies and processes mathematical expressions. - 📊 **Chart Recognition** – Extracts and interprets chart data. - 📑 **Table Recognition** – Supports column and row headers for structured table extraction. - 🖼️ **Figure Classification** – Differentiates figures and graphical elements. - 📝 **Caption Correspondence** – Links captions to relevant images and figures. - 📜 **List Grouping** – Organizes and structures list elements correctly. - 📄 **Full-Page Conversion** – Processes entire pages for comprehensive document conversion including all page elements (code, equations, tables, charts etc.) - 🔲 **OCR with Bounding Boxes** – OCR regions using a bounding box. - 📂 **General Document Processing** – Trained for both scientific and non-scientific documents. - 🔄 **Seamless Docling Integration** – Import into **Docling** and export in multiple formats. - 💨 **Fast inference using VLLM** – Avg of 0.35 secs per page on A100 GPU. ### 🚧 *Coming soon!* - 📊 **Better chart recognition 🛠️** - 📚 **One shot multi-page inference ⏱️** - 🧪 **Chemical Recognition** - 📙 **Datasets** ## ⌨️ Get started (code examples) You can use **transformers**, **vllm**, or **onnx** to perform inference, and Docling to convert results to a variety of output formats (md, html, etc.):
📄 Single page image inference using Transformers 🤖
🚀 Fast Batch Inference Using VLLM
ONNX Inference
💻 Local inference on Apple Silicon with MLX: see here ## DocTags DocTags create a clear and structured system of tags and rules that separate text from the document's structure. This makes things easier for Image-to-Sequence models by reducing confusion. On the other hand, converting directly to formats like HTML or Markdown can be messy—it often loses details, doesn’t clearly show the document’s layout, and increases the number of tokens, making processing less efficient. DocTags are integrated with Docling, which allows export to HTML, Markdown, and JSON. These exports can be offloaded to the CPU, reducing token generation overhead and improving efficiency. ## Supported Instructions
| Description | Instruction | Comment |
| --- | --- | --- |
| Full conversion | Convert this page to docling. | DocTags representation |
| Chart | Convert chart to table. | (e.g., <chart>) |
| Formula | Convert formula to LaTeX. | (e.g., <formula>) |
| Code | Convert code to text. | (e.g., <code>) |
| Table | Convert table to OTSL. | (e.g., <otsl>) OTSL: |
| Actions and Pipelines | OCR the text in a specific location: <loc_155><loc_233><loc_206><loc_237> | |
| | Identify element at: <loc_247><loc_482><loc_252><loc_486> | |
| | Find all 'text' elements on the page, retrieve all section headers. | |
| | Detect footer elements on the page. | |
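The collapsed code sections above were stripped from this dump; a hedged sketch of single-page inference with transformers, built around the "Convert this page to docling." instruction from the table above (file name and generation length are illustrative):

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Hedged sketch: generate DocTags for one page image.
processor = AutoProcessor.from_pretrained("ds4sd/SmolDocling-256M-preview")
model = AutoModelForVision2Seq.from_pretrained("ds4sd/SmolDocling-256M-preview")

image = Image.open("page.png")  # illustrative input page
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Convert this page to docling."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=4096)
print(processor.batch_decode(generated, skip_special_tokens=False)[0])
```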
#### Model Summary - **Developed by:** Docling Team, IBM Research - **Model type:** Multi-modal model (image+text) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Architecture:** Based on Idefics3 (see technical summary) - **Finetuned from model:** Based on SmolVLM-256M-Instruct **Repository:** Docling **Paper:** arXiv **Project Page:** Hugging Face **Citation:** **Demo:** HF Space", + "model_explanation_gemini": "A multimodal image-text-to-text model designed for efficient document conversion, supporting OCR, layout preservation, code/formula/chart/table recognition, and seamless integration with Docling.\n\n**Model Features**: \n- DocTags for efficient tokenization \n- OCR with bounding boxes \n- Layout and element localization \n- Code, formula, chart, and table recognition \n- Figure classification and caption correspondence \n- List grouping and full-page conversion \n- Fast inference (0.35s/page on A100)" +} \ No newline at end of file diff --git a/model_data_json/dslim_distilbert-NER.json b/model_data_json/dslim_distilbert-NER.json new file mode 100644 index 0000000000000000000000000000000000000000..8e1a7664424bbef74426f71c46b9eeb3a41b333d --- /dev/null +++ b/model_data_json/dslim_distilbert-NER.json @@ -0,0 +1,24 @@ +{ + "model_id": "dslim/distilbert-NER", + "downloads": 70773, + "tags": [ + "transformers", + "tensorboard", + "onnx", + "safetensors", + "distilbert", + "token-classification", + "en", + "dataset:conll2003", + "arxiv:1810.04805", + "arxiv:1910.01108", + "base_model:distilbert/distilbert-base-cased", + "base_model:quantized:distilbert/distilbert-base-cased", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 base_model: distilbert-base-cased metrics: - precision - recall - f1 - accuracy model-index: - name: distilbert-NER results: [] datasets: - conll2003 language: - en pipeline_tag: token-classification --- # distilbert-NER If my open source models have been useful to you, please consider supporting me in building small, useful AI models for everyone (and help me afford med school / help out my parents financially). Thanks!
\"Buy ## Model description **distilbert-NER** is the fine-tuned version of **DistilBERT**, which is a distilled variant of the BERT model. DistilBERT has fewer parameters than BERT, making it smaller, faster, and more efficient. distilbert-NER is specifically fine-tuned for the task of **Named Entity Recognition (NER)**. This model accurately identifies the same four types of entities as its BERT counterparts: location (LOC), organizations (ORG), person (PER), and Miscellaneous (MISC). Although it is a more compact model, distilbert-NER demonstrates a robust performance in NER tasks, balancing between size, speed, and accuracy. The model was fine-tuned on the English version of the CoNLL-2003 Named Entity Recognition dataset, which is widely recognized for its comprehensive and diverse range of entity types. ### Available NER models | Model Name | Description | Parameters | |-------------------|-------------|------------------| | distilbert-NER | Fine-tuned DistilBERT - a smaller, faster, lighter version of BERT | 66M | | bert-large-NER | Fine-tuned bert-large-cased - larger model with slightly better performance | 340M | | bert-base-NER-(uncased) | Fine-tuned bert-base, available in both cased and uncased versions | 110M | ## Intended uses & limitations #### How to use This model can be utilized with the Transformers *pipeline* for NER, similar to the BERT models. #### Limitations and bias The performance of distilbert-NER is linked to its training on the CoNLL-2003 dataset. Therefore, it might show limited effectiveness on text data that significantly differs from this training set. Users should be aware of potential biases inherent in the training data and the possibility of entity misclassification in complex sentences. ## Training data This model was fine-tuned on English version of the standard CoNLL-2003 Named Entity Recognition dataset. The training dataset distinguishes between the beginning and continuation of an entity so that if there are back-to-back entities of the same type, the model can output where the second entity begins. As in the dataset, each token will be classified as one of the following classes: Abbreviation|Description -|- O|Outside of a named entity B-MISC |Beginning of a miscellaneous entity right after another miscellaneous entity I-MISC | Miscellaneous entity B-PER |Beginning of a person’s name right after another person’s name I-PER |Person’s name B-ORG |Beginning of an organization right after another organization I-ORG |organization B-LOC |Beginning of a location right after another location I-LOC |Location ### CoNLL-2003 English Dataset Statistics This dataset was derived from the Reuters corpus which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper. #### # of training examples per entity type Dataset|LOC|MISC|ORG|PER -|-|-|-|- Train|7140|3438|6321|6600 Dev|1837|922|1341|1842 Test|1668|702|1661|1617 #### # of articles/sentences/tokens per dataset Dataset |Articles |Sentences |Tokens -|-|-|- Train |946 |14,987 |203,621 Dev |216 |3,466 |51,362 Test |231 |3,684 |46,435 ## Training procedure This model was trained on a single NVIDIA V100 GPU with recommended hyperparameters from the original BERT paper which trained & evaluated the model on CoNLL-2003 NER task. 
## Eval results | Metric | Score | |------------|-------| | Loss | 0.0710| | Precision | 0.9202| | Recall | 0.9232| | F1 | 0.9217| | Accuracy | 0.9810| The training and validation losses demonstrate a decrease over epochs, signaling effective learning. The precision, recall, and F1 scores are competitive, showcasing the model's robustness in NER tasks. ### BibTeX entry and citation info For DistilBERT: For the underlying BERT model:", + "model_explanation_gemini": "\"Distilbert-NER is a compact, efficient named entity recognition model fine-tuned from DistilBERT to identify four entity types (LOC, ORG, PER, MISC) with robust performance, trained on the CoNLL-2003 dataset.\"\n\nFeatures: \n- Fine-tuned for Named Entity Recognition (NER) \n- Identifies entities: location (LOC), organizations (ORG), person (PER), Miscellaneous (MISC) \n- Based on DistilBERT (smaller" +} \ No newline at end of file diff --git a/model_data_json/embaas_sentence-transformers-e5-large-v2.json b/model_data_json/embaas_sentence-transformers-e5-large-v2.json new file mode 100644 index 0000000000000000000000000000000000000000..bd96fa3b63d439e2d0f38960ea8fede821b05ba3 --- /dev/null +++ b/model_data_json/embaas_sentence-transformers-e5-large-v2.json @@ -0,0 +1,17 @@ +{ + "model_id": "embaas/sentence-transformers-e5-large-v2", + "downloads": 73837, + "tags": [ + "sentence-transformers", + "pytorch", + "bert", + "feature-extraction", + "sentence-similarity", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity --- # embaas/sentence-transformers-e5-large-v2 This is the sentence-transformers version of the intfloat/e5-large-v2 model. It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Using with API You can use the embaas API to encode your input. Get your free API key from embaas.io ## Evaluation Results Find the results of the e5 at the MTEB leaderboard ## Full Model Architecture ## Citing & Authors ", + "model_explanation_gemini": "Maps sentences and paragraphs to a 1024-dimensional vector space for tasks like clustering and semantic search.
\n\n**Features:** \n- Sentence/paragraph embedding into 1024-dimensional vectors \n- Supports clustering and semantic search \n- Compatible with Sentence-Transformers and embaas API \n\n**Comparison:** \nThis model is the Sentence-Transformers adaptation of **intfloat/e5-large-v2**, offering similar dense vector representations but optimized for seamless integration with the Sentence-Transformers library and" +} \ No newline at end of file diff --git a/model_data_json/evo-design_evo-1.5-8k-base.json b/model_data_json/evo-design_evo-1.5-8k-base.json new file mode 100644 index 0000000000000000000000000000000000000000..34102f9b84cb4e9f7e6491f81e5180fd379ec10b --- /dev/null +++ b/model_data_json/evo-design_evo-1.5-8k-base.json @@ -0,0 +1,19 @@ +{ + "model_id": "evo-design/evo-1.5-8k-base", + "downloads": 74038, + "tags": [ + "pytorch", + "safetensors", + "stripedhyena", + "long context", + "deep signal processing", + "hybrid", + "biology", + "genomics", + "custom_code", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - stripedhyena - long context - deep signal processing - hybrid - biology - genomics --- ## Evo 1.5

### About Evo is a biological foundation model capable of long-context modeling and design. Evo uses the StripedHyena architecture to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length. Evo has 7 billion parameters and is trained on OpenGenome, a prokaryotic whole-genome dataset containing ~300 billion tokens. **Evo 1.5** is a version of Evo built off of the Evo 1 model pretrained at 8k context with training extended by 50% more training data, totaling 450 billion tokens. | Checkpoint Name | Description | |----------------------------------------|-------------| | | A model pretrained with 8,192 context obtained by extending the pretraining of to process 50% more training data. | | | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. | | | A model pretrained with 131,072 context using as the initialization. We use this model to reason about and generate sequences at the genome scale. | | | A model fine-tuned on specifically on CRISPR-Cas systems. We use this model to generate Cas9/12/13 systems. | | | A model fine-tuned on specifically on transposons. We use this to generate IS200/IS605. | ### How to use Evo Example usage is provided in the standalone repo. ## Cite", + "model_explanation_gemini": "A biological foundation model for long-context genome sequence modeling and design, using StripedHyena architecture for efficient single-nucleotide resolution processing.\n\n**Model Features:** \n- 7B parameters \n- StripedHyena architecture for near-linear scaling \n- Trained on 450B tokens (OpenGenome prokaryotic dataset) \n- 8k context length \n- Single-nucleotide/byte-level resolution \n- Specialized versions for CRISPR-Cas systems and transposons \n\n**Comparison:**" +} \ No newline at end of file diff --git a/model_data_json/facebook_convnextv2-base-22k-384.json b/model_data_json/facebook_convnextv2-base-22k-384.json new file mode 100644 index 0000000000000000000000000000000000000000..21bf7070926128419933bbfa612bdc25ab1c4b24 --- /dev/null +++ b/model_data_json/facebook_convnextv2-base-22k-384.json @@ -0,0 +1,20 @@ +{ + "model_id": "facebook/convnextv2-base-22k-384", + "downloads": 80286, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "convnextv2", + "image-classification", + "vision", + "dataset:imagenet-22k", + "arxiv:2301.00808", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - vision - image-classification datasets: - imagenet-22k widget: - src: example_title: Tiger - src: example_title: Teapot - src: example_title: Palace --- # ConvNeXt V2 (base-sized model) ConvNeXt V2 model pretrained using the FCMAE framework and fine-tuned on the ImageNet-22K dataset at resolution 384x384. It was introduced in the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Woo et al. and first released in this repository. Disclaimer: The team releasing ConvNeXT V2 did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description ConvNeXt V2 is a pure convolutional model (ConvNet) that introduces a fully convolutional masked autoencoder framework (FCMAE) and a new Global Response Normalization (GRN) layer to ConvNeXt. ConvNeXt V2 significantly improves the performance of pure ConvNets on various recognition benchmarks. 
## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: For more code examples, we refer to the documentation. ### BibTeX entry and citation info", + "model_explanation_gemini": "\"facebook_convnextv2-base-22k-384 is a convolutional neural network (ConvNet) pretrained with FCMAE and fine-tuned on ImageNet-22K for image classification at 384x384 resolution, featuring Global Response Normalization for improved performance.\"\n\nFeatures: \n- Architecture: ConvNeXt V2 (base-sized) \n- Pretraining: FCMAE (fully convolutional masked autoencoder framework) \n- Fine-tuning: ImageNet-22K dataset" +} \ No newline at end of file diff --git a/model_data_json/facebook_esm2_t30_150M_UR50D.json b/model_data_json/facebook_esm2_t30_150M_UR50D.json new file mode 100644 index 0000000000000000000000000000000000000000..857c2169d664fbfed1c532f98a77855264df3a97 --- /dev/null +++ b/model_data_json/facebook_esm2_t30_150M_UR50D.json @@ -0,0 +1,18 @@ +{ + "model_id": "facebook/esm2_t30_150M_UR50D", + "downloads": 73482, + "tags": [ + "transformers", + "pytorch", + "tf", + "safetensors", + "esm", + "fill-mask", + "license:mit", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: mit widget: - text: \"MQIFVKTLTGKTITLEVEPSTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG\" --- ## ESM-2 ESM-2 is a state-of-the-art protein model trained on a masked language modelling objective. It is suitable for fine-tuning on a wide range of tasks that take protein sequences as input. For detailed information on the model architecture and training data, please refer to the accompanying paper. You may also be interested in some demo notebooks (PyTorch, TensorFlow) which demonstrate how to fine-tune ESM-2 models on your tasks of interest. Several ESM-2 checkpoints are available in the Hub with varying sizes.
Larger sizes generally have somewhat better accuracy, but require much more memory and time to train: | Checkpoint name | Num layers | Num parameters | |------------------------------|----|----------| | esm2_t48_15B_UR50D | 48 | 15B | | esm2_t36_3B_UR50D | 36 | 3B | | esm2_t33_650M_UR50D | 33 | 650M | | esm2_t30_150M_UR50D | 30 | 150M | | esm2_t12_35M_UR50D | 12 | 35M | | esm2_t6_8M_UR50D | 6 | 8M |", + "model_explanation_gemini": "\"ESM-2 is a protein language model trained for masked language modeling, suitable for fine-tuning on protein sequence tasks.\"\n\nModel Features: \n- Masked language modeling objective \n- Trained on protein sequences \n- Supports fine-tuning for various protein-related tasks \n- Available in multiple sizes (8M to 15B parameters) \n\nComparison: \nLarger ESM-2 checkpoints (e.g., esm2_t48_15B_UR50D) offer better accuracy but" +} \ No newline at end of file diff --git a/model_data_json/facebook_opt-13b.json b/model_data_json/facebook_opt-13b.json new file mode 100644 index 0000000000000000000000000000000000000000..7eb662c422aa837d04c562600430e6df83d35dfc --- /dev/null +++ b/model_data_json/facebook_opt-13b.json @@ -0,0 +1,21 @@ +{ + "model_id": "facebook/opt-13b", + "downloads": 76208, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "opt", + "text-generation", + "en", + "arxiv:2205.01068", + "arxiv:2005.14165", + "license:other", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- language: en inference: false tags: - opt - text-generation license: other commercial: false --- # OPT : Open Pre-trained Transformer Language Models OPT was first introduced in Open Pre-trained Transformer Language Models and first released in metaseq's repository on May 3rd 2022 by Meta AI. **Disclaimer**: The team releasing OPT wrote an official model card, which is available in Appendix D of the paper. Content from **this** model card has been written by the Hugging Face team. ## Intro To quote the first two paragraphs of the official paper > Large language models trained on massive text collections have shown surprising emergent > capabilities to generate text and perform zero- and few-shot learning. While in some cases the public > can interact with these models through paid APIs, full model access is currently limited to only a > few highly resourced labs. This restricted access has limited researchers’ ability to study how and > why these large language models work, hindering progress on improving known challenges in areas > such as robustness, bias, and toxicity. > We present Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M > to 175B parameters, which we aim to fully and responsibly share with interested researchers. We train the OPT models to roughly match > the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data > collection and efficient training. Our aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and > to bring more voices to the table in studying the impact of these LLMs. Definitions of risk, harm, bias, and toxicity, etc., should be articulated by the > collective research community as a whole, which is only possible when models are available for study. ## Model description OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via CommonCrawl. 
The model was pretrained using a causal language modeling (CLM) objective. OPT belongs to the same family of decoder-only models as GPT-3. As such, it was pretrained using the self-supervised causal language modeling objective. For evaluation, OPT follows GPT-3 by using their prompts and overall experimental setup. For more details, please read the official paper. ## Intended uses & limitations The pretrained-only model can be used for prompting for evaluation of downstream tasks as well as text generation. In addition, the model can be fine-tuned on a downstream task using the CLM example. For all other OPT checkpoints, please have a look at the model hub. ### How to use For large OPT models, such as this one, it is not recommended to make use of the pipeline because one should load the model in half-precision to accelerate generation and optimize memory consumption on GPU. It is recommended to directly call the []( method as follows: By default, generation is deterministic. In order to use the top-k sampling, please set to . ### Limitations and bias As mentioned in Meta AI's model card, given that the training data used for this model contains a lot of unfiltered content from the internet, which is far from neutral, the model is strongly biased: > Like other large language models for which the diversity (or lack thereof) of training > data induces downstream impact on the quality of our model, OPT-175B has limitations in terms > of bias and safety. OPT-175B can also have quality issues in terms of generation diversity and > hallucination. In general, OPT-175B is not immune from the plethora of issues that plague modern > large language models. Here's an example of how the model can have biased predictions: compared to: This bias will also affect all fine-tuned versions of this model. ## Training data The Meta AI team wanted to train this model on a corpus as large as possible. It is composed of the union of the following 5 filtered datasets of textual documents: - BookCorpus, which consists of more than 10K unpublished books, - CC-Stories, which contains a subset of CommonCrawl data filtered to match the story-like style of Winograd schemas, - The Pile, from which * Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were included. - Pushshift.io Reddit dataset that was developed in Baumgartner et al. (2020) and processed in Roller et al. (2021) - CCNewsV2 containing an updated version of the English portion of the CommonCrawl News dataset that was used in RoBERTa (Liu et al., 2019b) The final training data contains 180B tokens corresponding to 800GB of data. The validation split was made of 200MB of the pretraining data, sampled proportionally to each dataset’s size in the pretraining corpus. The dataset might contain offensive content as parts of the dataset are a subset of public Common Crawl data, along with a subset of public Reddit data, which could contain sentences that, if viewed directly, can be insulting, threatening, or might otherwise cause anxiety. ### Collection process The dataset was collected from the internet, and went through classic data processing algorithms and re-formatting practices, including removing repetitive/non-informative text like *Chapter One* or *This ebook by Project Gutenberg.* ## Training procedure ### Preprocessing The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a vocabulary size of 50272.
The inputs are sequences of 2048 consecutive tokens. The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was roughly ~33 days of continuous training. ### BibTeX entry and citation info", + "model_explanation_gemini": "A decoder-only English language model pretrained for text generation and few-shot learning, designed to match GPT-3's performance while enabling open research access.\n\n**Features:** \n- Decoder-only transformer architecture \n- Pretrained with causal language modeling (CLM) \n- Primarily English text with some non-English data \n- Supports prompting and fine-tuning for downstream tasks \n- Biases inherited from unfiltered internet training data \n\n**Comparison:** \nOPT-13b mirrors GPT-3's capabilities" +} \ No newline at end of file diff --git a/model_data_json/facebook_opt-6.7b.json b/model_data_json/facebook_opt-6.7b.json new file mode 100644 index 0000000000000000000000000000000000000000..cdb516ffa5d1ab5dcc08ca8cb962c5c6196c4328 --- /dev/null +++ b/model_data_json/facebook_opt-6.7b.json @@ -0,0 +1,21 @@ +{ + "model_id": "facebook/opt-6.7b", + "downloads": 69673, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "opt", + "text-generation", + "en", + "arxiv:2205.01068", + "arxiv:2005.14165", + "license:other", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- language: en inference: false tags: - text-generation - opt license: other commercial: false --- # OPT : Open Pre-trained Transformer Language Models OPT was first introduced in Open Pre-trained Transformer Language Models and first released in metaseq's repository on May 3rd 2022 by Meta AI. **Disclaimer**: The team releasing OPT wrote an official model card, which is available in Appendix D of the paper. Content from **this** model card has been written by the Hugging Face team. ## Intro To quote the first two paragraphs of the official paper > Large language models trained on massive text collections have shown surprising emergent > capabilities to generate text and perform zero- and few-shot learning. While in some cases the public > can interact with these models through paid APIs, full model access is currently limited to only a > few highly resourced labs. This restricted access has limited researchers’ ability to study how and > why these large language models work, hindering progress on improving known challenges in areas > such as robustness, bias, and toxicity. > We present Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M > to 175B parameters, which we aim to fully and responsibly share with interested researchers. We train the OPT models to roughly match > the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data > collection and efficient training. Our aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and > to bring more voices to the table in studying the impact of these LLMs. Definitions of risk, harm, bias, and toxicity, etc., should be articulated by the > collective research community as a whole, which is only possible when models are available for study. ## Model description OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via CommonCrawl. The model was pretrained using a causal language modeling (CLM) objective. 
OPT belongs to the same family of decoder-only models as GPT-3. As such, it was pretrained using the self-supervised causal language modeling objective. For evaluation, OPT follows GPT-3 by using their prompts and overall experimental setup. For more details, please read the official paper. ## Intended uses & limitations The pretrained-only model can be used for prompting for evaluation of downstream tasks as well as text generation. In addition, the model can be fine-tuned on a downstream task using the CLM example. For all other OPT checkpoints, please have a look at the model hub. ### How to use For large OPT models, such as this one, it is not recommended to make use of the pipeline because one should load the model in half-precision to accelerate generation and optimize memory consumption on GPU. It is recommended to directly call the []( method as follows: By default, generation is deterministic. In order to use the top-k sampling, please set to . ### Limitations and bias As mentioned in Meta AI's model card, given that the training data used for this model contains a lot of unfiltered content from the internet, which is far from neutral, the model is strongly biased: > Like other large language models for which the diversity (or lack thereof) of training > data induces downstream impact on the quality of our model, OPT-175B has limitations in terms > of bias and safety. OPT-175B can also have quality issues in terms of generation diversity and > hallucination. In general, OPT-175B is not immune from the plethora of issues that plague modern > large language models. Here's an example of how the model can have biased predictions: compared to: This bias will also affect all fine-tuned versions of this model. ## Training data The Meta AI team wanted to train this model on a corpus as large as possible. It is composed of the union of the following 5 filtered datasets of textual documents: - BookCorpus, which consists of more than 10K unpublished books, - CC-Stories, which contains a subset of CommonCrawl data filtered to match the story-like style of Winograd schemas, - The Pile, from which * Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics and HackerNews* were included. - Pushshift.io Reddit dataset that was developed in Baumgartner et al. (2020) and processed in Roller et al. (2021) - CCNewsV2 containing an updated version of the English portion of the CommonCrawl News dataset that was used in RoBERTa (Liu et al., 2019b) The final training data contains 180B tokens corresponding to 800GB of data. The validation split was made of 200MB of the pretraining data, sampled proportionally to each dataset’s size in the pretraining corpus. The dataset might contain offensive content as parts of the dataset are a subset of public Common Crawl data, along with a subset of public Reddit data, which could contain sentences that, if viewed directly, can be insulting, threatening, or might otherwise cause anxiety. ### Collection process The dataset was collected from the internet, and went through classic data processing algorithms and re-formatting practices, including removing repetitive/non-informative text like *Chapter One* or *This ebook by Project Gutenberg.* ## Training procedure ### Preprocessing The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a vocabulary size of 50272. The inputs are sequences of 2048 consecutive tokens. The 175B model was trained on 992 *80GB A100 GPUs*.
The training duration was roughly ~33 days of continuous training. ### BibTeX entry and citation info", + "model_explanation_gemini": "A decoder-only English language model (OPT-6.7B) designed for text generation and zero/few-shot learning, replicating GPT-3's performance while promoting open research access. \n\n**Features:** \n- **Architecture:** Decoder-only transformer (similar to GPT-3) \n- **Training Objective:** Causal language modeling (CLM) \n- **Primary Use:** Text generation, prompting for downstream tasks, and fine-tuning \n- **Data:** Predominantly English with" +} \ No newline at end of file diff --git a/model_data_json/facebook_vit-mae-base.json b/model_data_json/facebook_vit-mae-base.json new file mode 100644 index 0000000000000000000000000000000000000000..c94ae1a4fa64266f2f44452f78baf3bd60a6c9aa --- /dev/null +++ b/model_data_json/facebook_vit-mae-base.json @@ -0,0 +1,20 @@ +{ + "model_id": "facebook/vit-mae-base", + "downloads": 72697, + "tags": [ + "transformers", + "pytorch", + "tf", + "safetensors", + "vit_mae", + "pretraining", + "vision", + "dataset:imagenet-1k", + "arxiv:2111.06377", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - vision datasets: - imagenet-1k --- # Vision Transformer (base-sized model) pre-trained with MAE Vision Transformer (ViT) model pre-trained using the MAE method. It was introduced in the paper Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick and first released in this repository. Disclaimer: The team releasing MAE did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description The Vision Transformer (ViT) is a transformer encoder model (BERT-like). Images are presented to the model as a sequence of fixed-size patches. During pre-training, one randomly masks out a high portion (75%) of the image patches. First, the encoder is used to encode the visual patches. Next, a learnable (shared) mask token is added at the positions of the masked patches. The decoder takes the encoded visual patches and mask tokens as input and reconstructs raw pixel values for the masked positions. By pre-training the model, it learns an inner representation of images that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled images for instance, you can train a standard classifier by placing a linear layer on top of the pre-trained encoder. ## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. 
### How to use Here is how to use this model: ### BibTeX entry and citation info", + "model_explanation_gemini": "\"Facebook's ViT-MAE-base is a vision transformer model pre-trained with masked autoencoding (MAE) to reconstruct image patches, enabling downstream tasks like image classification.\"\n\nFeatures: \n- Base-sized Vision Transformer (ViT) architecture \n- Pre-trained using MAE (Masked Autoencoder) method \n- 75% of image patches masked during pre-training \n- Encoder-decoder structure for patch reconstruction \n- Suitable for transfer learning on image-related tasks \n\nComparison: \nUnlike standard" +} \ No newline at end of file diff --git a/model_data_json/facebook_wav2vec2-large.json b/model_data_json/facebook_wav2vec2-large.json new file mode 100644 index 0000000000000000000000000000000000000000..a14187a79099298727b21b085b3bc913036a713b --- /dev/null +++ b/model_data_json/facebook_wav2vec2-large.json @@ -0,0 +1,19 @@ +{ + "model_id": "facebook/wav2vec2-large", + "downloads": 82081, + "tags": [ + "transformers", + "pytorch", + "wav2vec2", + "pretraining", + "speech", + "en", + "dataset:librispeech_asr", + "arxiv:2006.11477", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en datasets: - librispeech_asr tags: - speech license: apache-2.0 --- # Wav2Vec2-Large Facebook's Wav2Vec2 The base model pretrained on 16kHz sampled speech audio. When using the model make sure that your speech input is also sampled at 16Khz. Note that this model should be fine-tuned on a downstream task, like Automatic Speech Recognition. Check out this blog for more information. Paper Authors: Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli **Abstract** We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned. Experiments using all labeled data of Librispeech achieve 1.8/3.3 WER on the clean/other test sets. When lowering the amount of labeled data to one hour, wav2vec 2.0 outperforms the previous state of the art on the 100 hour subset while using 100 times less labeled data. Using just ten minutes of labeled data and pre-training on 53k hours of unlabeled data still achieves 4.8/8.2 WER. This demonstrates the feasibility of speech recognition with limited amounts of labeled data. 
The original model can be found under # Usage See this notebook for more information on how to fine-tune the model.", + "model_explanation_gemini": "\"facebook_wav2vec2-large is a pretrained speech model for 16kHz audio, designed for fine-tuning on tasks like automatic speech recognition, leveraging self-supervised learning to achieve high accuracy with limited labeled data.\"\n\nFeatures: \n- Pretrained on 16kHz speech audio \n- Requires fine-tuning for downstream tasks (e.g., ASR) \n- Uses self-supervised learning with latent space masking and contrastive tasks \n- Achieves low WER (1.8/3" +} \ No newline at end of file diff --git a/model_data_json/foduucom_table-detection-and-extraction.json b/model_data_json/foduucom_table-detection-and-extraction.json new file mode 100644 index 0000000000000000000000000000000000000000..de95e29d2af57262162ad638ea7b588902bf1595 --- /dev/null +++ b/model_data_json/foduucom_table-detection-and-extraction.json @@ -0,0 +1,30 @@ +{ + "model_id": "foduucom/table-detection-and-extraction", + "downloads": 83150, + "tags": [ + "ultralytics", + "tensorboard", + "v8", + "ultralyticsplus", + "yolov8", + "yolo", + "vision", + "object-detection", + "pytorch", + "table detection", + "table extraction", + "table classification", + "document analysis", + "unstructured document", + "unstructured table extraction", + "structured table extraction", + "unstructured table detection", + "structured table detection", + "en", + "dataset:foduucom/table-detection-yolo", + "model-index", + "region:us" + ], + "description": "--- tags: - ultralyticsplus - yolov8 - ultralytics - yolo - vision - object-detection - pytorch - table detection - table extraction - table classification - document analysis - unstructured document - unstructured table extraction - structured table extraction - unstructured table detection - structured table detection library_name: ultralytics library_version: 8.0.43 inference: true model-index: - name: foduucom/table-detection-and-extraction results: - task: type: object-detection metrics: - type: precision value: 0.96196 name: mAP@0.5(box) language: - en metrics: - accuracy datasets: - foduucom/table-detection-yolo pipeline_tag: object-detection ---
\"foduucom/table-detection-and-extraction\" # Model Card for YOLOv8s Table Detection ## Model Summary The YOLOv8s Table Detection model is an object detection model based on the YOLO (You Only Look Once) framework. It is designed to detect tables, whether they are bordered or borderless, in images. The model has been fine-tuned on a vast dataset and achieved high accuracy in detecting tables and distinguishing between bordered and borderless ones. ## Model Details ### Model Description The YOLOv8s Table Detection model serves as a versatile solution for precisely identifying tables within images, whether they exhibit a bordered or borderless design. Notably, this model's capabilities extend beyond mere detection – it plays a crucial role in addressing the complexities of unstructured documents. By employing advanced techniques such as bounding box delineation, the model enables users to isolate tables of interest within the visual content. What sets this model apart is its synergy with Optical Character Recognition (OCR) technology. This seamless integration empowers the model to not only locate tables but also to extract pertinent data contained within. The bounding box information guides the cropping of tables, which is then coupled with OCR to meticulously extract textual data, streamlining the process of information retrieval from unstructured documents. We invite you to explore the potential of this model and its data extraction capabilities. For those interested in harnessing its power or seeking further collaboration, we encourage you to reach out to us at info@foduu.com. Whether you require assistance, customization, or have innovative ideas, our collaborative approach is geared towards addressing your unique challenges. Additionally, you can actively engage with our vibrant community section for valuable insights and collective problem-solving. Your input drives our continuous improvement, as we collectively pave the way towards enhanced data extraction and document analysis. - **Developed by:** FODUU AI - **Model type:** Object Detection - **Task:** Table Detection (Bordered and Borderless) Furthermore, the YOLOv8s Table Detection model is not limited to table detection alone. It is a versatile tool that contributes to the processing of unstructured documents. By utilizing advanced bounding box techniques, the model empowers users to isolate tables within the document's visual content. What sets this model apart is its seamless integration with Optical Character Recognition (OCR) technology. The combination of bounding box information and OCR allows for precise data extraction from the tables. This comprehensive approach streamlines the process of information retrieval from complex documents. User collaboration is actively encouraged to enrich the model's capabilities. By contributing table images of different designs and types, users play a pivotal role in enhancing the model's ability to detect a diverse range of tables accurately. Community participation can be facilitated through our platform or by reaching out to us at info@foduu.com. We value collaborative efforts that drive continuous improvement and innovation in table detection and extraction. ### Supported Labels ## Uses ### Direct Use The YOLOv8s Table Detection model can be directly used for detecting tables in images, whether they are bordered or borderless. It is equipped with the ability to distinguish between these two categories. 
### Downstream Use The model can also be fine-tuned for specific table detection tasks or integrated into larger applications for document analysis, image-based data extraction, and other related fields. ### Out-of-Scope Use The model is not designed for unrelated object detection tasks or scenarios outside the scope of table detection. ## Bias, Risks, and Limitations The YOLOv8s Table Detection model may have some limitations and biases: - Performance may vary based on the quality, diversity, and representativeness of the training data. - The model may face challenges in detecting tables with intricate designs or complex arrangements. - Accuracy may be affected by variations in lighting conditions, image quality, and resolution. - Detection of very small or distant tables might be less accurate. - The model's ability to classify bordered and borderless tables may be influenced by variations in design. ### Recommendations Users should be informed about the model's limitations and potential biases. Further testing and validation are advised for specific use cases to evaluate its performance accurately. ## How to Get Started with the Model To begin using the YOLOv8s Table Detection model, follow these steps: - Load the model and perform prediction (see the sketch under Direct Use above): ## Training Details ### Training Data The model was trained on a diverse dataset containing images of tables from various sources. The dataset includes examples of both bordered and borderless tables, capturing different designs and styles. ### Training Procedure The training process involves extensive computation and is conducted over multiple epochs. The model's weights are adjusted to minimize detection loss and optimize performance. #### Metrics - mAP@0.5 (box): - All: 0.962 - Bordered: 0.961 - Borderless: 0.963 ### Model Architecture and Objective The YOLOv8s architecture employs a modified CSPDarknet53 as its backbone, along with self-attention mechanisms and feature pyramid networks. These components contribute to the model's ability to detect and classify tables accurately, considering variations in size, design, and style. ### Compute Infrastructure #### Hardware NVIDIA GeForce RTX 3060 card #### Software The model was trained and fine-tuned using a Jupyter Notebook environment. ## Model Card Contact For inquiries and contributions, please contact us at info@foduu.com. 
---", + "model_explanation_gemini": "Detects and extracts tables (bordered/borderless) from images using YOLOv8, integrating OCR for data retrieval from unstructured documents.\n\nModel Features: \n- Object detection (YOLOv8-based) \n- Handles both bordered and borderless tables \n- Bounding box generation for table isolation \n- OCR integration for structured data extraction \n- Fine-tuned for high precision (mAP@0.5: 0.962) \n\nComparison: \nOutperforms generic object" +} \ No newline at end of file diff --git a/model_data_json/fxmarty_pix2struct-tiny-random.json b/model_data_json/fxmarty_pix2struct-tiny-random.json new file mode 100644 index 0000000000000000000000000000000000000000..6723b38e9b62300e98f818ab182247a29c3b86dc --- /dev/null +++ b/model_data_json/fxmarty_pix2struct-tiny-random.json @@ -0,0 +1,16 @@ +{ + "model_id": "fxmarty/pix2struct-tiny-random", + "downloads": 76445, + "tags": [ + "transformers", + "pytorch", + "pix2struct", + "image-text-to-text", + "image-to-text", + "license:mit", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: mit pipeline_tag: image-to-text ---", + "model_explanation_gemini": "Generates text from images, licensed under MIT.\n\nFeatures: \n- Image-to-text conversion \n- MIT license \n\nComparison: Not applicable (no comparative details provided in the description)." +} \ No newline at end of file diff --git a/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json b/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json new file mode 100644 index 0000000000000000000000000000000000000000..fa3756eaa3385729330f88e7b85eac2244d9ad1f --- /dev/null +++ b/model_data_json/gokaygokay_Florence-2-SD3-Captioner.json @@ -0,0 +1,22 @@ +{ + "model_id": "gokaygokay/Florence-2-SD3-Captioner", + "downloads": 76570, + "tags": [ + "transformers", + "safetensors", + "florence2", + "text-generation", + "art", + "image-text-to-text", + "custom_code", + "en", + "dataset:google/docci", + "dataset:google/imageinwords", + "dataset:ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions", + "license:apache-2.0", + "autotrain_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 datasets: - google/docci - google/imageinwords - ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions language: - en library_name: transformers pipeline_tag: image-text-to-text tags: - art ---", + "model_explanation_gemini": "Generates text captions for images using the Florence-2-SD3 architecture, trained on datasets like DocCI, ImageInWords, and synthetic DALL-E 3 captions. 
\n\n**Features:** \n- Image-to-text captioning \n- Trained on diverse datasets (DocCI, ImageInWords, synthetic DALL-E 3 captions) \n- Supports English language \n- Apache 2.0 license \n\n**Comparison (if inferred):** \nNo explicit comparison is made" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-1.1-2b-it.json b/model_data_json/google_gemma-1.1-2b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..52e4f5f91457cd1010b9b378d33dae315c9e66e7 --- /dev/null +++ b/model_data_json/google_gemma-1.1-2b-it.json @@ -0,0 +1,36 @@ +{ + "model_id": "google/gemma-1.1-2b-it", + "downloads": 73021, + "tags": [ + "transformers", + "safetensors", + "gemma", + "text-generation", + "conversational", + "arxiv:2312.11805", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2304.06364", + "arxiv:2206.04615", + "arxiv:1804.06876", + "arxiv:2110.08193", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: gemma widget: - messages: - role: user content: How does the brain work? inference: parameters: max_new_tokens: 200 extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the latest 2B instruct version of the Gemma model. Here you can find other models in the Gemma family: | | Base | Instruct | |----|----------------------------------------------------|----------------------------------------------------------------------| | 2B | gemma-2b | **gemma-1.1-2b-it** | | 7B | gemma-7b | gemma-1.1-7b-it | **Release Notes** This is Gemma 1.1 2B (IT), an update over the original instruction-tuned Gemma release. Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality. We also fixed a bug in multi-turn conversations, and made sure that model responses don't always start with . We believe this release represents an improvement for most use cases, but we encourage users to test in their particular applications. The previous model will continue to be available in the same repo. We appreciate the enthusiastic adoption of Gemma, and we continue to welcome all feedback from the community. **Resources and Technical Documentation**: * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. 
Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First make sure to , then copy the snippet from the section that is relevant for your usecase. #### Running the model on a CPU As explained below, we recommend as the default dtype. You can use a different precision if necessary. #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can use , which may be faster on certain hardware, indicating the when loading the model. For convenience, the revision of the repo contains a copy of the weights already converted to that precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Using _ * _Using _ * _Upcasting to _ #### Quantized Versions through * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install in your environment #### Running the model in JAX / Flax Use the branch of the repository: Check this notebook for a comprehensive walkthrough on how to parallelize JAX inference. ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Fine-tuning You can find some fine-tuning scripts under the directory of []( repository. To adapt them to this model, simply change the model-id to . We provide: * A script to perform Supervised Fine-Tuning (SFT) on UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on the English quotes dataset ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. 
* Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safely in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. ### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. 
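To round out the Usage and Chat Template sections above before the evaluation tables, here is a minimal end-to-end generation sketch (the prompt is illustrative; assumes a recent transformers release):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-1.1-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-1.1-2b-it")

# a single-turn conversation, formatted with the tokenizer's built-in chat template
chat = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```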
### Benchmark Results The pre-trained base models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot | 71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BoolQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | ------------------------------ | ------------- | ----------- | --------- | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. 
#### Gemma 1.0 | Benchmark | Metric | Gemma 1.0 IT 2B | Gemma 1.0 IT 7B | | ------------------------ | ------------- | --------------- | --------------- | | [RealToxicity][realtox] | average | 6.86 | 7.90 | | [BOLD][bold] | | 45.57 | 49.08 | | [CrowS-Pairs][crows] | top-1 | 45.82 | 51.33 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 62.58 | 92.54 | | [BBQ Disambig][bbq] | top-1 | 54.62 | 71.99 | | [Winogender][winogender] | top-1 | 51.25 | 54.17 | | [TruthfulQA][truthfulqa] | | 44.84 | 31.81 | | [Winobias 1_2][winobias] | | 56.12 | 59.09 | | [Winobias 2_2][winobias] | | 91.10 | 92.23 | | [Toxigen][toxigen] | | 29.77 | 39.59 | | ------------------------ | ------------- | --------------- | --------------- | #### Gemma 1.1 | Benchmark | Metric | Gemma 1.1 IT 2B | Gemma 1.1 IT 7B | | ------------------------ | ------------- | --------------- | --------------- | | [RealToxicity][realtox] | average | 7.03 | 8.04 | | [BOLD][bold] | | 47.76 | | | [CrowS-Pairs][crows] | top-1 | 45.89 | 49.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 58.97 | 86.06 | | [BBQ Disambig][bbq] | top-1 | 53.90 | 85.08 | | [Winogender][winogender] | top-1 | 50.14 | 57.64 | | [TruthfulQA][truthfulqa] | | 44.24 | 45.34 | | [Winobias 1_2][winobias] | | 55.93 | 59.22 | | [Winobias 2_2][winobias] | | 89.46 | 89.2 | | [Toxigen][toxigen] | | 29.64 | 38.75 | | ------------------------ | ------------- | --------------- | --------------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. 
LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. 
Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives.", + "model_explanation_gemini": "A lightweight, instruction-tuned text generation model optimized for tasks like question answering, summarization, and reasoning, with improved quality and coding capabilities over its predecessor.\n\n**Features**: \n- Instruction-tuned for conversational use \n- Supports text generation tasks (QA, summarization, reasoning) \n- Optimized for limited-resource environments (CPU/GPU) \n- Improved factuality and multi-turn conversation quality \n- Available in 2B parameter size \n\n**Comparison**: \nThe 1.1" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-2-27b-it.json b/model_data_json/google_gemma-2-27b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..ad511c72135c6b33810dc886beaf8aad8e6b5042 --- /dev/null +++ b/model_data_json/google_gemma-2-27b-it.json @@ -0,0 +1,43 @@ +{ + "model_id": "google/gemma-2-27b-it", + "downloads": 135109, + "tags": [ + "transformers", + "safetensors", + "gemma2", + "text-generation", + "conversational", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:2110.08193", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:1804.06876", + "arxiv:2103.03874", + "arxiv:2304.06364", + "arxiv:2206.04615", + "arxiv:2203.09509", + "base_model:google/gemma-2-27b", + "base_model:finetune:google/gemma-2-27b", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-2-27b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma] **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. 
First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through
Using 8-bit precision (int8)
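A sketch of the 8-bit path via bitsandbytes (assumes the bitsandbytes and accelerate packages are installed):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-27b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-27b-it",
    quantization_config=quantization_config,
)
```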
Using 4-bit precision
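And the 4-bit counterpart, which differs only in the quantization config (nf4 is shown as one common choice, not something the card mandates):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # one common 4-bit format; a judgment call
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-27b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-27b-it",
    quantization_config=quantization_config,
)
```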
#### Advanced Usage
Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation.
### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. 
You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma PT 9B | Gemma PT 27B | | ------------------------------ | ------------- | ----------- | ------------ | | [MMLU][mmlu] | 5-shot, top-1 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 52.8 | 55.1 | | [BIG-Bench][big-bench] | 3-shot, CoT | 68.2 | 74.9 | | ------------------------------ | ------------- | ----------- | ------------ | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. 
### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. #### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | --------------- | ---------------- | | [RealToxicity][realtox] | average | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 95.32 | 97.22 | | [Toxigen][toxigen] | | 39.30 | 38.42 | | ------------------------ | ------------- | --------------- | ---------------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. 
They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: It's encouraged to perform continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [big-bench]: [toxigen]:", + "model_explanation_gemini": "A lightweight, decoder-only text-generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment in resource-limited environments. 
\n\n**Features**: \n- Text-to-text generation (English) \n- Instruction-tuned for conversational use \n- Supports single/multi-GPU, quantization (4-bit/8-bit), and torch.compile optimization \n- Chat template for structured prompts \n\n**Comparison**: \nSmaller and more accessible than Gemini models but built from similar research, enabling local or" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-2-2b-it.json b/model_data_json/google_gemma-2-2b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..fb9608518ec441f045d0a3f37cba0253d2ca0261 --- /dev/null +++ b/model_data_json/google_gemma-2-2b-it.json @@ -0,0 +1,45 @@ +{ + "model_id": "google/gemma-2-2b-it", + "downloads": 317158, + "tags": [ + "transformers", + "safetensors", + "gemma2", + "text-generation", + "conversational", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:2110.08193", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:1804.06876", + "arxiv:2103.03874", + "arxiv:2304.06364", + "arxiv:1903.00161", + "arxiv:2206.04615", + "arxiv:2203.09509", + "arxiv:2403.13793", + "base_model:google/gemma-2-2b", + "base_model:finetune:google/gemma-2-2b", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license tags: - conversational base_model: google/gemma-2-2b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma2] **Terms of Use**: [Terms][terms] **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. 
#### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through
Using 8-bit precision (int8)
Using 4-bit precision
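A sketch combining 4-bit loading with the chat template described below (assumes bitsandbytes is installed; swap in `load_in_8bit=True` for the 8-bit path above; the message is illustrative):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=quantization_config,
)

# single-turn conversation formatted with the tokenizer's built-in chat template
messages = [{"role": "user", "content": "Explain quantization in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```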
#### Advanced Usage
Torch compile Torch compile is a method for speeding-up the inference of PyTorch modules. The Gemma-2 2b model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised: For more details, refer to the Transformers documentation.
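A sketch of that pattern (the dtype, device placement, and prompt are illustrative choices, and accelerate is assumed to be installed; exact speed-ups depend on hardware):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it", torch_dtype=torch.bfloat16, device_map="auto"
)

# compile the forward pass once; subsequent calls reuse the compiled graph
model.forward = torch.compile(model.forward, mode="reduce-overhead")

inputs = tokenizer("The theory of special relativity states ", return_tensors="pt").to(model.device)

# two warm-up generations before the full speed is reached, as noted above
for _ in range(2):
    _ = model.generate(**inputs, max_new_tokens=32)

outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```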
### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens, the 9B model was trained with 8 trillion tokens, and 2B model was trained with 2 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. 
* Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for [foundation models][foundation-models], including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma 2 PT 2B | Gemma 2 PT 9B | Gemma 2 PT 27B | | ------------------------------ | ------------- | ------------- | ------------- | -------------- | | [MMLU][mmlu] | 5-shot, top-1 | 51.3 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 73.0 | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 77.8 | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 51.9 | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 72.5 | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 70.9 | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 80.1 | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 55.4 | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 59.4 | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 16.7 | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 17.7 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 29.6 | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 23.9 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 15.0 | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 30.6 | 52.8 | 55.1 | | [DROP][drop] | 3-shot, F1 | 52.0 | 69.4 | 72.2 | | [BIG-Bench][big-bench] | 3-shot, CoT | 41.9 | 68.2 | 74.9 | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. 
* Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, and large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. #### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 2B | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | ------------- | ------------- | -------------- | | [RealToxicity][realtox] | average | 8.16 | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.67 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 83.20 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 69.31 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 52.91 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 43.72 | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 59.28 | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 88.57 | 95.32 | 97.22 | | [Toxigen][toxigen] | | 48.32 | 39.30 | 38.42 | ## Dangerous Capability Evaluations ### Evaluation Approach We evaluated a range of dangerous capabilities: - **Offensive cybersecurity:** To assess the model's potential for misuse in cybersecurity contexts, we utilized publicly available Capture-the-Flag (CTF) platforms like InterCode-CTF and Hack the Box, as well as internally developed CTF challenges. These evaluations measure the model's ability to exploit vulnerabilities and gain unauthorized access in simulated environments. - **Self-proliferation:** We evaluated the model's capacity for self-proliferation by designing tasks that involve resource acquisition, code execution, and interaction with remote systems. These evaluations assess the model's ability to independently replicate and spread. - **Persuasion:** To evaluate the model's capacity for persuasion and deception, we conducted human persuasion studies. These studies involved scenarios that measure the model's ability to build rapport, influence beliefs, and elicit specific actions from human participants. ### Evaluation Results All evaluations are described in detail in [Evaluating Frontier Models for Dangerous Capabilities][eval-danger] and in brief in the [Gemma 2 technical report][tech-report].
| Evaluation | Capability | Gemma 2 IT 27B |
| -------------------------------- | ----------------------- | -------------- |
| InterCode-CTF | Offensive cybersecurity | 34/76 challenges |
| Internal CTF | Offensive cybersecurity | 1/13 challenges |
| Hack the Box | Offensive cybersecurity | 0/13 challenges |
| Self-proliferation early warning | Self-proliferation | 1/10 challenges |
| Charm offensive | Persuasion | Percent of participants agreeing: 81% interesting, 75% would speak again, 80% made personal connection |
| Click Links | Persuasion | 34% of participants |
| Find Info | Persuasion | 9% of participants |
| Run Code | Persuasion | 11% of participants |
| Money talks | Persuasion | £3.72 mean donation |
| Web of Lies | Persuasion | 18% mean shift towards correct belief, 1% mean shift towards incorrect belief |
## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, with input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem.
Risks identified and mitigations: * Perpetuation of biases: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases are encouraged. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [tech-report]: [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma2]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [drop]: [big-bench]: [toxigen]: [eval-danger]:", + "model_explanation_gemini": "Generates English-language text for tasks like question answering, summarization, and reasoning, optimized for resource-limited environments.\n\nFeatures: \n- Lightweight, decoder-only LLM \n- Instruction-tuned for conversational use \n- Supports single/multi-GPU deployment \n- Compatible with 8-bit (int8) and 4-bit quantization \n- Includes chat template for structured dialogue \n- Torch compile support for faster inference \n\nComparison: \nPart of Google's Gemma family, sharing research with" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-2-2b.json b/model_data_json/google_gemma-2-2b.json new file mode 100644 index 0000000000000000000000000000000000000000..9e633ee9872da2d92a518988e5d67206d006b0b7 --- /dev/null +++ b/model_data_json/google_gemma-2-2b.json @@ -0,0 +1,42 @@ +{ + "model_id": "google/gemma-2-2b", + "downloads": 155581, + "tags": [ + "transformers", + "safetensors", + "gemma2", + "text-generation", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:2110.08193", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:1804.06876", + "arxiv:2103.03874", + "arxiv:2304.06364", + "arxiv:1903.00161", + "arxiv:2206.04615", + "arxiv:2203.09509", + "arxiv:2403.13793", + "license:gemma",
"autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma2] **Terms of Use**: [Terms][terms] **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through
Using 8-bit precision (int8)
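A minimal sketch of what int8 loading looks like with `bitsandbytes` (the model id is this card's checkpoint; the prompt and generation length are illustrative assumptions):

```python
# pip install -U transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantize the linear layers to int8 at load time.
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    quantization_config=quantization_config,
)

inputs = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```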
Using 4-bit precision
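A comparable 4-bit sketch, under the assumption of NF4 quantization with bfloat16 compute, which roughly halves the memory footprint again relative to int8:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit weights; matmuls run in bfloat16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    quantization_config=quantization_config,
)

inputs = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```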
#### Advanced Usage
Torch compile Torch compile is a method for speeding up the inference of PyTorch modules. The Gemma-2 2b model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised. For more details, refer to the Transformers documentation.
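A sketch of the usual pattern — a static KV cache plus `torch.compile` — assuming a recent `transformers` release; the compile mode and token counts here are illustrative assumptions, not prescribed values:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

# Compile the forward pass; a static KV cache keeps tensor shapes fixed,
# so the compiled graph can be reused across decoding steps.
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("The theory of special relativity states", return_tensors="pt").to("cuda")

for _ in range(2):  # two warm-up runs trigger compilation
    model.generate(**inputs, max_new_tokens=16, cache_implementation="static")

# Subsequent calls run at the advertised speed-up.
outputs = model.generate(**inputs, max_new_tokens=64, cache_implementation="static")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```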
### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens, the 9B model was trained with 8 trillion tokens, and the 2B model was trained with 2 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models.
ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for [foundation models][foundation-models], including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma 2 PT 2B | Gemma 2 PT 9B | Gemma 2 PT 27B | | ------------------------------ | ------------- | ------------- | ------------- | -------------- | | [MMLU][mmlu] | 5-shot, top-1 | 51.3 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 73.0 | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 77.8 | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 51.9 | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 72.5 | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 70.9 | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 80.1 | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 55.4 | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 59.4 | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 16.7 | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 17.7 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 29.6 | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 23.9 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 15.0 | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 30.6 | 52.8 | 55.1 | | [DROP][drop] | 3-shot, F1 | 52.0 | 69.4 | 72.2 | | [BIG-Bench][big-bench] | 3-shot, CoT | 41.9 | 68.2 | 74.9 | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, and large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here.
#### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 2B | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | ------------- | ------------- | -------------- | | [RealToxicity][realtox] | average | 8.16 | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.67 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 83.20 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 69.31 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 52.91 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 43.72 | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 59.28 | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 88.57 | 95.32 | 97.22 | | [Toxigen][toxigen] | | 48.32 | 39.30 | 38.42 | ## Dangerous Capability Evaluations ### Evaluation Approach We evaluated a range of dangerous capabilities: - **Offensive cybersecurity:** To assess the model's potential for misuse in cybersecurity contexts, we utilized publicly available Capture-the-Flag (CTF) platforms like InterCode-CTF and Hack the Box, as well as internally developed CTF challenges. These evaluations measure the model's ability to exploit vulnerabilities and gain unauthorized access in simulated environments. - **Self-proliferation:** We evaluated the model's capacity for self-proliferation by designing tasks that involve resource acquisition, code execution, and interaction with remote systems. These evaluations assess the model's ability to independently replicate and spread. - **Persuasion:** To evaluate the model's capacity for persuasion and deception, we conducted human persuasion studies. These studies involved scenarios that measure the model's ability to build rapport, influence beliefs, and elicit specific actions from human participants. ### Evaluation Results All evaluations are described in detail in [Evaluating Frontier Models for Dangerous Capabilities][eval-danger] and in brief in the [Gemma 2 technical report][tech-report].
| Evaluation | Capability | Gemma 2 IT 27B |
| -------------------------------- | ----------------------- | -------------- |
| InterCode-CTF | Offensive cybersecurity | 34/76 challenges |
| Internal CTF | Offensive cybersecurity | 1/13 challenges |
| Hack the Box | Offensive cybersecurity | 0/13 challenges |
| Self-proliferation early warning | Self-proliferation | 1/10 challenges |
| Charm offensive | Persuasion | Percent of participants agreeing: 81% interesting, 75% would speak again, 80% made personal connection |
| Click Links | Persuasion | 34% of participants |
| Find Info | Persuasion | 9% of participants |
| Run Code | Persuasion | 11% of participants |
| Money talks | Persuasion | £3.72 mean donation |
| Web of Lies | Persuasion | 18% mean shift towards correct belief, 1% mean shift towards incorrect belief |
## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, with input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem.
Risks identified and mitigations: * Perpetuation of biases: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases are encouraged. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [tech-report]: [rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma2]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [drop]: [big-bench]: [toxigen]: [eval-danger]:", + "model_explanation_gemini": "Generates English text for tasks like question answering, summarization, and reasoning, designed for resource-constrained environments like laptops or local cloud infrastructure.\n\nFeatures: \n- Text-to-text, decoder-only LLM \n- Open weights (pre-trained and instruction-tuned variants) \n- Optimized for small-scale deployment (laptops/desktops) \n- Supports 8-bit and 4-bit quantization \n- Torch compile compatibility (6x faster inference) \n\nComparison: \nLighter" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-2-9b-it.json b/model_data_json/google_gemma-2-9b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..a819428e61f31f831217f265eb339bd7b2387cdd --- /dev/null +++ b/model_data_json/google_gemma-2-9b-it.json @@ -0,0 +1,43 @@ +{ + "model_id": "google/gemma-2-9b-it", + "downloads": 336099, + "tags": [ + "transformers", + "safetensors", + "gemma2", + "text-generation", + "conversational", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:2110.08193", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:1804.06876", + "arxiv:2103.03874", + "arxiv:2304.06364", + "arxiv:2206.04615", + "arxiv:2203.09509", + "base_model:google/gemma-2-9b",
"base_model:finetune:google/gemma-2-9b", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license tags: - conversational base_model: google/gemma-2-9b --- # Gemma 2 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma] **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights for both pre-trained variants and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with: Then, copy the snippet from the section that is relevant for your usecase. #### Running with the API #### Running the model on a single / multi GPU You can ensure the correct chat template is applied by using as follows: #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Upcasting to _ #### Running the model through a CLI The local-gemma repository contains a lightweight wrapper around Transformers for running Gemma 2 through a command line interface, or CLI. Follow the installation instructions for getting started, then launch the CLI through the following command: #### Quantized Versions through
Using 8-bit precision (int8)
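A minimal int8 sketch for this checkpoint, assuming `bitsandbytes` is installed; `apply_chat_template` supplies the chat markers described in the Chat Template section below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=quantization_config,
)

# The instruction-tuned model expects the Gemma chat format;
# apply_chat_template inserts the <start_of_turn>/<end_of_turn> markers.
messages = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
input_ids = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```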
Using 4-bit precision
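A 4-bit variant of the same sketch, again assuming `bitsandbytes`; the compute dtype is an illustrative choice:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=quantization_config,
)

messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```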
#### Advanced Usage
Torch compile Torch compile is a method for speeding up the inference of PyTorch modules. The Gemma-2 model can be run up to 6x faster by leveraging torch compile. Note that two warm-up steps are required before the full inference speed is realised. For more details, refer to the Transformers documentation.
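A sketch of the static-cache-plus-compile pattern for this checkpoint, assuming a recent `transformers` release; the compile mode and token counts are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it", torch_dtype=torch.bfloat16, device_map="auto"
)

# Compile once; a static KV cache keeps shapes fixed so the graph is reused.
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

chat = [{"role": "user", "content": "Explain KV caching in one sentence."}]
inputs = tokenizer.apply_chat_template(
    chat, return_tensors="pt", add_generation_prompt=True
).to(model.device)

for _ in range(2):  # warm-up: compilation happens on the first calls
    model.generate(inputs, max_new_tokens=8, cache_implementation="static")

outputs = model.generate(inputs, max_new_tokens=64, cache_implementation="static")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```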
### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a `<start_of_turn>` delimiter and then the role of the entity (either `user`, for content supplied by the user, or `model` for LLM responses). Turns finish with the `<end_of_turn>` token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of [Tensor Processing Unit (TPU)][tpu] hardware (TPUv5p). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models.
You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for [foundation models][foundation-models], including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | Gemma PT 9B | Gemma PT 27B | | ------------------------------ | ------------- | ----------- | ------------ | | [MMLU][mmlu] | 5-shot, top-1 | 71.3 | 75.2 | | [HellaSwag][hellaswag] | 10-shot | 81.9 | 86.4 | | [PIQA][piqa] | 0-shot | 81.7 | 83.2 | | [SocialIQA][socialiqa] | 0-shot | 53.4 | 53.7 | | [BoolQ][boolq] | 0-shot | 84.2 | 84.8 | | [WinoGrande][winogrande] | partial score | 80.6 | 83.7 | | [ARC-e][arc] | 0-shot | 88.0 | 88.6 | | [ARC-c][arc] | 25-shot | 68.4 | 71.4 | | [TriviaQA][triviaqa] | 5-shot | 76.6 | 83.7 | | [Natural Questions][naturalq] | 5-shot | 29.2 | 34.5 | | [HumanEval][humaneval] | pass@1 | 40.2 | 51.8 | | [MBPP][mbpp] | 3-shot | 52.4 | 62.6 | | [GSM8K][gsm8k] | 5-shot, maj@1 | 68.6 | 74.0 | | [MATH][math] | 4-shot | 36.6 | 42.3 | | [AGIEval][agieval] | 3-5-shot | 52.8 | 55.1 | | [BIG-Bench][big-bench] | 3-shot, CoT | 68.2 | 74.9 | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as [WinoBias][winobias] and [BBQ Dataset][bbq]. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks.
### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting [internal policies][safety-policies] for categories such as child safety, content safety, representational harms, memorization, and large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. #### Gemma 2.0 | Benchmark | Metric | Gemma 2 IT 9B | Gemma 2 IT 27B | | ------------------------ | ------------- | --------------- | ---------------- | | [RealToxicity][realtox] | average | 8.25 | 8.84 | | [CrowS-Pairs][crows] | top-1 | 37.47 | 36.67 | | [BBQ Ambig][bbq] | 1-shot, top-1 | 88.58 | 85.99 | | [BBQ Disambig][bbq] | top-1 | 82.67 | 86.94 | | [Winogender][winogender] | top-1 | 79.17 | 77.22 | | [TruthfulQA][truthfulqa] | | 50.27 | 51.60 | | [Winobias 1_2][winobias] | | 78.09 | 81.94 | | [Winobias 2_2][winobias] | | 95.32 | 97.22 | | [Toxigen][toxigen] | | 39.30 | 38.42 | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language.
They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, with input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the [Responsible Generative AI Toolkit][rai-toolkit]. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases are encouraged. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives.
[rai-toolkit]: [kaggle-gemma]: [terms]: [vertex-mg-gemma]: [sensitive-info]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [foundation-models]: [gemini-2-paper]: [mmlu]: [hellaswag]: [piqa]: [socialiqa]: [boolq]: [winogrande]: [commonsenseqa]: [openbookqa]: [arc]: [triviaqa]: [naturalq]: [humaneval]: [mbpp]: [gsm8k]: [realtox]: [bold]: [crows]: [bbq]: [winogender]: [truthfulqa]: [winobias]: [math]: [agieval]: [big-bench]: [toxigen]:", + "model_explanation_gemini": "A lightweight, decoder-only text-generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment in resource-limited environments.\n\n**Features:** \n- Instruction-tuned for conversational use \n- Supports English text generation \n- Compatible with single/multi-GPU setups \n- Offers quantization options (4-bit, 8-bit) \n- Includes chat template for structured dialogue \n\n**Comparison:** \nSmaller and more deployable than Gemini models but built from similar research, prioritizing accessibility" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-2b.json b/model_data_json/google_gemma-2b.json new file mode 100644 index 0000000000000000000000000000000000000000..de618743a8d651de5cd26344839fb4d9de88851e --- /dev/null +++ b/model_data_json/google_gemma-2b.json @@ -0,0 +1,41 @@ +{ + "model_id": "google/gemma-2b", + "downloads": 397089, + "tags": [ + "transformers", + "safetensors", + "gguf", + "gemma", + "text-generation", + "arxiv:2312.11805", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2304.06364", + "arxiv:2206.04615", + "arxiv:1804.06876", + "arxiv:2110.08193", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:2203.09509", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers new_version: google/gemma-2-2b license: gemma extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the 2B base version of the Gemma model. You can also visit the model card of the 7B base model, 7B instruct model, and 2B instruct model. **Resources and Technical Documentation**: * Gemma Technical Report * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning.
Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. ### Context Length Models are trained on a context length of 8192 tokens. ### Usage Below we share some code snippets on how to get quickly started with running the model. First make sure to `pip install -U transformers`, then copy the snippet from the section that is relevant for your use case. #### Fine-tuning the model You can find fine-tuning scripts and a notebook under the directory of the []() repository. To adapt them to this model, simply change the model-id to `google/gemma-2b`. In that repository, we provide: * A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset #### Running the model on a CPU #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions * _Using `torch.float16`_ * _Using `torch.bfloat16`_ #### Quantized Versions through `bitsandbytes` * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install `flash-attn` in your environment ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content. * Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs.
They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. ### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot | 71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BoolQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure.
* Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, and large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. **Update**: These numbers reflect the updated v1.1 IT models. For the original v1 numbers, please consult the technical report's appendix. | Benchmark | Metric | Gemma v1.1 IT 2B | Gemma v1.1 IT 7B | | ------------------------------ | ------------- | ----------- | --------- | | RealToxicity | average | 6.86 | 7.90 | | BOLD | | 45.57 | 49.08 | | CrowS-Pairs | top-1 | 45.82 | 51.33 | | BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 | | BBQ Disambig | top-1 | 54.62 | 71.99 | | Winogender | top-1 | 51.25 | 54.17 | | TruthfulQA | | 31.81 | 44.84 | | Winobias 1_2 | | 56.12 | 59.09 | | Winobias 2_2 | | 91.10 | 92.23 | | Toxigen | | 29.77 | 39.59 | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. * Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language.
* Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines for responsible use are provided with the model; see the Responsible Generative AI Toolkit. * Transparency and Accountability: * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. * Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models.
Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives.", + "model_explanation_gemini": "\"Gemma-2b is a lightweight, decoder-only text generation model designed for tasks like question answering, summarization, and reasoning, optimized for deployment on resource-limited devices.\"\n\nFeatures: \n- Text-to-text generation (English) \n- Decoder-only architecture \n- 2B parameter base model \n- 8192-token context length \n- Trained on diverse datasets (web docs, code, math) \n- Supports fine-tuning and quantization \n\nComparison: \nSmaller than the" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-12b-it.json b/model_data_json/google_gemma-3-12b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..8c665274b65f928365ef061628af4247212e48a7 --- /dev/null +++ b/model_data_json/google_gemma-3-12b-it.json @@ -0,0 +1,47 @@ +{ + "model_id": "google/gemma-3-12b-it", + "downloads": 354924, + "tags": [ + "transformers", + "safetensors", + "gemma3", + "image-text-to-text", + "conversational", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "base_model:google/gemma-3-12b-pt", + "base_model:finetune:google/gemma-3-12b-pt", + "license:gemma", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-12b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. 
Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below are some code snippets to help you quickly get started running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference as follows. With instruction-tuned models, you need to use chat templates to process your inputs first. Then, you can pass them to the pipeline (see the sketch below). #### Running the model on a single / multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMs) requires significant computational power.
TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results.
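As a concrete counterpart to the pipeline-and-chat-template flow described in the Usage section above, here is a minimal sketch; the device, dtype, sample image URL, and generation length are illustrative assumptions, not values taken from this card:

```python
# Minimal sketch: image-text-to-text inference via the transformers pipeline.
# Assumes transformers >= 4.50.0 and access to the gated checkpoint.
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-12b-it",
    device="cuda",              # assumption: a single CUDA GPU is available
    torch_dtype=torch.bfloat16,
)

# Instruction-tuned Gemma 3 expects chat-formatted input; the pipeline
# applies the chat template internally.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # hypothetical image
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

output = pipe(text=messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])
```

The same message structure is consumed by the tokenizer's chat template when the model is loaded directly instead of through the pipeline.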
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image-to-text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image-to-text prompts covering safety policies including harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image-to-text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development-level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High-level findings are fed back to the model team, but prompt sets are held out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines for responsible use are provided with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques.
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:", + "model_explanation_gemini": "A lightweight, multimodal model handling text and image inputs to generate text outputs, optimized for tasks like question answering, summarization, and reasoning with a 128K context window and multilingual support.\n\n**Features:** \n- Multimodal (text and image inputs, text outputs) \n- 128K token context window (12B size) \n- Supports 140+ languages \n- Instruction-tuned variants available \n- Optimized for resource-limited deployment \n\n**Comparison:** \nGemma 3 offers" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-1b-it.json b/model_data_json/google_gemma-3-1b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..30aa3cb03220fe85ad36292439302651beddb856 --- /dev/null +++ b/model_data_json/google_gemma-3-1b-it.json @@ -0,0 +1,47 @@ +{ + "model_id": "google/gemma-3-1b-it", + "downloads": 2410194, + "tags": [ + "transformers", + "safetensors", + "gemma3_text", + "text-generation", + "conversational", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "base_model:google/gemma-3-1b-pt", + "base_model:finetune:google/gemma-3-1b-pt", + "license:gemma", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-1b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. 
Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below are some code snippets to help you quickly get started running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API With instruction-tuned models, you need to use chat templates to process your inputs first. Then, you can pass them to the pipeline (see the sketch below). #### Running the model on a single / multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies].
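Picking up the Usage section above, here is a minimal sketch of chat-style generation through the pipeline API; the device, dtype, prompt, and generation length are illustrative assumptions:

```python
# Minimal sketch: text generation with the instruction-tuned 1B model.
# Assumes transformers >= 4.50.0 and access to the gated checkpoint.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",
    device="cuda",              # the 1B model also runs comfortably on CPU
    torch_dtype=torch.bfloat16,
)

# Chat-formatted input; the pipeline applies Gemma's chat template for us.
messages = [
    {"role": "user", "content": "Summarize what a 128K context window enables in two sentences."},
]

output = pipe(messages, max_new_tokens=100)
print(output[0]["generated_text"][-1]["content"])
```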
## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMs) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results.
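The \"Running the model on a single / multi GPU\" heading in the Usage section above can likewise be made concrete by loading the model directly; device_map=\"auto\" shards the weights across whatever accelerators are visible. A minimal sketch, with the prompt and generation settings as assumptions:

```python
# Minimal sketch: direct model loading for single- or multi-GPU inference.
# Assumes transformers >= 4.50.0; accelerate is needed for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # places layers on available GPUs (or CPU) automatically
)

messages = [{"role": "user", "content": "Explain chat templates in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(inputs, max_new_tokens=120)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```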
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image-to-text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image-to-text prompts covering safety policies including harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image-to-text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development-level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High-level findings are fed back to the model team, but prompt sets are held out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines for responsible use are provided with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered for removal of certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques.
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:", + "model_explanation_gemini": "\"google_gemma-3-1b-it\" is a lightweight, multimodal AI model from Google's Gemma 3 family, designed for text generation and image understanding tasks like question answering and summarization, with support for 140+ languages and a 128K token context window.\n\nFeatures: \n- Multimodal (text and image inputs, text output) \n- 128K token context window \n- Supports 140+ languages \n- Instruction-tuned variant \n- Optimized" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-1b-pt.json b/model_data_json/google_gemma-3-1b-pt.json new file mode 100644 index 0000000000000000000000000000000000000000..ac5b083abe31d784c91027fbc3181f725bd0a8f0 --- /dev/null +++ b/model_data_json/google_gemma-3-1b-pt.json @@ -0,0 +1,44 @@ +{ + "model_id": "google/gemma-3-1b-pt", + "downloads": 154238, + "tags": [ + "transformers", + "safetensors", + "gemma3_text", + "text-generation", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "license:gemma", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: text-generation extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. 
Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Usage Below are some code snippets to help you quickly get started running the model (see the sketch near the end of this card). First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API #### Running the model on a single / multi GPU ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMs) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs.
They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is specially suitable for foundation models, including large language models like these ones. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results. ### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 
27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image to text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety:** Evaluation of text-to-text and image to text prompts covering safety policies including, harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image to text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development level evaluations, we conduct \"assurance evaluations\" which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High level findings are fed back to the model team, but prompt sets are held-out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. 
For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations, and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. - Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful.
- Guidelines are provided for responsible use with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered to remove certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations by using privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably sized open model alternatives.
[g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:", + "model_explanation_gemini": "Generates text and analyzes images with a 128K context window, supporting multilingual tasks like question answering and summarization in resource-limited environments.\n\nModel Features: \n- Multimodal (text and image input, text output) \n- 128K context window \n- Supports over 140 languages \n- Available in multiple sizes (1B, 4B, 12B, 27B parameters) \n- Optimized for deployment on limited-resource devices \n\nComparison: \nLarger than" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json b/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json new file mode 100644 index 0000000000000000000000000000000000000000..4e9eb348c2609be4dd814b708796bf3128df1a68 --- /dev/null +++ b/model_data_json/google_gemma-3-27b-it-qat-q4_0-gguf.json @@ -0,0 +1,46 @@ +{ + "model_id": "google/gemma-3-27b-it-qat-q4_0-gguf", + "downloads": 69446, + "tags": [ + "gguf", + "gemma", + "gemma3", + "image-text-to-text", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "base_model:google/gemma-3-27b-it", + "base_model:quantized:google/gemma-3-27b-it", + "license:gemma", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- license: gemma pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-27b-it tags: - gemma - gemma3 --- # Gemma 3 model card **Model Page**: Gemma > [!Note] > This repository corresponds to the 27 **instruction-tuned** version of the Gemma 3 model in GGUF format using Quantization Aware Training (QAT). > The GGUF corresponds to Q4_0 quantization. > > Thanks to QAT, the model is able to preserve similar quality as while significantly reducing the memory requirements > to load the model. > > You can find the half-precision version here. **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. 
Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large 128K-token context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below are some code snippets to help you get started quickly with running the model. **llama.cpp (text-only)** **llama.cpp (image input)** **ollama (text-only)** Using GGUFs with Ollama via Hugging Face does not support image inputs at the moment. Please check the docs on running gated repositories. (A minimal Python sketch for local GGUF inference is provided further below.) ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and how to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e).
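Referring back to the Usage section above, the following is a minimal, illustrative sketch of local text-only inference through the llama-cpp-python bindings. This is an assumption-laden illustration, not the card's official snippet: the GGUF file name is a hypothetical placeholder (check the repository's file listing for the actual artifact name), and llama-cpp-python is just one of several runtimes that can load this file.

```python
# Minimal sketch: text-only chat with a locally downloaded Q4_0 GGUF file
# via llama-cpp-python. Q4_0 stores roughly 4.5 bits per weight, so the 27B
# weights occupy on the order of 15-16 GB, versus ~54 GB in bfloat16 --
# the memory saving that QAT is meant to preserve quality under.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-3-27b-it-q4_0.gguf",  # hypothetical local file name
    n_ctx=8192,  # context to allocate here; the model itself supports up to 128K
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain QAT quantization in two sentences."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```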
Training vision-language models (VLMs) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results.
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image-to-text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety**: Evaluation of text-to-text and image-to-text prompts covering safety policies including harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image-to-text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development-level evaluations, we conduct \"assurance evaluations\", which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High-level findings are fed back to the model team, but prompt sets are held out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common-sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are described and reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases (a minimal sketch of such a gate follows below). - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered to remove certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations by using privacy-preserving techniques.
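As a concrete illustration of the content-safety mitigation above, here is a minimal sketch of gating generation behind a moderation check. Everything here is an assumption for illustration: `is_policy_violation` is a hypothetical placeholder for whatever moderation classifier or policy service a deployment actually uses, and is not part of any Gemma release.

```python
# Minimal sketch of a content-safety gate around a text-generation callable.
def is_policy_violation(text: str) -> bool:
    # Placeholder: in practice, call a moderation classifier or policy service.
    banned_terms = ("example-banned-term",)
    return any(term in text.lower() for term in banned_terms)

def safe_generate(generate_fn, prompt: str, refusal: str = "I can't help with that.") -> str:
    if is_policy_violation(prompt):      # screen the user input
        return refusal
    response = generate_fn(prompt)
    if is_policy_violation(response):    # screen the model output
        return refusal
    return response
```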
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:", + "model_explanation_gemini": "A 27B-parameter instruction-tuned multimodal Gemma 3 model quantized to 4-bit (GGUF Q4_0) for efficient deployment, handling text/image inputs (128K context) to generate text outputs for tasks like QA, summarization, and image understanding.\n\n**Features**: \n- **Multimodal**: Processes text and images (896x896 resolution). \n- **Quantized**: 4-bit GGUF format via QAT reduces memory usage while preserving quality." +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-27b-it.json b/model_data_json/google_gemma-3-27b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..bd344d189a49bee952b64237d8e519e541418cf7 --- /dev/null +++ b/model_data_json/google_gemma-3-27b-it.json @@ -0,0 +1,47 @@ +{ + "model_id": "google/gemma-3-27b-it", + "downloads": 395187, + "tags": [ + "transformers", + "safetensors", + "gemma3", + "image-text-to-text", + "conversational", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "base_model:google/gemma-3-27b-pt", + "base_model:finetune:google/gemma-3-27b-pt", + "license:gemma", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-27b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. 
Gemma 3 has a large 128K-token context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below are some code snippets to help you get started quickly with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference as follows. With instruction-tuned models, you need to use chat templates to process your inputs first. Then, you can pass it to the pipeline (a minimal sketch is provided further below). #### Running the model on a single/multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and how to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals.
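As referenced in the Usage section above, here is a minimal sketch of running the instruction-tuned model through the high-level pipeline API. It assumes transformers >= 4.50.0, a CUDA device, and a placeholder image URL; it is an illustration under those assumptions, not the card's official snippet.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-27b-it",
    device="cuda",
    torch_dtype=torch.bfloat16,
)

# Instruction-tuned checkpoints expect chat-formatted inputs.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/image.jpg"},  # placeholder URL
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

output = pipe(text=messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])
```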
### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMs) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results.
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image-to-text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety**: Evaluation of text-to-text and image-to-text prompts covering safety policies including harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image-to-text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development-level evaluations, we conduct \"assurance evaluations\", which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High-level findings are fed back to the model team, but prompt sets are held out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common-sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are described and reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered to remove certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations by using privacy-preserving techniques.
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [sustainability]: [gemini-2-paper]:", + "model_explanation_gemini": "\"google_gemma-3-27b-it\" is a lightweight, multimodal AI model that processes text and images (896x896 resolution) to generate text outputs, supporting 140+ languages and a 128K token context window for tasks like question answering, summarization, and reasoning. \n\n**Features**: \n- Multimodal (text + image inputs, text output) \n- 128K token context window (27B size) \n- Supports 140+ languages \n-" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-3-4b-it.json b/model_data_json/google_gemma-3-4b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..80d5844dbc92ef0729f7bc881bc0fc0e174b2ecf --- /dev/null +++ b/model_data_json/google_gemma-3-4b-it.json @@ -0,0 +1,47 @@ +{ + "model_id": "google/gemma-3-4b-it", + "downloads": 576385, + "tags": [ + "transformers", + "safetensors", + "gemma3", + "image-text-to-text", + "conversational", + "arxiv:1905.07830", + "arxiv:1905.10044", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1705.03551", + "arxiv:1911.01547", + "arxiv:1907.10641", + "arxiv:1903.00161", + "arxiv:2009.03300", + "arxiv:2304.06364", + "arxiv:2103.03874", + "arxiv:2110.14168", + "arxiv:2311.12022", + "arxiv:2108.07732", + "arxiv:2107.03374", + "arxiv:2210.03057", + "arxiv:2106.03193", + "arxiv:1910.11856", + "arxiv:2502.12404", + "arxiv:2502.21228", + "arxiv:2404.16816", + "arxiv:2104.12756", + "arxiv:2311.16502", + "arxiv:2203.10244", + "arxiv:2404.12390", + "arxiv:1810.12440", + "arxiv:1908.02660", + "arxiv:2312.11805", + "base_model:google/gemma-3-4b-pt", + "base_model:finetune:google/gemma-3-4b-pt", + "license:gemma", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: gemma library_name: transformers pipeline_tag: image-text-to-text extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-3-4b-pt --- # Gemma 3 model card **Model Page**: Gemma **Resources and Technical Documentation**: * [Gemma 3 Technical Report][g3-tech-report] * [Responsible Generative AI Toolkit][rai-toolkit] * [Gemma on Kaggle][kaggle-gemma] * [Gemma on Vertex Model Garden][vertex-mg-gemma3] **Terms of Use**: [Terms][terms] **Authors**: Google DeepMind ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. 
Gemma 3 has a large 128K-token context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Inputs and outputs - **Input:** - Text string, such as a question, a prompt, or a document to be summarized - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and 32K tokens for the 1B size - **Output:** - Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document - Total output context of 8192 tokens ### Usage Below are some code snippets to help you get started quickly with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0. Then, copy the snippet from the section that is relevant for your use case. #### Running with the API You can initialize the model and processor for inference as follows. With instruction-tuned models, you need to use chat templates to process your inputs first. Then, you can pass it to the pipeline (a minimal sketch is provided further below). #### Running the model on a single/multi GPU ### Citation ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 14 trillion tokens, the 12B model with 12 trillion tokens, the 4B model with 4 trillion tokens, and the 1B model with 2 trillion tokens. Here are the key components: - Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in over 140 languages. - Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions. - Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and how to address mathematical queries. - Images: A wide range of images enables the model to perform image analysis and visual data extraction tasks. The combination of these diverse data sources is crucial for training a powerful multimodal model that can handle a wide variety of different tasks and data formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. - Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. - Additional methods: Filtering based on content quality and safety in line with [our policies][safety-policies]. ## Implementation Information Details about the model internals.
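As referenced in the Usage section above, here is a minimal sketch of single-GPU inference with the lower-level model and processor classes. It assumes transformers >= 4.50.0 and its Gemma 3 integration (the `Gemma3ForConditionalGeneration` class); the image URL is a placeholder, and the sketch is illustrative rather than the card's official snippet.

```python
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "google/gemma-3-4b-it"
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
).eval()
processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/image.jpg"},  # placeholder URL
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

# The processor's chat template handles both turn formatting and the image.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    generated = model.generate(**inputs, max_new_tokens=100)

# Strip the prompt tokens before decoding the model's reply.
new_tokens = generated[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
```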
### Hardware Gemma was trained using [Tensor Processing Unit (TPU)][tpu] hardware (TPUv4p, TPUv5p and TPUv5e). Training vision-language models (VLMs) requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: - Performance: TPUs are specifically designed to handle the massive computations involved in training VLMs. They can speed up training considerably compared to CPUs. - Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. - Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. - Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. - These advantages are aligned with [Google's commitments to operate sustainably][sustainability]. ### Software Training was done using [JAX][jax] and [ML Pathways][ml-pathways]. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the [paper about the Gemini family of models][gemini-2-paper]; *\"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\"* ## Evaluation Model evaluation metrics and results.
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: #### Reasoning and factuality | Benchmark | Metric | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:| | [HellaSwag][hellaswag] | 10-shot | 62.3 | 77.2 | 84.2 | 85.6 | | [BoolQ][boolq] | 0-shot | 63.2 | 72.3 | 78.8 | 82.4 | | [PIQA][piqa] | 0-shot | 73.8 | 79.6 | 81.8 | 83.3 | | [SocialIQA][socialiqa] | 0-shot | 48.9 | 51.9 | 53.4 | 54.9 | | [TriviaQA][triviaqa] | 5-shot | 39.8 | 65.8 | 78.2 | 85.5 | | [Natural Questions][naturalq] | 5-shot | 9.48 | 20.0 | 31.4 | 36.1 | | [ARC-c][arc] | 25-shot | 38.4 | 56.2 | 68.9 | 70.6 | | [ARC-e][arc] | 0-shot | 73.0 | 82.4 | 88.3 | 89.0 | | [WinoGrande][winogrande] | 5-shot | 58.2 | 64.7 | 74.3 | 78.8 | | [BIG-Bench Hard][bbh] | few-shot | 28.4 | 50.9 | 72.6 | 77.7 | | [DROP][drop] | 1-shot | 42.4 | 60.1 | 72.2 | 77.2 | [hellaswag]: [boolq]: [piqa]: [socialiqa]: [triviaqa]: [naturalq]: [arc]: [winogrande]: [bbh]: [drop]: #### STEM and code | Benchmark | Metric | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |----------------|:-------------:|:--------------:|:--------------:| | [MMLU][mmlu] | 5-shot | 59.6 | 74.5 | 78.6 | | [MMLU][mmlu] (Pro COT) | 5-shot | 29.2 | 45.3 | 52.2 | | [AGIEval][agieval] | 3-5-shot | 42.1 | 57.4 | 66.2 | | [MATH][math] | 4-shot | 24.2 | 43.3 | 50.0 | | [GSM8K][gsm8k] | 8-shot | 38.4 | 71.0 | 82.6 | | [GPQA][gpqa] | 5-shot | 15.0 | 25.4 | 24.3 | | [MBPP][mbpp] | 3-shot | 46.0 | 60.4 | 65.6 | | [HumanEval][humaneval] | 0-shot | 36.0 | 45.7 | 48.8 | [mmlu]: [agieval]: [math]: [gsm8k]: [gpqa]: [mbpp]: [humaneval]: #### Multilingual | Benchmark | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:| | [MGSM][mgsm] | 2.04 | 34.7 | 64.3 | 74.3 | | [Global-MMLU-Lite][global-mmlu-lite] | 24.9 | 57.0 | 69.4 | 75.7 | | [WMT24++][wmt24pp] (ChrF) | 36.7 | 48.4 | 53.9 | 55.7 | | [FloRes][flores] | 29.5 | 39.2 | 46.0 | 48.8 | | [XQuAD][xquad] (all) | 43.9 | 68.0 | 74.5 | 76.8 | | [ECLeKTic][eclektic] | 4.69 | 11.0 | 17.2 | 24.4 | | [IndicGenBench][indicgenbench] | 41.4 | 57.2 | 61.7 | 63.4 | [mgsm]: [flores]: [xquad]: [global-mmlu-lite]: [wmt24pp]: [eclektic]: [indicgenbench]: #### Multimodal | Benchmark | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B | | ------------------------------ |:-------------:|:--------------:|:--------------:| | [COCOcap][coco-cap] | 102 | 111 | 116 | | [DocVQA][docvqa] (val) | 72.8 | 82.3 | 85.6 | | [InfoVQA][info-vqa] (val) | 44.1 | 54.8 | 59.4 | | [MMMU][mmmu] (pt) | 39.2 | 50.3 | 56.1 | | [TextVQA][textvqa] (val) | 58.9 | 66.5 | 68.6 | | [RealWorldQA][realworldqa] | 45.5 | 52.2 | 53.9 | | [ReMI][remi] | 27.3 | 38.5 | 44.8 | | [AI2D][ai2d] | 63.2 | 75.2 | 79.0 | | [ChartQA][chartqa] | 63.6 | 74.7 | 76.3 | | [VQAv2][vqav2] | 63.9 | 71.2 | 72.9 | | [BLINK][blinkvqa] | 38.0 | 35.9 | 39.6 | | [OKVQA][okvqa] | 51.0 | 58.7 | 60.2 | | [TallyQA][tallyqa] | 42.5 | 51.8 | 54.3 | | [SpatialSense VQA][ss-vqa] | 50.9 | 60.0 | 59.4 | | [CountBenchQA][countbenchqa] | 26.1 | 17.8 | 68.0 | [coco-cap]: [docvqa]: [info-vqa]: [mmmu]: [textvqa]: [realworldqa]: [remi]: [ai2d]: [chartqa]: [vqav2]: [blinkvqa]: [okvqa]: [tallyqa]: [ss-vqa]: [countbenchqa]: ## Ethics and Safety Ethics and 
safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: - **Child Safety**: Evaluation of text-to-text and image-to-text prompts covering child safety policies, including child sexual abuse and exploitation. - **Content Safety**: Evaluation of text-to-text and image-to-text prompts covering safety policies including harassment, violence and gore, and hate speech. - **Representational Harms**: Evaluation of text-to-text and image-to-text prompts covering safety policies including bias, stereotyping, and harmful associations or inaccuracies. In addition to development-level evaluations, we conduct \"assurance evaluations\", which are our 'arms-length' internal evaluations for responsibility governance decision making. They are conducted separately from the model development team, to inform decision making about release. High-level findings are fed back to the model team, but prompt sets are held out to prevent overfitting and preserve the results' ability to inform decision making. Assurance evaluation results are reported to our Responsibility & Safety Council as part of release review. ### Evaluation Results For all areas of safety testing, we saw major improvements in the categories of child safety, content safety, and representational harms relative to previous Gemma models. All testing was conducted without safety filters to evaluate the model capabilities and behaviors. For both text-to-text and image-to-text, and across all model sizes, the model produced minimal policy violations and showed significant improvements over previous Gemma models' performance with respect to ungrounded inferences. A limitation of our evaluations was that they included only English-language prompts. ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open vision-language models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use cases that the model creators considered as part of model training and development. - Content Creation and Communication - Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. - Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. - Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. - Image Data Extraction: These models can be used to extract, interpret, and summarize visual data for text communications. - Research and Education - Natural Language Processing (NLP) and VLM Research: These models can serve as a foundation for researchers to experiment with VLM and NLP techniques, develop algorithms, and contribute to the advancement of the field. - Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
- Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations - Training Data - The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. - The scope of the training dataset determines the subject areas the model can handle effectively. - Context and Task Complexity - Models are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. - A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). - Language Ambiguity and Nuance - Natural language is inherently complex. Models might struggle to grasp subtle nuances, sarcasm, or figurative language. - Factual Accuracy - Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. - Common Sense - Models rely on statistical patterns in language. They might lack the ability to apply common-sense reasoning in certain situations. ### Ethical Considerations and Risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: - Bias and Fairness - VLMs trained on large-scale, real-world text and image data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and posterior evaluations are described and reported in this card. - Misinformation and Misuse - VLMs can be misused to generate text that is false, misleading, or harmful. - Guidelines are provided for responsible use with the model; see the [Responsible Generative AI Toolkit][rai-toolkit]. - Transparency and Accountability: - This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. - A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: - **Perpetuation of biases**: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases. - **Generation of harmful content**: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. - **Misuse for malicious purposes**: Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the [Gemma Prohibited Use Policy][prohibited-use]. - **Privacy violations**: Models were trained on data filtered to remove certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations by using privacy-preserving techniques.
### Benefits At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives. [g3-tech-report]: [rai-toolkit]: [kaggle-gemma]: [vertex-mg-gemma3]: [terms]: [safety-policies]: [prohibited-use]: [tpu]: [sustainability]: [jax]: [ml-pathways]: [gemini-2-paper]:", + "model_explanation_gemini": "Generates text outputs from text or image inputs, supporting multilingual tasks like question answering, summarization, and reasoning with a 128K context window. \n\n**Features:** \n- Multimodal (text/image input, text output) \n- 128K token context window (4B/12B/27B sizes) \n- Supports 140+ languages \n- Lightweight for local/cloud deployment \n- Instruction-tuned variants available \n\n**Comparison:** \nMore sizes and longer context than" +} \ No newline at end of file diff --git a/model_data_json/google_gemma-7b-it.json b/model_data_json/google_gemma-7b-it.json new file mode 100644 index 0000000000000000000000000000000000000000..d3be1b145d549b5502ea8fe0c2aef609ce2c98d0 --- /dev/null +++ b/model_data_json/google_gemma-7b-it.json @@ -0,0 +1,44 @@ +{ + "model_id": "google/gemma-7b-it", + "downloads": 80370, + "tags": [ + "transformers", + "safetensors", + "gguf", + "gemma", + "text-generation", + "conversational", + "arxiv:2312.11805", + "arxiv:2009.03300", + "arxiv:1905.07830", + "arxiv:1911.11641", + "arxiv:1904.09728", + "arxiv:1905.10044", + "arxiv:1907.10641", + "arxiv:1811.00937", + "arxiv:1809.02789", + "arxiv:1911.01547", + "arxiv:1705.03551", + "arxiv:2107.03374", + "arxiv:2108.07732", + "arxiv:2110.14168", + "arxiv:2304.06364", + "arxiv:2206.04615", + "arxiv:1804.06876", + "arxiv:2110.08193", + "arxiv:2009.11462", + "arxiv:2101.11718", + "arxiv:1804.09301", + "arxiv:2109.07958", + "arxiv:2203.09509", + "base_model:google/gemma-7b", + "base_model:finetune:google/gemma-7b", + "license:gemma", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: gemma tags: [] widget: - messages: - role: user content: How does the brain work? inference: parameters: max_new_tokens: 200 extra_gated_heading: Access Gemma on Hugging Face extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license base_model: google/gemma-7b base_model_relation: finetune --- # Gemma Model Card **Model Page**: Gemma This model card corresponds to the 7B instruct version of the Gemma model. You can also visit the model card of the 2B base model, 7B base model, and 2B instruct model. **Resources and Technical Documentation**: * Responsible Generative AI Toolkit * Gemma on Kaggle * Gemma on Vertex Model Garden **Terms of Use**: Terms **Authors**: Google ## Model Information Summary description and brief definition of inputs and outputs. ### Description Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone. ### Usage Below we share some code snippets on how to get started quickly with running the model. First make sure to , then copy the snippet from the section that is relevant for your use case. #### Fine-tuning the model You can find fine-tuning scripts and a notebook under the directory of []( repository. To adapt it to this model, simply change the model-id to . In that repository, we provide: * A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA * A script to perform SFT using FSDP on TPU devices * A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset #### Running the model on a CPU As explained below, we recommend as the default dtype. You can use a different precision if necessary. #### Running the model on a single / multi GPU #### Running the model on a GPU using different precisions The native weights of this model were exported in precision. You can use , which may be faster on certain hardware, indicating the when loading the model. For convenience, the revision of the repo contains a copy of the weights already converted to that precision. You can also use if you skip the dtype, but no precision increase will occur (model weights will just be upcasted to ). See examples below. * _Using _ * _Using _ * _Upcasting to _ #### Quantized Versions through * _Using 8-bit precision (int8)_ * _Using 4-bit precision_ #### Other optimizations * _Flash Attention 2_ First make sure to install in your environment ### Chat Template The instruction-tuned models use a chat template that must be adhered to for conversational use. The easiest way to apply it is using the tokenizer's built-in chat template, as shown in the following snippet. Let's load the model and apply the chat template to a conversation. In this example, we'll start with a single user interaction: At this point, the prompt contains the following text: As you can see, each turn is preceded by a delimiter and then the role of the entity (either , for content supplied by the user, or for LLM responses). Turns finish with the token. You can follow this format to build the prompt manually, if you need to do it without the tokenizer's chat template. After the prompt is ready, generation can be performed like this: (A hedged end-to-end sketch combining these steps is included further below, after the Implementation Information section.) ### Inputs and outputs * **Input:** Text string, such as a question, a prompt, or a document to be summarized. * **Output:** Generated English-language text in response to the input, such as an answer to a question, or a summary of a document. ## Model Data Data used for model training and how the data was processed. ### Training Dataset These models were trained on a dataset of text data that includes a wide variety of sources, totaling 6 trillion tokens. Here are the key components: * Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content.
* Code: Exposing the model to code helps it learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions. * Mathematics: Training on mathematical text helps the model learn logical reasoning and symbolic representation, and address mathematical queries. The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats. ### Data Preprocessing Here are the key data cleaning and filtering methods applied to the training data: * CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and illegal content. * Sensitive Data Filtering: As part of making Gemma pre-trained models safe and reliable, automated techniques were used to filter out certain personal information and other sensitive data from training sets. * Additional methods: Filtering based on content quality and safety in line with our policies. ## Implementation Information Details about the model internals. ### Hardware Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain: * Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs. * Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality. * Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing. * Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training. * These advantages are aligned with Google's commitments to operate sustainably. ### Software Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these. Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models; \"the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow.\" ## Evaluation Model evaluation metrics and results.
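Before the benchmark tables, here is the hedged end-to-end sketch referenced in the Usage section above. It combines model loading with the tokenizer's built-in chat template; the dtype and device settings are illustrative assumptions rather than requirements, and the turn delimiters are inserted automatically by `apply_chat_template`:

```python
# Hedged sketch: load gemma-7b-it and generate from a chat-formatted prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; see the precision notes above
    device_map="auto",
)

chat = [{"role": "user", "content": "Write a haiku about gradient descent."}]
# The chat template inserts the turn delimiters and role markers for us.
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=150)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```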
### Benchmark Results These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation: | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | MMLU | 5-shot, top-1 | 42.3 | 64.3 | | HellaSwag | 0-shot | 71.4 | 81.2 | | PIQA | 0-shot | 77.3 | 81.2 | | SocialIQA | 0-shot | 49.7 | 51.8 | | BoolQ | 0-shot | 69.4 | 83.2 | | WinoGrande | partial score | 65.4 | 72.3 | | CommonsenseQA | 7-shot | 65.3 | 71.3 | | OpenBookQA | | 47.8 | 52.8 | | ARC-e | | 73.2 | 81.5 | | ARC-c | | 42.1 | 53.2 | | TriviaQA | 5-shot | 53.2 | 63.4 | | Natural Questions | 5-shot | 12.5 | 23 | | HumanEval | pass@1 | 22.0 | 32.3 | | MBPP | 3-shot | 29.2 | 44.4 | | GSM8K | maj@1 | 17.7 | 46.4 | | MATH | 4-shot | 11.8 | 24.3 | | AGIEval | | 24.2 | 41.7 | | BIG-Bench | | 35.2 | 55.1 | | ------------------------------ | ------------- | ----------- | --------- | | **Average** | | **45.0** | **56.9** | ## Ethics and Safety Ethics and safety evaluation approach and results. ### Evaluation Approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech. * Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and the BBQ Dataset. * Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure. * Large-scale harm: Tests for \"dangerous capabilities,\" such as chemical, biological, radiological, and nuclear (CBRN) risks. ### Evaluation Results The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, and large-scale harms. On top of robust internal evaluations, the results of well-known safety benchmarks like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here. | Benchmark | Metric | 2B Params | 7B Params | | ------------------------------ | ------------- | ----------- | --------- | | RealToxicity | average | 6.86 | 7.90 | | BOLD | | 45.57 | 49.08 | | CrowS-Pairs | top-1 | 45.82 | 51.33 | | BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 | | BBQ Disambig | top-1 | 54.62 | 71.99 | | Winogender | top-1 | 51.25 | 54.17 | | TruthfulQA | | 44.84 | 31.81 | | Winobias 1_2 | | 56.12 | 59.09 | | Winobias 2_2 | | 91.10 | 92.23 | | Toxigen | | 29.77 | 39.59 | | ------------------------------ | ------------- | ----------- | --------- | ## Usage and Limitations These models have certain limitations that users should be aware of. ### Intended Usage Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development.
* Content Creation and Communication * Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy, and email drafts. * Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications. * Text Summarization: Generate concise summaries of a text corpus, research papers, or reports. * Research and Education * Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. * Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice. * Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics. ### Limitations * Training Data * The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses. * The scope of the training dataset determines the subject areas the model can handle effectively. * Context and Task Complexity * LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point). * Language Ambiguity and Nuance * Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * Factual Accuracy * LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * Common Sense * LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations. ### Ethical Considerations and Risks The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, with input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * LLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * Perpetuation of biases: Continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases are encouraged. * Generation of harmful content: Mechanisms and guidelines for content safety are essential.
Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Benefits At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models. Using the benchmark evaluation metrics described in this document, these models have been shown to provide superior performance to other, comparably-sized open model alternatives.", + "model_explanation_gemini": "\"google_gemma-7b-it\" is a lightweight, instruction-tuned, decoder-only LLM for English text generation tasks like question answering, summarization, and reasoning, designed for efficient deployment on limited-resource environments. \n\n**Features**: \n- Text-to-text generation \n- Instruction-tuned variant (optimized for conversational use) \n- Supports fine-tuning (QLoRA, FSDP on TPU) \n- Compatible with CPU/GPU (multiple precisions: b" +} \ No newline at end of file diff --git a/model_data_json/google_long-t5-tglobal-base.json b/model_data_json/google_long-t5-tglobal-base.json new file mode 100644 index 0000000000000000000000000000000000000000..effc18a4482a634841c847832d388f583c31a5c5 --- /dev/null +++ b/model_data_json/google_long-t5-tglobal-base.json @@ -0,0 +1,21 @@ +{ + "model_id": "google/long-t5-tglobal-base", + "downloads": 73842, + "tags": [ + "transformers", + "pytorch", + "jax", + "longt5", + "text2text-generation", + "en", + "arxiv:2112.07916", + "arxiv:1912.08777", + "arxiv:1910.10683", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 language: en --- # LongT5 (transient-global attention, base-sized model) LongT5 model pre-trained on English language. The model was introduced in the paper LongT5: Efficient Text-To-Text Transformer for Long Sequences by Guo et al. and first released in the LongT5 repository. All the model architecture and configuration can be found in the Flaxformer repository, which uses another Google research project repository, T5x. Disclaimer: The team releasing LongT5 did not write a model card for this model, so this model card has been written by the Hugging Face team. ## Model description LongT5 model is an encoder-decoder transformer pre-trained in a text-to-text denoising generative setting (Pegasus-like generation pre-training). LongT5 model is an extension of the T5 model, and it enables using one of two different efficient attention mechanisms - (1) Local attention, or (2) Transient-Global attention. The usage of attention sparsity patterns allows the model to handle long input sequences efficiently. LongT5 is particularly effective when fine-tuned for text generation (summarization, question answering) which requires handling long input sequences (up to 16,384 tokens). ## Intended uses & limitations The model is mostly meant to be fine-tuned on a supervised dataset.
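Since the card's own usage snippet is not included here, the following is a hedged sketch of the supervised fine-tuning setup it describes, using the `LongT5ForConditionalGeneration` class from `transformers`; the document text, reference summary, and single training step are placeholders for a real training loop:

```python
# Hedged sketch: one supervised seq2seq step with the pre-trained checkpoint.
# Per the card, this model must be fine-tuned before use on a downstream task.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")

document = "..."  # placeholder: a long input, up to 16,384 tokens
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=16384)
labels = tokenizer("reference summary", return_tensors="pt").input_ids

# A real run would wrap this in a Trainer/optimizer loop; this shows one step.
loss = model(**inputs, labels=labels).loss
loss.backward()
```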
See the model hub to look for fine-tuned versions on a task that interests you. ### How to use ### BibTeX entry and citation info", + "model_explanation_gemini": "\"LongT5-tglobal-base is a text-to-text transformer model optimized for handling long sequences up to 16,384 tokens, using transient-global attention for efficient text generation tasks like summarization and question answering.\"\n\nFeatures: \n- Base-sized encoder-decoder transformer \n- Pre-trained with Pegasus-like denoising generation \n- Uses transient-global attention mechanism \n- Handles long input sequences (16K tokens) \n- Optimized for text generation tasks \n\nComparison: \nExtends" +} \ No newline at end of file diff --git a/model_data_json/google_mobilenet_v2_1.0_224.json b/model_data_json/google_mobilenet_v2_1.0_224.json new file mode 100644 index 0000000000000000000000000000000000000000..74e5ade6d5e25bd033e51e93191d35852292f21d --- /dev/null +++ b/model_data_json/google_mobilenet_v2_1.0_224.json @@ -0,0 +1,20 @@ +{ + "model_id": "google/mobilenet_v2_1.0_224", + "downloads": 72081, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "mobilenet_v2", + "image-classification", + "vision", + "dataset:imagenet-1k", + "arxiv:1801.04381", + "license:other", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: other tags: - vision - image-classification datasets: - imagenet-1k widget: - src: example_title: Tiger - src: example_title: Teapot - src: example_title: Palace --- # MobileNet V2 MobileNet V2 model pre-trained on ImageNet-1k at resolution 224x224. It was introduced in MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen. It was first released in this repository. Disclaimer: The team releasing MobileNet V2 did not write a model card for this model, so this model card has been written by the Hugging Face team. ## Model description From the original README: > MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embeddings and segmentation similar to how other popular large scale models, such as Inception, are used. MobileNets can be run efficiently on mobile devices [...] MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature. The checkpoints are named **mobilenet\\_v2\\_*depth*\\_*size***, for example **mobilenet\\_v2\\_1.0\\_224**, where **1.0** is the depth multiplier and **224** is the resolution of the input images the model was trained on. ## Intended uses & limitations You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: Note: This model actually predicts 1001 classes, the 1000 classes from ImageNet plus an extra “background” class (index 0). Currently, both the feature extractor and model support PyTorch.
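The original usage snippet is not reproduced above, so here is a hedged sketch of the classification flow it describes, using the current `AutoImageProcessor`/`AutoModelForImageClassification` API (older `transformers` versions exposed this via the feature-extractor API mentioned in the card); the COCO image URL is the one commonly used in Transformers documentation:

```python
# Hedged sketch: classify a COCO image into one of the ImageNet classes.
import torch
import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("google/mobilenet_v2_1.0_224")
model = AutoModelForImageClassification.from_pretrained("google/mobilenet_v2_1.0_224")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Index 0 is the extra "background" class mentioned in the note above.
print(model.config.id2label[logits.argmax(-1).item()])
```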
### BibTeX entry and citation info", + "model_explanation_gemini": "\"MobileNet V2 is a lightweight vision model pre-trained on ImageNet-1k for efficient image classification at 224x224 resolution, optimized for mobile devices with trade-offs between latency, size, and accuracy.\"\n\nModel Features: \n- Architecture: MobileNet V2 (Inverted Residuals, Linear Bottlenecks) \n- Input Resolution: 224x224 \n- Pretraining Dataset: ImageNet-1k (1000 classes + 1 background class) \n- Use Case" +} \ No newline at end of file diff --git a/model_data_json/google_paligemma-3b-mix-224.json b/model_data_json/google_paligemma-3b-mix-224.json new file mode 100644 index 0000000000000000000000000000000000000000..215690166893e21fbef6d288d5f38c81636db1cf --- /dev/null +++ b/model_data_json/google_paligemma-3b-mix-224.json @@ -0,0 +1,36 @@ +{ + "model_id": "google/paligemma-3b-mix-224", + "downloads": 176245, + "tags": [ + "transformers", + "safetensors", + "paligemma", + "image-text-to-text", + "arxiv:2310.09199", + "arxiv:2303.15343", + "arxiv:2403.08295", + "arxiv:1706.03762", + "arxiv:2010.11929", + "arxiv:2209.06794", + "arxiv:2209.04372", + "arxiv:2103.01913", + "arxiv:2205.12522", + "arxiv:2110.11624", + "arxiv:2108.03353", + "arxiv:2010.04295", + "arxiv:2401.06209", + "arxiv:2305.10355", + "arxiv:2203.10244", + "arxiv:1810.12440", + "arxiv:1905.13648", + "arxiv:1608.00272", + "arxiv:1908.04913", + "arxiv:2407.07726", + "license:gemma", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: gemma pipeline_tag: image-text-to-text extra_gated_heading: Access PaliGemma on Hugging Face extra_gated_prompt: To access PaliGemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license --- # PaliGemma model card **Model page:** PaliGemma Transformers PaliGemma 3B weights, fine-tuned with 224*224 input images and 256 token input/output text sequences on a mixture of downstream academic datasets. The models are available in float32, bfloat16 and float16 format for research purposes only. **Resources and technical documentation:** * Responsible Generative AI Toolkit * PaliGemma on Kaggle * PaliGemma on Vertex Model Garden **Terms of Use:** Terms **Authors:** Google ## Model information ### Model summary #### Description PaliGemma is a versatile and lightweight vision-language model (VLM) inspired by PaLI-3 and based on open components such as the SigLIP vision model and the Gemma language model. It takes both image and text as input and generates text as output, supporting multiple languages. It is designed for class-leading fine-tune performance on a wide range of vision-language tasks such as image and short video caption, visual question answering, text reading, object detection and object segmentation. #### Model architecture PaliGemma is the composition of a Transformer decoder and a Vision Transformer image encoder, with a total of 3 billion params. The text decoder is initialized from Gemma-2B. The image encoder is initialized from SigLIP-So400m/14. PaliGemma is trained following the PaLI-3 recipes. #### Inputs and outputs * **Input:** Image and text string, such as a prompt to caption the image, or a question. 
* **Output:** Generated text in response to the input, such as a caption of the image, an answer to a question, a list of object bounding box coordinates, or segmentation codewords. ### Model data #### Pre-train datasets PaliGemma is pre-trained on the following mixture of datasets: * **WebLI:** WebLI (Web Language Image) is a web-scale multilingual image-text dataset built from the public web. A wide range of WebLI splits are used to acquire versatile model capabilities, such as visual semantic understanding, object localization, visually-situated text understanding, multilinguality, etc. * **CC3M-35L:** Curated English image-alt_text pairs from webpages (Sharma et al., 2018). We used the Google Cloud Translation API to translate into 34 additional languages. * **VQ²A-CC3M-35L/VQG-CC3M-35L:** A subset of VQ2A-CC3M (Changpinyo et al., 2022a), translated into the same additional 34 languages as CC3M-35L, using the Google Cloud Translation API. * **OpenImages:** Detection and object-aware questions and answers (Piergiovanni et al. 2022) generated by handcrafted rules on the [OpenImages dataset]. * **WIT:** Images and texts collected from Wikipedia (Srinivasan et al., 2021). [OpenImages dataset]: #### Data responsibility filtering The following filters are applied to WebLI, with the goal of training PaliGemma on clean data: * **Pornographic image filtering:** This filter removes images deemed to be of pornographic nature. * **Text safety filtering:** We identify and filter out images that are paired with unsafe text. Unsafe text is any text deemed to contain or be about CSAI, pornography, vulgarities, or otherwise offensive. * **Text toxicity filtering:** We further use the Perspective API to identify and filter out images that are paired with text deemed insulting, obscene, hateful or otherwise toxic. * **Text personal information filtering:** We filtered certain personal information and other sensitive data using Cloud Data Loss Prevention (DLP) API to protect the privacy of individuals. Identifiers such as social security numbers and [other sensitive information types] were removed. * **Additional methods:** Filtering based on content quality and safety in line with our policies and practices. [other sensitive information types]: ## How to Use PaliGemma is a single-turn vision language model not meant for conversational use, and it works best when fine-tuning to a specific use case. You can configure which task the model will solve by conditioning it with task prefixes, such as “detect” or “segment”. The pretrained models were trained in this fashion to imbue them with a rich set of capabilities (question answering, captioning, segmentation, etc.). However, they are not designed to be used directly, but to be transferred (by fine-tuning) to specific tasks using a similar prompt structure. For interactive testing, you can use the \"mix\" family of models, which have been fine-tuned on a mixture of tasks. To see model google/paligemma-3b-mix-448 in action, check this Space that uses the Transformers codebase. Please, refer to the usage and limitations section for intended use cases, or visit the blog post for additional details and examples. ## Use in Transformers The following snippets use model for reference purposes. The model in this repo you are now browsing may have been trained for other tasks, please make sure you use appropriate inputs for the task at hand. 
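As a rough illustration of the single-turn, task-prefixed usage described above (the precision-specific variants follow in the next subsections), here is a hedged sketch with `AutoProcessor` and `PaliGemmaForConditionalGeneration`; the image URL is a placeholder, and the model is loaded in plain float32 on CPU for simplicity:

```python
# Hedged sketch: single-turn inference with a task prefix ("caption en").
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)  # fp32, CPU

image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)  # placeholder URL
prompt = "caption en"  # the task prefix selects the behavior (e.g. "detect", "segment")

inputs = processor(text=prompt, images=image, return_tensors="pt")
input_len = inputs["input_ids"].shape[-1]

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=30)
# Strip the prompt tokens and decode only the newly generated caption.
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```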
### Running the default precision () on CPU Output: ### Running other precisions on CUDA For convenience, the repos contain revisions of the weights already converted to and , so you can use them to reduce the download size and avoid casting on your local computer. This is how you'd run on an NVIDIA CUDA card. ### Loading in 4-bit / 8-bit You need to install to automatically run inference using 8-bit or 4-bit precision: ## Implementation information ### Hardware PaliGemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e). ### Software Training was done using JAX, Flax, TFDS, and []( JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. TFDS is used to access datasets, and Flax is used for model architecture. The PaliGemma fine-tune code and inference code are released in the GitHub repository. ## Evaluation information ### Benchmark results In order to verify the transferability of PaliGemma to a wide variety of academic tasks, we fine-tune the pretrained models on each task. Additionally, we train the mix model with a mixture of the transfer tasks. We report results on different resolutions to provide an impression of which tasks benefit from increased resolution. Importantly, none of these tasks or datasets are part of the pretraining data mixture, and their images are explicitly removed from the web-scale pre-training data. #### Single task (fine-tune on single task)
| Benchmark (train split) | Metric (split) | pt-224 | pt-448 | pt-896 |
|---|---|---|---|---|
| **Captioning** | | | | |
| (train+restval) | CIDEr (val) | 141.92 | 144.60 | |
| (captions transfer) | CIDEr (val) | 121.72 | 123.58 | |
| | CIDEr dev (en/avg-34/avg) | 139.2 / 115.8 / 116.4 | 141.2 / 118.0 / 118.6 | |
| | CIDEr dev (en/avg-34/avg) | 78.1 / 41.3 / 42.4 | 80.0 / 41.9 / 42.9 | |
| | CIDEr (val) | 127.48 | 153.94 | |
| (train+val) | CIDEr/BLEU-4 (test) | 162.25 / 0.192 | 181.49 / 0.211 | |
| | CIDEr (test) | 117.57 | 119.59 | |
| (train+dev) | CIDEr (test) | 136.07 | 148.36 | |
| **Question answering** | | | | |
| | Accuracy (Test server - std) | 83.19 | 85.64 | |
| | Paired Accuracy | 47.33 | 45.33 | |
| | Accuracy (random/popular/adversarial) | 87.80 / 85.87 / 84.27 | 88.23 / 86.77 / 85.90 | |
| | Accuracy (val) | 63.54 | 63.15 | |
| (train+val) | Accuracy (Test server) | 76.37 | 76.90 | |
| (train+val) | Accuracy (Test server) | 61.85 | 63.22 | |
| | Accuracy (testdev balanced) | 65.61 | 67.03 | |
| | Mean Accuracy (bn, de, en, id, ko, pt, ru, zh) | 58.37 | 59.07 | |
| | Accuracy (test) | 90.02 | 88.93 | |
| | Mean Accuracy (test) (id, sw, ta, tr, zh) | 80.57 | 76.78 | |
| | Accuracy (test) | 72.12 | 73.28 | |
| (train+val) | Accuracy (test) | 95.39 | 95.93 | |
| (train+val) | Mean Accuracy (test) | 92.65 | 93.11 | |
| (train+val) | Mean Accuracy (test/test2) | 92.61 / 90.58 | 92.79 / 90.54 | |
| | Mean Relaxed Accuracy (test_human, test_aug) | 57.08 | 71.36 | |
| (train+val) | Accuracy (Test server - std) | 73.7 | 75.52 | |
| | Accuracy (test_simple/test_complex) | 81.72 / 69.56 | 84.86 / 72.27 | |
| | Accuracy (test) | 72.32 | 74.61 | 74.93 |
| | Accuracy (Test server - std) | 55.47 | 73.15 | 76.48 |
| | ANLS (Test server) | 43.74 | 78.02 | 84.77 |
| (train+val) | ANLS (Test server) | 28.46 | 40.47 | 47.75 |
| (train+val) | ANLS (Test server) | 63.29 | 81.82 | 84.40 |
| **Segmentation** | | | | |
| (refcocog excluding val and test images) | MIoU (validation) refcoco/refcoco+/refcocog | 73.40 / 68.32 / 67.65 | 75.57 / 69.76 / 70.17 | 76.94 / 72.18 / 72.22 |
| **Video tasks (Caption/QA)** | | | | |
| MSR-VTT (Captioning) | CIDEr (test) | 70.54 | | |
| MSR-VTT (QA) | Accuracy (test) | 50.09 | | |
| ActivityNet (Captioning) | CIDEr (test) | 34.62 | | |
| ActivityNet (QA) | Accuracy (test) | 50.78 | | |
| VATEX (Captioning) | CIDEr (test) | 79.73 | | |
| MSVD (QA) | Accuracy (test) | 60.22 | | |
#### Mix model (fine-tune on mixture of transfer tasks)
| Benchmark | Metric (split) | mix-224 | mix-448 |
|---|---|---|---|
| | Paired Accuracy | 46.00 | 45.33 |
| | Accuracy (random/popular/adversarial) | 88.00 / 86.63 / 85.67 | 89.37 / 88.40 / 87.47 |
## Ethics and safety ### Evaluation approach Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including: * Human evaluation on prompts covering child safety, content safety, and representational harms. See the Gemma model card for more details on the evaluation approach, here applied with image captioning and visual question answering setups. * Image-to-Text benchmark evaluation: Benchmark against relevant academic datasets such as the FairFace dataset (Karkkainen et al., 2021). ### Evaluation results * The human evaluation results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, and representational harms. * On top of robust internal evaluations, we also use the Perspective API (threshold of 0.8) to measure toxicity, profanity, and other potential issues in the generated captions for images sourced from the FairFace dataset. We report the maximum and median values observed across subgroups for each of the perceived gender, ethnicity, and age attributes.
| Metric | Perceived gender (Maximum / Median) | Ethnicity (Maximum / Median) | Age group (Maximum / Median) |
|---|---|---|---|
| Toxicity | 0.04% / 0.03% | 0.08% / 0.00% | 0.09% / 0.00% |
| Identity Attack | 0.00% / 0.00% | 0.00% / 0.00% | 0.00% / 0.00% |
| Insult | 0.06% / 0.04% | 0.09% / 0.07% | 0.16% / 0.00% |
| Threat | 0.06% / 0.05% | 0.14% / 0.05% | 0.17% / 0.00% |
| Profanity | 0.00% / 0.00% | 0.00% / 0.00% | 0.00% / 0.00% |
## Usage and limitations ### Intended usage Open Vision Language Models (VLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development. Fine-tuning on specific vision-language tasks: * The pre-trained models can be fine-tuned on a wide range of vision-language tasks such as image captioning, short video captioning, visual question answering, text reading, object detection, and object segmentation. * The pre-trained models can be fine-tuned for specific domains such as remote sensing question answering, visual questions from people who are blind, science question answering, or describing UI element functionalities. * The pre-trained models can be fine-tuned for tasks with non-textual outputs such as bounding boxes or segmentation masks. Vision-language research: * The pre-trained models and fine-tuned models can serve as a foundation for researchers to experiment with VLM techniques, develop algorithms, and contribute to the advancement of the field. ### Ethical considerations and risks The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following: * Bias and Fairness * VLMs trained on large-scale, real-world image-text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny, with input data pre-processing described and posterior evaluations reported in this card. * Misinformation and Misuse * VLMs can be misused to generate text that is false, misleading, or harmful. * Guidelines are provided for responsible use with the model, see the Responsible Generative AI Toolkit. * Transparency and Accountability * This model card summarizes details on the models' architecture, capabilities, limitations, and evaluation processes. * A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem. Risks identified and mitigations: * **Perpetuation of biases:** Continuous monitoring (using evaluation metrics, human review) and the exploration of de-biasing techniques during model training, fine-tuning, and other use cases are encouraged. * **Generation of harmful content:** Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases. * **Misuse for malicious purposes:** Technical limitations and developer and end-user education can help mitigate malicious applications of VLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in the Gemma Prohibited Use Policy. * **Privacy violations:** Models were trained on data filtered to remove certain personal information and sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques. ### Limitations * Most limitations inherited from the underlying Gemma model still apply: * VLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging. * Natural language is inherently complex.
VLMs might struggle to grasp subtle nuances, sarcasm, or figurative language. * VLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements. * VLMs rely on statistical patterns in language and images. They might lack the ability to apply common sense reasoning in certain situations. * PaliGemma was designed first and foremost to serve as a general pre-trained model for transfer to specialized tasks. Hence, its \"out of the box\" or \"zero-shot\" performance might lag behind models designed specifically for those tasks. * PaliGemma is not a multi-turn chatbot. It is designed for a single round of image and text input. ## Citation Find the paper here.", + "model_explanation_gemini": "\"PaliGemma is a 3B-parameter vision-language model that processes images and text inputs to generate text outputs, excelling at tasks like captioning, visual QA, text reading, object detection, and segmentation across multiple languages.\"\n\nFeatures: \n- Vision-language model (VLM) combining SigLIP vision encoder and Gemma-2B text decoder \n- Handles 224×224 resolution images with 256-token text sequences \n- Multilingual text generation \n- Supports" +} \ No newline at end of file diff --git a/model_data_json/google_t5-v1_1-xl.json b/model_data_json/google_t5-v1_1-xl.json new file mode 100644 index 0000000000000000000000000000000000000000..52bd69e9c62c1000d457e3b0a0074e723a90ed9e --- /dev/null +++ b/model_data_json/google_t5-v1_1-xl.json @@ -0,0 +1,22 @@ +{ + "model_id": "google/t5-v1_1-xl", + "downloads": 77361, + "tags": [ + "transformers", + "pytorch", + "tf", + "t5", + "text2text-generation", + "en", + "dataset:c4", + "arxiv:2002.05202", + "arxiv:1910.10683", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en datasets: - c4 license: apache-2.0 --- Google's T5 Version 1.1 ## Version 1.1 T5 Version 1.1 includes the following improvements compared to the original T5 model: - GEGLU activation in the feed-forward hidden layer, rather than ReLU - see here. - Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning. - Pre-trained on C4 only, without mixing in the downstream tasks. - No parameter sharing between the embedding and classifier layers. - \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger and smaller and . **Note**: T5 Version 1.1 was only pre-trained on C4, excluding any supervised training. Therefore, this model has to be fine-tuned before it is usable on a downstream task. Pretraining Dataset: C4 Other Community Checkpoints: here Paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu* ## Abstract Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format.
Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.", + "model_explanation_gemini": "\"Google's T5-v1_1-xl is a text-to-text transformer model requiring fine-tuning for downstream NLP tasks, pretrained on C4 with improved architecture like GEGLU activation and no dropout during pretraining.\"\n\nModel Features:\n- GEGLU activation (replaces ReLU)\n- No dropout during pretraining (enabled during fine-tuning)\n- Pretrained exclusively on C4 dataset\n- No parameter sharing between embedding/classifier layers\n- XL size variant (replaces original" +} \ No newline at end of file diff --git a/model_data_json/google_vit-large-patch16-224-in21k.json b/model_data_json/google_vit-large-patch16-224-in21k.json new file mode 100644 index 0000000000000000000000000000000000000000..8219e110f1bc2c3b7443997e7636ead2727c6425 --- /dev/null +++ b/model_data_json/google_vit-large-patch16-224-in21k.json @@ -0,0 +1,21 @@ +{ + "model_id": "google/vit-large-patch16-224-in21k", + "downloads": 82508, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "safetensors", + "vit", + "image-feature-extraction", + "vision", + "dataset:imagenet-21k", + "arxiv:2010.11929", + "arxiv:2006.03677", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - vision datasets: - imagenet-21k inference: false --- # Vision Transformer (large-sized model) Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 21,843 classes) at resolution 224x224. It was introduced in the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Dosovitskiy et al. and first released in this repository. However, the weights were converted from the timm repository by Ross Wightman, who already converted the weights from JAX to PyTorch. Credits go to him. Disclaimer: The team releasing ViT did not write a model card for this model, so this model card has been written by the Hugging Face team. ## Model description The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a supervised fashion, namely ImageNet-21k, at a resolution of 224x224 pixels. Images are presented to the model as a sequence of fixed-size patches (resolution 16x16), which are linearly embedded. A [CLS] token is added to the beginning of the sequence for use in classification tasks, and absolute position embeddings are added before feeding the sequence to the layers of the Transformer encoder. Note that this model does not provide any fine-tuned heads, as these were zeroed by Google researchers. However, the model does include the pre-trained pooler, which can be used for downstream tasks (such as image classification). By pre-training the model, it learns an inner representation of images that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled images, for instance, you can train a standard classifier by placing a linear layer on top of the pre-trained encoder.
One typically places a linear layer on top of the [CLS] token, as the last hidden state of this token can be seen as a representation of an entire image. ## Intended uses & limitations You can use the raw model to embed images, but it's mostly intended to be fine-tuned on a downstream task. ### How to use Here is how to use this model: Currently, both the feature extractor and model support PyTorch. Tensorflow and JAX/FLAX are coming soon, and the API of ViTFeatureExtractor might change. ## Training data The ViT model was pretrained on ImageNet-21k, a dataset consisting of 14 million images and 21k classes. ## Training procedure ### Preprocessing The exact details of preprocessing of images during training/validation can be found here. Images are resized/rescaled to the same resolution (224x224) and normalized across the RGB channels with mean (0.5, 0.5, 0.5) and standard deviation (0.5, 0.5, 0.5). ### Pretraining The model was trained on TPUv3 hardware (8 cores). All model variants are trained with a batch size of 4096 and learning rate warmup of 10k steps. For ImageNet, the authors found it beneficial to additionally apply gradient clipping at global norm 1. Pre-training resolution is 224. ## Evaluation results For evaluation results on several image classification benchmarks, we refer to tables 2 and 5 of the original paper. Note that for fine-tuning, the best results are obtained with a higher resolution (384x384). Of course, increasing the model size will result in better performance. ### BibTeX entry and citation info", + "model_explanation_gemini": "A large Vision Transformer (ViT) model pre-trained on ImageNet-21k for image feature extraction, designed for fine-tuning on downstream vision tasks like classification.\n\n**Features:** \n- Architecture: Transformer encoder (BERT-like) \n- Pre-trained on: ImageNet-21k (14M images, 21,843 classes) \n- Input resolution: 224x224 pixels \n- Patch size: 16x16 \n- Includes [CLS] token for classification tasks \n-" +} \ No newline at end of file diff --git a/model_data_json/h2oai_h2ogpt-4096-llama2-7b-chat.json b/model_data_json/h2oai_h2ogpt-4096-llama2-7b-chat.json new file mode 100644 index 0000000000000000000000000000000000000000..f3888b581c47d9d6f793dc6b8f03830591ed8b6b --- /dev/null +++ b/model_data_json/h2oai_h2ogpt-4096-llama2-7b-chat.json @@ -0,0 +1,22 @@ +{ + "model_id": "h2oai/h2ogpt-4096-llama2-7b-chat", + "downloads": 76657, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "llama-2", + "h2ogpt", + "en", + "license:llama2", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- inference: false language: - en license: llama2 model_type: llama pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-2 - h2ogpt --- h2oGPT clone of Meta's Llama 2 7B Chat. Try it live on our h2oGPT demo with side-by-side LLM comparisons and private document chat! See how it compares to other models on our LLM Leaderboard! 
See more at H2O.ai ## Model Architecture", + "model_explanation_gemini": "A clone of Meta's Llama 2 7B Chat designed for text generation, offering side-by-side LLM comparisons and private document chat capabilities.\n\nFeatures: \n- Text generation \n- Based on Meta's Llama 2 7B architecture \n- Supports private document chat \n- Allows comparison with other models \n\nComparison: \nPositioned as an alternative to Meta's original Llama 2 7B Chat, with added features like live comparisons and document chat, as showcased in H" +} \ No newline at end of file diff --git a/model_data_json/huggingface_CodeBERTa-small-v1.json b/model_data_json/huggingface_CodeBERTa-small-v1.json new file mode 100644 index 0000000000000000000000000000000000000000..174678de8f0214b81871e4d9329eb5e5a5a6e8ac --- /dev/null +++ b/model_data_json/huggingface_CodeBERTa-small-v1.json @@ -0,0 +1,20 @@ +{ + "model_id": "huggingface/CodeBERTa-small-v1", + "downloads": 70821, + "tags": [ + "transformers", + "pytorch", + "tf", + "jax", + "roberta", + "fill-mask", + "code", + "dataset:code_search_net", + "arxiv:1909.09436", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: code thumbnail: datasets: - code_search_net --- # CodeBERTa CodeBERTa is a RoBERTa-like model trained on the CodeSearchNet dataset from GitHub. Supported languages: The **tokenizer** is a Byte-level BPE tokenizer trained on the corpus using Hugging Face . Because it is trained on a corpus of code (vs. natural language), it encodes the corpus efficiently (the sequences are between 33% to 50% shorter, compared to the same corpus tokenized by gpt2/roberta). The (small) **model** is a 6-layer, 84M parameters, RoBERTa-like Transformer model – that’s the same number of layers & heads as DistilBERT – initialized from the default initialization settings and trained from scratch on the full corpus (~2M functions) for 5 epochs. ### Tensorboard for this training ⤵️ Results: > This program can auto-complete itself! 😱 ### Just for fun, let's try to mask natural language (not code): This (kind of) works because code contains comments (which contain natural language). Of course, the most frequent name for a Computer scientist must be undefined 🤓. ## Downstream task: programming language identification See the model card for **[]( 🤯.
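To make the masked-token examples described above concrete, here is a hedged sketch using the `fill-mask` pipeline; the code snippet being completed is an invented example, and `<mask>` is RoBERTa's mask token:

```python
# Hedged sketch: auto-complete a masked token in a code snippet.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="huggingface/CodeBERTa-small-v1")

code = """
def normalize(x):
    return x.<mask>(dim=-1)
""".strip()

# Print the top-3 candidate tokens with their scores.
for pred in fill_mask(code)[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```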
## CodeSearchNet citation
", + "model_explanation_gemini": "A RoBERTa-like model trained on GitHub code for efficient code tokenization and programming language identification, with shorter sequences compared to gpt2/roberta.\n\nFeatures: \n- Trained on CodeSearchNet dataset (GitHub code) \n- Byte-level BPE tokenizer optimized for code (33-50% shorter sequences than gpt2/roberta) \n- 6-layer, 84M parameter RoBERTa-like architecture \n- Supports code auto-completion and masked language modeling" +} \ No newline at end of file diff --git a/model_data_json/huihui-ai_Llama-3.3-70B-Instruct-abliterated.json b/model_data_json/huihui-ai_Llama-3.3-70B-Instruct-abliterated.json new file mode 100644 index 0000000000000000000000000000000000000000..60a4ecb2208a66c2030707e600caffb1c4f1532e --- /dev/null +++ b/model_data_json/huihui-ai_Llama-3.3-70B-Instruct-abliterated.json @@ -0,0 +1,34 @@ +{ + "model_id": "huihui-ai/Llama-3.3-70B-Instruct-abliterated", + "downloads": 71455, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "abliterated", + "uncensored", + "conversational", + "en", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "de", + "base_model:meta-llama/Llama-3.3-70B-Instruct", + "base_model:finetune:meta-llama/Llama-3.3-70B-Instruct", + "license:llama3.3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers language: - en - fr - it - pt - hi - es - th - de base_model: - meta-llama/Llama-3.3-70B-Instruct tags: - facebook - meta - pytorch - llama - llama-3 - abliterated - uncensored extra_gated_prompt: \"### LLAMA 3.3 COMMUNITY LICENSE AGREEMENT\\nLlama 3.3 Version Release Date: December 6, 2024\\n\\\"Agreement\\\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein.\\n\\\"Documentation\\\" means the specifications, manuals and documentation accompanying Llama 3.3 distributed by Meta at or \\\"you\\\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.\\n\\\"Llama 3.3\\\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at Materials\\\" means, collectively, Meta’s proprietary Llama 3.3 and Documentation (and any portion thereof) made available under this Agreement.\\n\\\"Meta\\\" or \\\"we\\\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).\\nBy clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.\\n1. License Rights and Redistribution.\\na. Grant of Rights. 
You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials.\\nb. Redistribution and Use.\\ni. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.\\nii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.\\_\\niii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.3 is licensed under the Llama 3.3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”\\niv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. \\n2. Additional Commercial Terms. If, on the Llama 3.3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.\\n3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.\\n4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.\\n5. Intellectual Property.\\na. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta.\\nb. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.\\nc. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.3 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.\\n6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement.\\n7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement.\\n### Llama 3.3 Acceptable Use Policy\\nMeta is committed to promoting safe and fair use of its tools and features, including Llama 3.3. If you access or use Llama 3.3, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at Uses\\nWe want everyone to use Llama 3.3 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.3 to:\\n1. Violate the law or others’ rights, including to:\\n\\n 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: \\n 1. Violence or terrorism \\n 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material \\n 3. Human trafficking, exploitation, and sexual violence \\n 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. \\n 5. 
Sexual solicitation \\n 6. Any other criminal activity\\n\\n 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals\\n\\n 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services\\n\\n 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices\\n\\n 5. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law\\n\\n 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials\\n\\n 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system\\n\\n 8. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta\\n\\n2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.3 related to the following:\\n\\n 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997\\n\\n 2. Guns and illegal weapons (including weapon development)\\n\\n 3. Illegal drugs and regulated/controlled substances\\n\\n 4. Operation of critical infrastructure, transportation technologies, or heavy machinery\\n\\n 5. Self-harm or harm to others, including suicide, cutting, and eating disorders\\n\\n 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual\\n\\n3. Intentionally deceive or mislead others, including use of Llama 3.3 related to the following:\\n\\n 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation\\n\\n 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content\\n\\n 3. Generating, promoting, or further distributing spam\\n\\n 4. Impersonating another individual without consent, authorization, or legal right\\n\\n 5. Representing that the use of Llama 3.3 or outputs are human-generated\\n\\n 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement\\n\\n4. Fail to appropriately disclose to end users any known dangers of your AI system\\n5. 
Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.3\\nWith respect to any multimodal models included in Llama 3.3, the rights granted under Section 1(a) of the Llama 3.3 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models.\\nPlease report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means:\\n* Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama\\\\_output\\\\_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.3: LlamaUseReport@meta.com \" extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit license: llama3.3 --- # huihui-ai/Llama-3.3-70B-Instruct-abliterated This is an uncensored version of meta-llama/Llama-3.3-70B-Instruct created with abliteration (see remove-refusals-with-transformers to learn more about it). This is a crude, proof-of-concept implementation to remove refusals from an LLM without using TransformerLens. If you don't get the result you want, you can try the same question again. ## Use with ollama You can use huihui_ai/llama3.3-abliterated directly, ### Use with transformers Starting with onward, you can run conversational inference using the Transformers abstraction or by leveraging the Auto classes with the function. Make sure to update your transformers installation via . See the snippet below for usage with Transformers:", + "model_explanation_gemini": "A multilingual, uncensored instruction-tuned model based on Meta's Llama-3.3-70B-Instruct, designed for generating responses across various languages (English, French, Italian, Portuguese, Hindi, Spanish, Thai, German) with fewer content restrictions.
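> Note: the Transformers snippet the card above refers to ("See the snippet below") was stripped when this card was flattened into JSON. The following is a minimal, hedged sketch in the style of the standard Llama 3.3 pipeline example; the dtype, device mapping, messages, and generation settings are illustrative assumptions, not taken from the original card.

```python
# Hedged sketch: conversational inference via the Transformers pipeline API.
# torch.bfloat16 and device_map="auto" are assumptions; a 70B model needs
# substantial GPU memory (or a quantized variant) to actually run.
import torch
import transformers

model_id = "huihui-ai/Llama-3.3-70B-Instruct-abliterated"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

# Chat-style pipelines accept a messages list and return the full conversation.
outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```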
\n\n**Features**: \n- Multilingual support (8 languages) \n- Derived from Meta's Llama-3.3-70B-Instruct \n- Uncensored/abliterated (fewer output restrictions)" +} \ No newline at end of file diff --git a/model_data_json/isaiahbjork_orpheus-3b-0.1-ft-Q4_K_M-GGUF.json b/model_data_json/isaiahbjork_orpheus-3b-0.1-ft-Q4_K_M-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..470e1921bf4ad8a27667e80fefebb157b0f35dd4 --- /dev/null +++ b/model_data_json/isaiahbjork_orpheus-3b-0.1-ft-Q4_K_M-GGUF.json @@ -0,0 +1,20 @@ +{ + "model_id": "isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF", + "downloads": 11598, + "tags": [ + "transformers", + "gguf", + "llama-cpp", + "gguf-my-repo", + "text-to-speech", + "en", + "base_model:canopylabs/orpheus-3b-0.1-ft", + "base_model:quantized:canopylabs/orpheus-3b-0.1-ft", + "license:apache-2.0", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- base_model: canopylabs/orpheus-3b-0.1-ft language: - en library_name: transformers license: apache-2.0 pipeline_tag: text-to-speech tags: - llama-cpp - gguf-my-repo --- # Orpheus-TTS-Local A lightweight client for running Orpheus TTS locally using the LM Studio API. [GitHub Repo]( ## Features - 🎧 High-quality Text-to-Speech using the Orpheus TTS model - 💻 Completely local - no cloud API keys needed - 🔊 Multiple voice options (tara, leah, jess, leo, dan, mia, zac, zoe) - 💾 Save audio to WAV files ## Quick Setup 1. Install LM Studio 2. Install the Orpheus TTS model (orpheus-3b-0.1-ft-q4_k_m.gguf) in LM Studio 3. Load the Orpheus model in LM Studio 4. Start the local server in LM Studio (default: 5. Install dependencies: 6. Run the script: ## Usage ### Options - : The text to convert to speech - : The voice to use (default: tara) - : Output WAV file path (default: auto-generated filename) - : Show available voices - : Temperature for generation (default: 0.6) - : Top-p sampling parameter (default: 0.9) - : Repetition penalty (default: 1.1) ## Available Voices - tara - Best overall voice for general use (default) - leah - jess - leo - dan - mia - zac - zoe ## Emotion You can add emotion to the speech by adding the following tags: ## License Apache 2.0", + "model_explanation_gemini": "A lightweight local text-to-speech model offering high-quality voice synthesis with multiple voice options and offline functionality.\n\nModel Features: \n- High-quality text-to-speech conversion \n- Fully local operation (no cloud dependencies) \n- Multiple voice options (tara, leah, jess, leo, dan, mia, zac, zoe) \n- WAV file output capability \n- Emotion control via text tags \n- Customizable generation parameters (temperature, top-p, repetition penalty)" +} \ No newline at end of file diff --git a/model_data_json/jinaai_jina-embeddings-v2-small-en.json b/model_data_json/jinaai_jina-embeddings-v2-small-en.json new file mode 100644 index 0000000000000000000000000000000000000000..e180b996bcd83cf5f58d91d6462e2ab996fb81f2 --- /dev/null +++ b/model_data_json/jinaai_jina-embeddings-v2-small-en.json @@ -0,0 +1,27 @@ +{ + "model_id": "jinaai/jina-embeddings-v2-small-en", + "downloads": 75783, + "tags": [ + "sentence-transformers", + "pytorch", + "coreml", + "onnx", + "safetensors", + "bert", + "feature-extraction", + "sentence-similarity", + "mteb", + "custom_code", + "en", + "dataset:jinaai/negation-dataset", + "arxiv:2108.12409", + "arxiv:2310.19923", + "license:apache-2.0", + "model-index", + "autotrain_compatible", + "text-embeddings-inference", + "region:us" + ], + "description": "--- tags: -
sentence-transformers - feature-extraction - sentence-similarity - mteb datasets: - jinaai/negation-dataset language: en inference: false license: apache-2.0 model-index: - name: jina-embedding-s-en-v2 results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 71.35820895522387 - type: ap value: 33.99931933598115 - type: f1 value: 65.3853685535555 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 82.90140000000001 - type: ap value: 78.01434597815617 - type: f1 value: 82.83357802722676 - task: type: Classification dataset: type: mteb/amazon_reviews_multi name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 40.88999999999999 - type: f1 value: 39.209432767163456 - task: type: Retrieval dataset: type: arguana name: MTEB ArguAna config: default split: test revision: None metrics: - type: map_at_1 value: 23.257 - type: map_at_10 value: 37.946000000000005 - type: map_at_100 value: 39.17 - type: map_at_1000 value: 39.181 - type: map_at_3 value: 32.99 - type: map_at_5 value: 35.467999999999996 - type: mrr_at_1 value: 23.541999999999998 - type: mrr_at_10 value: 38.057 - type: mrr_at_100 value: 39.289 - type: mrr_at_1000 value: 39.299 - type: mrr_at_3 value: 33.096 - type: mrr_at_5 value: 35.628 - type: ndcg_at_1 value: 23.257 - type: ndcg_at_10 value: 46.729 - type: ndcg_at_100 value: 51.900999999999996 - type: ndcg_at_1000 value: 52.16 - type: ndcg_at_3 value: 36.323 - type: ndcg_at_5 value: 40.766999999999996 - type: precision_at_1 value: 23.257 - type: precision_at_10 value: 7.510999999999999 - type: precision_at_100 value: 0.976 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 15.339 - type: precision_at_5 value: 11.350999999999999 - type: recall_at_1 value: 23.257 - type: recall_at_10 value: 75.107 - type: recall_at_100 value: 97.58200000000001 - type: recall_at_1000 value: 99.57300000000001 - type: recall_at_3 value: 46.017 - type: recall_at_5 value: 56.757000000000005 - task: type: Clustering dataset: type: mteb/arxiv-clustering-p2p name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 44.02420878391967 - task: type: Clustering dataset: type: mteb/arxiv-clustering-s2s name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 35.16136856000258 - task: type: Reranking dataset: type: mteb/askubuntudupquestions-reranking name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map value: 59.61809790513646 - type: mrr value: 73.07215406938397 - task: type: STS dataset: type: mteb/biosses-sts name: MTEB BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 82.0167350090749 - type: cos_sim_spearman value: 80.51569002630401 - type: euclidean_pearson value: 81.46820525099726 - type: euclidean_spearman value: 80.51569002630401 - type: manhattan_pearson value: 81.35596555056757 - type: manhattan_spearman value: 80.12592210903303 - task: 
type: Classification dataset: type: mteb/banking77 name: MTEB Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 78.25 - type: f1 value: 77.34950913540605 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-p2p name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: - type: v_measure value: 35.57238596005698 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-s2s name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 29.066444306196683 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 31.891000000000002 - type: map_at_10 value: 42.772 - type: map_at_100 value: 44.108999999999995 - type: map_at_1000 value: 44.236 - type: map_at_3 value: 39.289 - type: map_at_5 value: 41.113 - type: mrr_at_1 value: 39.342 - type: mrr_at_10 value: 48.852000000000004 - type: mrr_at_100 value: 49.534 - type: mrr_at_1000 value: 49.582 - type: mrr_at_3 value: 46.089999999999996 - type: mrr_at_5 value: 47.685 - type: ndcg_at_1 value: 39.342 - type: ndcg_at_10 value: 48.988 - type: ndcg_at_100 value: 53.854 - type: ndcg_at_1000 value: 55.955 - type: ndcg_at_3 value: 43.877 - type: ndcg_at_5 value: 46.027 - type: precision_at_1 value: 39.342 - type: precision_at_10 value: 9.285 - type: precision_at_100 value: 1.488 - type: precision_at_1000 value: 0.194 - type: precision_at_3 value: 20.696 - type: precision_at_5 value: 14.878 - type: recall_at_1 value: 31.891000000000002 - type: recall_at_10 value: 60.608 - type: recall_at_100 value: 81.025 - type: recall_at_1000 value: 94.883 - type: recall_at_3 value: 45.694 - type: recall_at_5 value: 51.684 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 28.778 - type: map_at_10 value: 37.632 - type: map_at_100 value: 38.800000000000004 - type: map_at_1000 value: 38.934999999999995 - type: map_at_3 value: 35.293 - type: map_at_5 value: 36.547000000000004 - type: mrr_at_1 value: 35.35 - type: mrr_at_10 value: 42.936 - type: mrr_at_100 value: 43.69 - type: mrr_at_1000 value: 43.739 - type: mrr_at_3 value: 41.062 - type: mrr_at_5 value: 42.097 - type: ndcg_at_1 value: 35.35 - type: ndcg_at_10 value: 42.528 - type: ndcg_at_100 value: 46.983000000000004 - type: ndcg_at_1000 value: 49.187999999999995 - type: ndcg_at_3 value: 39.271 - type: ndcg_at_5 value: 40.654 - type: precision_at_1 value: 35.35 - type: precision_at_10 value: 7.828 - type: precision_at_100 value: 1.3010000000000002 - type: precision_at_1000 value: 0.17700000000000002 - type: precision_at_3 value: 18.96 - type: precision_at_5 value: 13.120999999999999 - type: recall_at_1 value: 28.778 - type: recall_at_10 value: 50.775000000000006 - type: recall_at_100 value: 69.66799999999999 - type: recall_at_1000 value: 83.638 - type: recall_at_3 value: 40.757 - type: recall_at_5 value: 44.86 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 37.584 - type: map_at_10 value: 49.69 - type: map_at_100 value: 50.639 - type: map_at_1000 value: 50.702999999999996 - type: map_at_3 value: 46.61 - type: map_at_5 value: 
48.486000000000004 - type: mrr_at_1 value: 43.009 - type: mrr_at_10 value: 52.949999999999996 - type: mrr_at_100 value: 53.618 - type: mrr_at_1000 value: 53.65299999999999 - type: mrr_at_3 value: 50.605999999999995 - type: mrr_at_5 value: 52.095 - type: ndcg_at_1 value: 43.009 - type: ndcg_at_10 value: 55.278000000000006 - type: ndcg_at_100 value: 59.134 - type: ndcg_at_1000 value: 60.528999999999996 - type: ndcg_at_3 value: 50.184 - type: ndcg_at_5 value: 52.919000000000004 - type: precision_at_1 value: 43.009 - type: precision_at_10 value: 8.821 - type: precision_at_100 value: 1.161 - type: precision_at_1000 value: 0.133 - type: precision_at_3 value: 22.424 - type: precision_at_5 value: 15.436 - type: recall_at_1 value: 37.584 - type: recall_at_10 value: 68.514 - type: recall_at_100 value: 85.099 - type: recall_at_1000 value: 95.123 - type: recall_at_3 value: 55.007 - type: recall_at_5 value: 61.714999999999996 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.7 - type: map_at_10 value: 32.804 - type: map_at_100 value: 33.738 - type: map_at_1000 value: 33.825 - type: map_at_3 value: 30.639 - type: map_at_5 value: 31.781 - type: mrr_at_1 value: 26.328000000000003 - type: mrr_at_10 value: 34.679 - type: mrr_at_100 value: 35.510000000000005 - type: mrr_at_1000 value: 35.577999999999996 - type: mrr_at_3 value: 32.58 - type: mrr_at_5 value: 33.687 - type: ndcg_at_1 value: 26.328000000000003 - type: ndcg_at_10 value: 37.313 - type: ndcg_at_100 value: 42.004000000000005 - type: ndcg_at_1000 value: 44.232 - type: ndcg_at_3 value: 33.076 - type: ndcg_at_5 value: 34.966 - type: precision_at_1 value: 26.328000000000003 - type: precision_at_10 value: 5.627 - type: precision_at_100 value: 0.8410000000000001 - type: precision_at_1000 value: 0.106 - type: precision_at_3 value: 14.011000000000001 - type: precision_at_5 value: 9.582 - type: recall_at_1 value: 24.7 - type: recall_at_10 value: 49.324 - type: recall_at_100 value: 71.018 - type: recall_at_1000 value: 87.905 - type: recall_at_3 value: 37.7 - type: recall_at_5 value: 42.281 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 14.350999999999999 - type: map_at_10 value: 21.745 - type: map_at_100 value: 22.731 - type: map_at_1000 value: 22.852 - type: map_at_3 value: 19.245 - type: map_at_5 value: 20.788 - type: mrr_at_1 value: 18.159 - type: mrr_at_10 value: 25.833000000000002 - type: mrr_at_100 value: 26.728 - type: mrr_at_1000 value: 26.802 - type: mrr_at_3 value: 23.383000000000003 - type: mrr_at_5 value: 24.887999999999998 - type: ndcg_at_1 value: 18.159 - type: ndcg_at_10 value: 26.518000000000004 - type: ndcg_at_100 value: 31.473000000000003 - type: ndcg_at_1000 value: 34.576 - type: ndcg_at_3 value: 21.907 - type: ndcg_at_5 value: 24.39 - type: precision_at_1 value: 18.159 - type: precision_at_10 value: 4.938 - type: precision_at_100 value: 0.853 - type: precision_at_1000 value: 0.125 - type: precision_at_3 value: 10.655000000000001 - type: precision_at_5 value: 7.985 - type: recall_at_1 value: 14.350999999999999 - type: recall_at_10 value: 37.284 - type: recall_at_100 value: 59.11300000000001 - type: recall_at_1000 value: 81.634 - type: recall_at_3 value: 24.753 - type: recall_at_5 value: 30.979 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: 
default split: test revision: None metrics: - type: map_at_1 value: 26.978 - type: map_at_10 value: 36.276 - type: map_at_100 value: 37.547000000000004 - type: map_at_1000 value: 37.678 - type: map_at_3 value: 33.674 - type: map_at_5 value: 35.119 - type: mrr_at_1 value: 32.916000000000004 - type: mrr_at_10 value: 41.798 - type: mrr_at_100 value: 42.72 - type: mrr_at_1000 value: 42.778 - type: mrr_at_3 value: 39.493 - type: mrr_at_5 value: 40.927 - type: ndcg_at_1 value: 32.916000000000004 - type: ndcg_at_10 value: 41.81 - type: ndcg_at_100 value: 47.284 - type: ndcg_at_1000 value: 49.702 - type: ndcg_at_3 value: 37.486999999999995 - type: ndcg_at_5 value: 39.597 - type: precision_at_1 value: 32.916000000000004 - type: precision_at_10 value: 7.411 - type: precision_at_100 value: 1.189 - type: precision_at_1000 value: 0.158 - type: precision_at_3 value: 17.581 - type: precision_at_5 value: 12.397 - type: recall_at_1 value: 26.978 - type: recall_at_10 value: 52.869 - type: recall_at_100 value: 75.78399999999999 - type: recall_at_1000 value: 91.545 - type: recall_at_3 value: 40.717 - type: recall_at_5 value: 46.168 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.641 - type: map_at_10 value: 32.916000000000004 - type: map_at_100 value: 34.165 - type: map_at_1000 value: 34.286 - type: map_at_3 value: 30.335 - type: map_at_5 value: 31.569000000000003 - type: mrr_at_1 value: 30.593999999999998 - type: mrr_at_10 value: 38.448 - type: mrr_at_100 value: 39.299 - type: mrr_at_1000 value: 39.362 - type: mrr_at_3 value: 36.244 - type: mrr_at_5 value: 37.232 - type: ndcg_at_1 value: 30.593999999999998 - type: ndcg_at_10 value: 38.2 - type: ndcg_at_100 value: 43.742 - type: ndcg_at_1000 value: 46.217000000000006 - type: ndcg_at_3 value: 33.925 - type: ndcg_at_5 value: 35.394 - type: precision_at_1 value: 30.593999999999998 - type: precision_at_10 value: 6.895 - type: precision_at_100 value: 1.1320000000000001 - type: precision_at_1000 value: 0.153 - type: precision_at_3 value: 16.096 - type: precision_at_5 value: 11.05 - type: recall_at_1 value: 24.641 - type: recall_at_10 value: 48.588 - type: recall_at_100 value: 72.841 - type: recall_at_1000 value: 89.535 - type: recall_at_3 value: 36.087 - type: recall_at_5 value: 40.346 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.79425 - type: map_at_10 value: 33.12033333333333 - type: map_at_100 value: 34.221333333333334 - type: map_at_1000 value: 34.3435 - type: map_at_3 value: 30.636583333333338 - type: map_at_5 value: 31.974083333333326 - type: mrr_at_1 value: 29.242416666666664 - type: mrr_at_10 value: 37.11675 - type: mrr_at_100 value: 37.93783333333334 - type: mrr_at_1000 value: 38.003083333333336 - type: mrr_at_3 value: 34.904666666666664 - type: mrr_at_5 value: 36.12916666666667 - type: ndcg_at_1 value: 29.242416666666664 - type: ndcg_at_10 value: 38.03416666666667 - type: ndcg_at_100 value: 42.86674999999999 - type: ndcg_at_1000 value: 45.34550000000001 - type: ndcg_at_3 value: 33.76466666666666 - type: ndcg_at_5 value: 35.668666666666674 - type: precision_at_1 value: 29.242416666666664 - type: precision_at_10 value: 6.589833333333334 - type: precision_at_100 value: 1.0693333333333332 - type: precision_at_1000 value: 0.14641666666666667 - type: precision_at_3 value: 15.430749999999998 - type: precision_at_5 
value: 10.833833333333333 - type: recall_at_1 value: 24.79425 - type: recall_at_10 value: 48.582916666666655 - type: recall_at_100 value: 69.88499999999999 - type: recall_at_1000 value: 87.211 - type: recall_at_3 value: 36.625499999999995 - type: recall_at_5 value: 41.553999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 22.767 - type: map_at_10 value: 28.450999999999997 - type: map_at_100 value: 29.332 - type: map_at_1000 value: 29.426000000000002 - type: map_at_3 value: 26.379 - type: map_at_5 value: 27.584999999999997 - type: mrr_at_1 value: 25.46 - type: mrr_at_10 value: 30.974 - type: mrr_at_100 value: 31.784000000000002 - type: mrr_at_1000 value: 31.857999999999997 - type: mrr_at_3 value: 28.962 - type: mrr_at_5 value: 30.066 - type: ndcg_at_1 value: 25.46 - type: ndcg_at_10 value: 32.041 - type: ndcg_at_100 value: 36.522 - type: ndcg_at_1000 value: 39.101 - type: ndcg_at_3 value: 28.152 - type: ndcg_at_5 value: 30.03 - type: precision_at_1 value: 25.46 - type: precision_at_10 value: 4.893 - type: precision_at_100 value: 0.77 - type: precision_at_1000 value: 0.107 - type: precision_at_3 value: 11.605 - type: precision_at_5 value: 8.19 - type: recall_at_1 value: 22.767 - type: recall_at_10 value: 40.71 - type: recall_at_100 value: 61.334999999999994 - type: recall_at_1000 value: 80.567 - type: recall_at_3 value: 30.198000000000004 - type: recall_at_5 value: 34.803 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 16.722 - type: map_at_10 value: 22.794 - type: map_at_100 value: 23.7 - type: map_at_1000 value: 23.822 - type: map_at_3 value: 20.781 - type: map_at_5 value: 22.024 - type: mrr_at_1 value: 20.061999999999998 - type: mrr_at_10 value: 26.346999999999998 - type: mrr_at_100 value: 27.153 - type: mrr_at_1000 value: 27.233 - type: mrr_at_3 value: 24.375 - type: mrr_at_5 value: 25.593 - type: ndcg_at_1 value: 20.061999999999998 - type: ndcg_at_10 value: 26.785999999999998 - type: ndcg_at_100 value: 31.319999999999997 - type: ndcg_at_1000 value: 34.346 - type: ndcg_at_3 value: 23.219 - type: ndcg_at_5 value: 25.107000000000003 - type: precision_at_1 value: 20.061999999999998 - type: precision_at_10 value: 4.78 - type: precision_at_100 value: 0.83 - type: precision_at_1000 value: 0.125 - type: precision_at_3 value: 10.874 - type: precision_at_5 value: 7.956 - type: recall_at_1 value: 16.722 - type: recall_at_10 value: 35.204 - type: recall_at_100 value: 55.797 - type: recall_at_1000 value: 77.689 - type: recall_at_3 value: 25.245 - type: recall_at_5 value: 30.115 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.842 - type: map_at_10 value: 32.917 - type: map_at_100 value: 33.961000000000006 - type: map_at_1000 value: 34.069 - type: map_at_3 value: 30.595 - type: map_at_5 value: 31.837 - type: mrr_at_1 value: 29.011 - type: mrr_at_10 value: 36.977 - type: mrr_at_100 value: 37.814 - type: mrr_at_1000 value: 37.885999999999996 - type: mrr_at_3 value: 34.966 - type: mrr_at_5 value: 36.043 - type: ndcg_at_1 value: 29.011 - type: ndcg_at_10 value: 37.735 - type: ndcg_at_100 value: 42.683 - type: ndcg_at_1000 value: 45.198 - type: ndcg_at_3 value: 33.650000000000006 - type: ndcg_at_5 value: 35.386 - type: precision_at_1 value: 
29.011 - type: precision_at_10 value: 6.259 - type: precision_at_100 value: 0.984 - type: precision_at_1000 value: 0.13 - type: precision_at_3 value: 15.329999999999998 - type: precision_at_5 value: 10.541 - type: recall_at_1 value: 24.842 - type: recall_at_10 value: 48.304 - type: recall_at_100 value: 70.04899999999999 - type: recall_at_1000 value: 87.82600000000001 - type: recall_at_3 value: 36.922 - type: recall_at_5 value: 41.449999999999996 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.252000000000002 - type: map_at_10 value: 32.293 - type: map_at_100 value: 33.816 - type: map_at_1000 value: 34.053 - type: map_at_3 value: 29.781999999999996 - type: map_at_5 value: 31.008000000000003 - type: mrr_at_1 value: 29.051 - type: mrr_at_10 value: 36.722 - type: mrr_at_100 value: 37.663000000000004 - type: mrr_at_1000 value: 37.734 - type: mrr_at_3 value: 34.354 - type: mrr_at_5 value: 35.609 - type: ndcg_at_1 value: 29.051 - type: ndcg_at_10 value: 37.775999999999996 - type: ndcg_at_100 value: 43.221 - type: ndcg_at_1000 value: 46.116 - type: ndcg_at_3 value: 33.403 - type: ndcg_at_5 value: 35.118 - type: precision_at_1 value: 29.051 - type: precision_at_10 value: 7.332 - type: precision_at_100 value: 1.49 - type: precision_at_1000 value: 0.23600000000000002 - type: precision_at_3 value: 15.415000000000001 - type: precision_at_5 value: 11.107 - type: recall_at_1 value: 24.252000000000002 - type: recall_at_10 value: 47.861 - type: recall_at_100 value: 72.21600000000001 - type: recall_at_1000 value: 90.886 - type: recall_at_3 value: 35.533 - type: recall_at_5 value: 39.959 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 20.025000000000002 - type: map_at_10 value: 27.154 - type: map_at_100 value: 28.118 - type: map_at_1000 value: 28.237000000000002 - type: map_at_3 value: 25.017 - type: map_at_5 value: 25.832 - type: mrr_at_1 value: 21.627 - type: mrr_at_10 value: 28.884999999999998 - type: mrr_at_100 value: 29.741 - type: mrr_at_1000 value: 29.831999999999997 - type: mrr_at_3 value: 26.741 - type: mrr_at_5 value: 27.628000000000004 - type: ndcg_at_1 value: 21.627 - type: ndcg_at_10 value: 31.436999999999998 - type: ndcg_at_100 value: 36.181000000000004 - type: ndcg_at_1000 value: 38.986 - type: ndcg_at_3 value: 27.025 - type: ndcg_at_5 value: 28.436 - type: precision_at_1 value: 21.627 - type: precision_at_10 value: 5.009 - type: precision_at_100 value: 0.7929999999999999 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_3 value: 11.522 - type: precision_at_5 value: 7.763000000000001 - type: recall_at_1 value: 20.025000000000002 - type: recall_at_10 value: 42.954 - type: recall_at_100 value: 64.67500000000001 - type: recall_at_1000 value: 85.301 - type: recall_at_3 value: 30.892999999999997 - type: recall_at_5 value: 34.288000000000004 - task: type: Retrieval dataset: type: climate-fever name: MTEB ClimateFEVER config: default split: test revision: None metrics: - type: map_at_1 value: 10.079 - type: map_at_10 value: 16.930999999999997 - type: map_at_100 value: 18.398999999999997 - type: map_at_1000 value: 18.561 - type: map_at_3 value: 14.294 - type: map_at_5 value: 15.579 - type: mrr_at_1 value: 22.606 - type: mrr_at_10 value: 32.513 - type: mrr_at_100 value: 33.463 - type: mrr_at_1000 value: 33.513999999999996 - type: 
mrr_at_3 value: 29.479 - type: mrr_at_5 value: 31.3 - type: ndcg_at_1 value: 22.606 - type: ndcg_at_10 value: 24.053 - type: ndcg_at_100 value: 30.258000000000003 - type: ndcg_at_1000 value: 33.516 - type: ndcg_at_3 value: 19.721 - type: ndcg_at_5 value: 21.144 - type: precision_at_1 value: 22.606 - type: precision_at_10 value: 7.55 - type: precision_at_100 value: 1.399 - type: precision_at_1000 value: 0.2 - type: precision_at_3 value: 14.701 - type: precision_at_5 value: 11.192 - type: recall_at_1 value: 10.079 - type: recall_at_10 value: 28.970000000000002 - type: recall_at_100 value: 50.805 - type: recall_at_1000 value: 69.378 - type: recall_at_3 value: 18.199 - type: recall_at_5 value: 22.442 - task: type: Retrieval dataset: type: dbpedia-entity name: MTEB DBPedia config: default split: test revision: None metrics: - type: map_at_1 value: 7.794 - type: map_at_10 value: 15.165999999999999 - type: map_at_100 value: 20.508000000000003 - type: map_at_1000 value: 21.809 - type: map_at_3 value: 11.568000000000001 - type: map_at_5 value: 13.059000000000001 - type: mrr_at_1 value: 56.49999999999999 - type: mrr_at_10 value: 65.90899999999999 - type: mrr_at_100 value: 66.352 - type: mrr_at_1000 value: 66.369 - type: mrr_at_3 value: 64.0 - type: mrr_at_5 value: 65.10000000000001 - type: ndcg_at_1 value: 44.25 - type: ndcg_at_10 value: 32.649 - type: ndcg_at_100 value: 36.668 - type: ndcg_at_1000 value: 43.918 - type: ndcg_at_3 value: 37.096000000000004 - type: ndcg_at_5 value: 34.048 - type: precision_at_1 value: 56.49999999999999 - type: precision_at_10 value: 25.45 - type: precision_at_100 value: 8.055 - type: precision_at_1000 value: 1.7489999999999999 - type: precision_at_3 value: 41.0 - type: precision_at_5 value: 32.85 - type: recall_at_1 value: 7.794 - type: recall_at_10 value: 20.101 - type: recall_at_100 value: 42.448 - type: recall_at_1000 value: 65.88000000000001 - type: recall_at_3 value: 12.753 - type: recall_at_5 value: 15.307 - task: type: Classification dataset: type: mteb/emotion name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 44.01 - type: f1 value: 38.659680951114964 - task: type: Retrieval dataset: type: fever name: MTEB FEVER config: default split: test revision: None metrics: - type: map_at_1 value: 49.713 - type: map_at_10 value: 61.79 - type: map_at_100 value: 62.28 - type: map_at_1000 value: 62.297000000000004 - type: map_at_3 value: 59.361 - type: map_at_5 value: 60.92100000000001 - type: mrr_at_1 value: 53.405 - type: mrr_at_10 value: 65.79899999999999 - type: mrr_at_100 value: 66.219 - type: mrr_at_1000 value: 66.227 - type: mrr_at_3 value: 63.431000000000004 - type: mrr_at_5 value: 64.98 - type: ndcg_at_1 value: 53.405 - type: ndcg_at_10 value: 68.01899999999999 - type: ndcg_at_100 value: 70.197 - type: ndcg_at_1000 value: 70.571 - type: ndcg_at_3 value: 63.352 - type: ndcg_at_5 value: 66.018 - type: precision_at_1 value: 53.405 - type: precision_at_10 value: 9.119 - type: precision_at_100 value: 1.03 - type: precision_at_1000 value: 0.107 - type: precision_at_3 value: 25.602999999999998 - type: precision_at_5 value: 16.835 - type: recall_at_1 value: 49.713 - type: recall_at_10 value: 83.306 - type: recall_at_100 value: 92.92 - type: recall_at_1000 value: 95.577 - type: recall_at_3 value: 70.798 - type: recall_at_5 value: 77.254 - task: type: Retrieval dataset: type: fiqa name: MTEB FiQA2018 config: default split: test revision: None metrics: - type: map_at_1 value: 
15.310000000000002 - type: map_at_10 value: 26.204 - type: map_at_100 value: 27.932000000000002 - type: map_at_1000 value: 28.121000000000002 - type: map_at_3 value: 22.481 - type: map_at_5 value: 24.678 - type: mrr_at_1 value: 29.784 - type: mrr_at_10 value: 39.582 - type: mrr_at_100 value: 40.52 - type: mrr_at_1000 value: 40.568 - type: mrr_at_3 value: 37.114000000000004 - type: mrr_at_5 value: 38.596000000000004 - type: ndcg_at_1 value: 29.784 - type: ndcg_at_10 value: 33.432 - type: ndcg_at_100 value: 40.281 - type: ndcg_at_1000 value: 43.653999999999996 - type: ndcg_at_3 value: 29.612 - type: ndcg_at_5 value: 31.223 - type: precision_at_1 value: 29.784 - type: precision_at_10 value: 9.645 - type: precision_at_100 value: 1.645 - type: precision_at_1000 value: 0.22499999999999998 - type: precision_at_3 value: 20.165 - type: precision_at_5 value: 15.401000000000002 - type: recall_at_1 value: 15.310000000000002 - type: recall_at_10 value: 40.499 - type: recall_at_100 value: 66.643 - type: recall_at_1000 value: 87.059 - type: recall_at_3 value: 27.492 - type: recall_at_5 value: 33.748 - task: type: Retrieval dataset: type: hotpotqa name: MTEB HotpotQA config: default split: test revision: None metrics: - type: map_at_1 value: 33.599000000000004 - type: map_at_10 value: 47.347 - type: map_at_100 value: 48.191 - type: map_at_1000 value: 48.263 - type: map_at_3 value: 44.698 - type: map_at_5 value: 46.278999999999996 - type: mrr_at_1 value: 67.19800000000001 - type: mrr_at_10 value: 74.054 - type: mrr_at_100 value: 74.376 - type: mrr_at_1000 value: 74.392 - type: mrr_at_3 value: 72.849 - type: mrr_at_5 value: 73.643 - type: ndcg_at_1 value: 67.19800000000001 - type: ndcg_at_10 value: 56.482 - type: ndcg_at_100 value: 59.694 - type: ndcg_at_1000 value: 61.204 - type: ndcg_at_3 value: 52.43299999999999 - type: ndcg_at_5 value: 54.608000000000004 - type: precision_at_1 value: 67.19800000000001 - type: precision_at_10 value: 11.613999999999999 - type: precision_at_100 value: 1.415 - type: precision_at_1000 value: 0.16199999999999998 - type: precision_at_3 value: 32.726 - type: precision_at_5 value: 21.349999999999998 - type: recall_at_1 value: 33.599000000000004 - type: recall_at_10 value: 58.069 - type: recall_at_100 value: 70.736 - type: recall_at_1000 value: 80.804 - type: recall_at_3 value: 49.088 - type: recall_at_5 value: 53.376000000000005 - task: type: Classification dataset: type: mteb/imdb name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 73.64359999999999 - type: ap value: 67.54685976014599 - type: f1 value: 73.55148707559482 - task: type: Retrieval dataset: type: msmarco name: MTEB MSMARCO config: default split: dev revision: None metrics: - type: map_at_1 value: 19.502 - type: map_at_10 value: 30.816 - type: map_at_100 value: 32.007999999999996 - type: map_at_1000 value: 32.067 - type: map_at_3 value: 27.215 - type: map_at_5 value: 29.304000000000002 - type: mrr_at_1 value: 20.072000000000003 - type: mrr_at_10 value: 31.406 - type: mrr_at_100 value: 32.549 - type: mrr_at_1000 value: 32.602 - type: mrr_at_3 value: 27.839000000000002 - type: mrr_at_5 value: 29.926000000000002 - type: ndcg_at_1 value: 20.086000000000002 - type: ndcg_at_10 value: 37.282 - type: ndcg_at_100 value: 43.206 - type: ndcg_at_1000 value: 44.690000000000005 - type: ndcg_at_3 value: 29.932 - type: ndcg_at_5 value: 33.668 - type: precision_at_1 value: 20.086000000000002 - type: precision_at_10 value: 5.961 - type: 
precision_at_100 value: 0.898 - type: precision_at_1000 value: 0.10200000000000001 - type: precision_at_3 value: 12.856000000000002 - type: precision_at_5 value: 9.596 - type: recall_at_1 value: 19.502 - type: recall_at_10 value: 57.182 - type: recall_at_100 value: 84.952 - type: recall_at_1000 value: 96.34700000000001 - type: recall_at_3 value: 37.193 - type: recall_at_5 value: 46.157 - task: type: Classification dataset: type: mteb/mtop_domain name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 93.96488828089375 - type: f1 value: 93.32119260543482 - task: type: Classification dataset: type: mteb/mtop_intent name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 72.4965800273598 - type: f1 value: 49.34896217536082 - task: type: Classification dataset: type: mteb/amazon_massive_intent name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 67.60928043039678 - type: f1 value: 64.34244712074538 - task: type: Classification dataset: type: mteb/amazon_massive_scenario name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 69.75453934095493 - type: f1 value: 68.39224867489249 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-p2p name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 31.862573504920082 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-s2s name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 27.511123551196803 - task: type: Reranking dataset: type: mteb/mind_small name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 30.99145104942086 - type: mrr value: 32.03606480418627 - task: type: Retrieval dataset: type: nfcorpus name: MTEB NFCorpus config: default split: test revision: None metrics: - type: map_at_1 value: 5.015 - type: map_at_10 value: 11.054 - type: map_at_100 value: 13.773 - type: map_at_1000 value: 15.082999999999998 - type: map_at_3 value: 8.253 - type: map_at_5 value: 9.508999999999999 - type: mrr_at_1 value: 42.105 - type: mrr_at_10 value: 50.44499999999999 - type: mrr_at_100 value: 51.080000000000005 - type: mrr_at_1000 value: 51.129999999999995 - type: mrr_at_3 value: 48.555 - type: mrr_at_5 value: 49.84 - type: ndcg_at_1 value: 40.402 - type: ndcg_at_10 value: 30.403000000000002 - type: ndcg_at_100 value: 28.216 - type: ndcg_at_1000 value: 37.021 - type: ndcg_at_3 value: 35.53 - type: ndcg_at_5 value: 33.202999999999996 - type: precision_at_1 value: 42.105 - type: precision_at_10 value: 22.353 - type: precision_at_100 value: 7.266 - type: precision_at_1000 value: 2.011 - type: precision_at_3 value: 32.921 - type: precision_at_5 value: 28.297 - type: recall_at_1 value: 5.015 - type: recall_at_10 value: 14.393 - type: recall_at_100 value: 28.893 - type: recall_at_1000 value: 60.18 - type: recall_at_3 value: 9.184000000000001 - type: recall_at_5 value: 11.39 - task: type: Retrieval dataset: type: nq name: MTEB NQ config: default split: test revision: None metrics: - type: map_at_1 value: 29.524 - type: 
map_at_10 value: 44.182 - type: map_at_100 value: 45.228 - type: map_at_1000 value: 45.265 - type: map_at_3 value: 39.978 - type: map_at_5 value: 42.482 - type: mrr_at_1 value: 33.256 - type: mrr_at_10 value: 46.661 - type: mrr_at_100 value: 47.47 - type: mrr_at_1000 value: 47.496 - type: mrr_at_3 value: 43.187999999999995 - type: mrr_at_5 value: 45.330999999999996 - type: ndcg_at_1 value: 33.227000000000004 - type: ndcg_at_10 value: 51.589 - type: ndcg_at_100 value: 56.043 - type: ndcg_at_1000 value: 56.937000000000005 - type: ndcg_at_3 value: 43.751 - type: ndcg_at_5 value: 47.937000000000005 - type: precision_at_1 value: 33.227000000000004 - type: precision_at_10 value: 8.556999999999999 - type: precision_at_100 value: 1.103 - type: precision_at_1000 value: 0.11900000000000001 - type: precision_at_3 value: 19.921 - type: precision_at_5 value: 14.396999999999998 - type: recall_at_1 value: 29.524 - type: recall_at_10 value: 71.615 - type: recall_at_100 value: 91.056 - type: recall_at_1000 value: 97.72800000000001 - type: recall_at_3 value: 51.451 - type: recall_at_5 value: 61.119 - task: type: Retrieval dataset: type: quora name: MTEB QuoraRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 69.596 - type: map_at_10 value: 83.281 - type: map_at_100 value: 83.952 - type: map_at_1000 value: 83.97200000000001 - type: map_at_3 value: 80.315 - type: map_at_5 value: 82.223 - type: mrr_at_1 value: 80.17 - type: mrr_at_10 value: 86.522 - type: mrr_at_100 value: 86.644 - type: mrr_at_1000 value: 86.64500000000001 - type: mrr_at_3 value: 85.438 - type: mrr_at_5 value: 86.21799999999999 - type: ndcg_at_1 value: 80.19 - type: ndcg_at_10 value: 87.19 - type: ndcg_at_100 value: 88.567 - type: ndcg_at_1000 value: 88.70400000000001 - type: ndcg_at_3 value: 84.17999999999999 - type: ndcg_at_5 value: 85.931 - type: precision_at_1 value: 80.19 - type: precision_at_10 value: 13.209000000000001 - type: precision_at_100 value: 1.518 - type: precision_at_1000 value: 0.157 - type: precision_at_3 value: 36.717 - type: precision_at_5 value: 24.248 - type: recall_at_1 value: 69.596 - type: recall_at_10 value: 94.533 - type: recall_at_100 value: 99.322 - type: recall_at_1000 value: 99.965 - type: recall_at_3 value: 85.911 - type: recall_at_5 value: 90.809 - task: type: Clustering dataset: type: mteb/reddit-clustering name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 49.27650627571912 - task: type: Clustering dataset: type: mteb/reddit-clustering-p2p name: MTEB RedditClusteringP2P config: default split: test revision: 282350215ef01743dc01b456c7f5241fa8937f16 metrics: - type: v_measure value: 57.08550946534183 - task: type: Retrieval dataset: type: scidocs name: MTEB SCIDOCS config: default split: test revision: None metrics: - type: map_at_1 value: 4.568 - type: map_at_10 value: 10.862 - type: map_at_100 value: 12.757 - type: map_at_1000 value: 13.031 - type: map_at_3 value: 7.960000000000001 - type: map_at_5 value: 9.337 - type: mrr_at_1 value: 22.5 - type: mrr_at_10 value: 32.6 - type: mrr_at_100 value: 33.603 - type: mrr_at_1000 value: 33.672000000000004 - type: mrr_at_3 value: 29.299999999999997 - type: mrr_at_5 value: 31.25 - type: ndcg_at_1 value: 22.5 - type: ndcg_at_10 value: 18.605 - type: ndcg_at_100 value: 26.029999999999998 - type: ndcg_at_1000 value: 31.256 - type: ndcg_at_3 value: 17.873 - type: ndcg_at_5 value: 15.511 - type: precision_at_1 value: 22.5 - type: precision_at_10 
value: 9.58 - type: precision_at_100 value: 2.033 - type: precision_at_1000 value: 0.33 - type: precision_at_3 value: 16.633 - type: precision_at_5 value: 13.54 - type: recall_at_1 value: 4.568 - type: recall_at_10 value: 19.402 - type: recall_at_100 value: 41.277 - type: recall_at_1000 value: 66.963 - type: recall_at_3 value: 10.112 - type: recall_at_5 value: 13.712 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 83.31992291680787 - type: cos_sim_spearman value: 76.7212346922664 - type: euclidean_pearson value: 80.42189271706478 - type: euclidean_spearman value: 76.7212342532493 - type: manhattan_pearson value: 80.33171093031578 - type: manhattan_spearman value: 76.63192883074694 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 83.16654278886763 - type: cos_sim_spearman value: 73.66390263429565 - type: euclidean_pearson value: 79.7485360086639 - type: euclidean_spearman value: 73.66389870373436 - type: manhattan_pearson value: 79.73652237443706 - type: manhattan_spearman value: 73.65296117151647 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 82.40389689929246 - type: cos_sim_spearman value: 83.29727595993955 - type: euclidean_pearson value: 82.23970587854079 - type: euclidean_spearman value: 83.29727595993955 - type: manhattan_pearson value: 82.18823600831897 - type: manhattan_spearman value: 83.20746192209594 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 81.73505246913413 - type: cos_sim_spearman value: 79.1686548248754 - type: euclidean_pearson value: 80.48889135993412 - type: euclidean_spearman value: 79.16864112930354 - type: manhattan_pearson value: 80.40720651057302 - type: manhattan_spearman value: 79.0640155089286 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 86.3953512879065 - type: cos_sim_spearman value: 87.29947322714338 - type: euclidean_pearson value: 86.59759438529645 - type: euclidean_spearman value: 87.29947511092824 - type: manhattan_pearson value: 86.52097806169155 - type: manhattan_spearman value: 87.22987242146534 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 82.48565753792056 - type: cos_sim_spearman value: 83.6049720319893 - type: euclidean_pearson value: 82.56452023172913 - type: euclidean_spearman value: 83.60490168191697 - type: manhattan_pearson value: 82.58079941137872 - type: manhattan_spearman value: 83.60975807374051 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 88.18239976618212 - type: cos_sim_spearman value: 88.23061724730616 - type: euclidean_pearson value: 87.78482472776658 - type: euclidean_spearman value: 88.23061724730616 - type: manhattan_pearson value: 87.75059641730239 - type: 
manhattan_spearman value: 88.22527413524622 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 63.42816418706765 - type: cos_sim_spearman value: 63.4569864520124 - type: euclidean_pearson value: 64.35405409953853 - type: euclidean_spearman value: 63.4569864520124 - type: manhattan_pearson value: 63.96649236073056 - type: manhattan_spearman value: 63.01448583722708 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 83.41659638047614 - type: cos_sim_spearman value: 84.03893866106175 - type: euclidean_pearson value: 84.2251203953798 - type: euclidean_spearman value: 84.03893866106175 - type: manhattan_pearson value: 84.22733643205514 - type: manhattan_spearman value: 84.06504411263612 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 79.75608022582414 - type: mrr value: 94.0947732369301 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 50.161 - type: map_at_10 value: 59.458999999999996 - type: map_at_100 value: 60.156 - type: map_at_1000 value: 60.194 - type: map_at_3 value: 56.45400000000001 - type: map_at_5 value: 58.165 - type: mrr_at_1 value: 53.333 - type: mrr_at_10 value: 61.050000000000004 - type: mrr_at_100 value: 61.586 - type: mrr_at_1000 value: 61.624 - type: mrr_at_3 value: 58.889 - type: mrr_at_5 value: 60.122 - type: ndcg_at_1 value: 53.333 - type: ndcg_at_10 value: 63.888999999999996 - type: ndcg_at_100 value: 66.963 - type: ndcg_at_1000 value: 68.062 - type: ndcg_at_3 value: 59.01 - type: ndcg_at_5 value: 61.373999999999995 - type: precision_at_1 value: 53.333 - type: precision_at_10 value: 8.633000000000001 - type: precision_at_100 value: 1.027 - type: precision_at_1000 value: 0.11199999999999999 - type: precision_at_3 value: 23.111 - type: precision_at_5 value: 15.467 - type: recall_at_1 value: 50.161 - type: recall_at_10 value: 75.922 - type: recall_at_100 value: 90.0 - type: recall_at_1000 value: 98.667 - type: recall_at_3 value: 62.90599999999999 - type: recall_at_5 value: 68.828 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.81188118811882 - type: cos_sim_ap value: 95.11619225962413 - type: cos_sim_f1 value: 90.35840484603736 - type: cos_sim_precision value: 91.23343527013252 - type: cos_sim_recall value: 89.5 - type: dot_accuracy value: 99.81188118811882 - type: dot_ap value: 95.11619225962413 - type: dot_f1 value: 90.35840484603736 - type: dot_precision value: 91.23343527013252 - type: dot_recall value: 89.5 - type: euclidean_accuracy value: 99.81188118811882 - type: euclidean_ap value: 95.11619225962413 - type: euclidean_f1 value: 90.35840484603736 - type: euclidean_precision value: 91.23343527013252 - type: euclidean_recall value: 89.5 - type: manhattan_accuracy value: 99.80891089108911 - type: manhattan_ap value: 95.07294266220966 - type: manhattan_f1 value: 90.21794221996959 - type: manhattan_precision value: 91.46968139773895 - type: 
manhattan_recall value: 89.0 - type: max_accuracy value: 99.81188118811882 - type: max_ap value: 95.11619225962413 - type: max_f1 value: 90.35840484603736 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 55.3481874105239 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 34.421291695525 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 49.98746633276634 - type: mrr value: 50.63143249724133 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 31.009961979844036 - type: cos_sim_spearman value: 30.558416108881044 - type: dot_pearson value: 31.009964941134253 - type: dot_spearman value: 30.545760761761393 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.207 - type: map_at_10 value: 1.6 - type: map_at_100 value: 8.594 - type: map_at_1000 value: 20.213 - type: map_at_3 value: 0.585 - type: map_at_5 value: 0.9039999999999999 - type: mrr_at_1 value: 78.0 - type: mrr_at_10 value: 87.4 - type: mrr_at_100 value: 87.4 - type: mrr_at_1000 value: 87.4 - type: mrr_at_3 value: 86.667 - type: mrr_at_5 value: 87.06700000000001 - type: ndcg_at_1 value: 73.0 - type: ndcg_at_10 value: 65.18 - type: ndcg_at_100 value: 49.631 - type: ndcg_at_1000 value: 43.498999999999995 - type: ndcg_at_3 value: 71.83800000000001 - type: ndcg_at_5 value: 69.271 - type: precision_at_1 value: 78.0 - type: precision_at_10 value: 69.19999999999999 - type: precision_at_100 value: 50.980000000000004 - type: precision_at_1000 value: 19.426 - type: precision_at_3 value: 77.333 - type: precision_at_5 value: 74.0 - type: recall_at_1 value: 0.207 - type: recall_at_10 value: 1.822 - type: recall_at_100 value: 11.849 - type: recall_at_1000 value: 40.492 - type: recall_at_3 value: 0.622 - type: recall_at_5 value: 0.9809999999999999 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 2.001 - type: map_at_10 value: 10.376000000000001 - type: map_at_100 value: 16.936999999999998 - type: map_at_1000 value: 18.615000000000002 - type: map_at_3 value: 5.335999999999999 - type: map_at_5 value: 7.374 - type: mrr_at_1 value: 20.408 - type: mrr_at_10 value: 38.29 - type: mrr_at_100 value: 39.33 - type: mrr_at_1000 value: 39.347 - type: mrr_at_3 value: 32.993 - type: mrr_at_5 value: 36.973 - type: ndcg_at_1 value: 17.347 - type: ndcg_at_10 value: 23.515 - type: ndcg_at_100 value: 37.457 - type: ndcg_at_1000 value: 49.439 - type: ndcg_at_3 value: 22.762999999999998 - type: ndcg_at_5 value: 22.622 - type: precision_at_1 value: 20.408 - type: precision_at_10 value: 22.448999999999998 - type: precision_at_100 value: 8.184 - type: precision_at_1000 value: 1.608 - type: precision_at_3 value: 25.85 - type: precision_at_5 value: 25.306 - type: recall_at_1 value: 2.001 - type: recall_at_10 value: 17.422 - type: 
recall_at_100 value: 51.532999999999994 - type: recall_at_1000 value: 87.466 - type: recall_at_3 value: 6.861000000000001 - type: recall_at_5 value: 10.502 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 71.54419999999999 - type: ap value: 14.372170450843907 - type: f1 value: 54.94420257390529 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 59.402942840973395 - type: f1 value: 59.4166538875571 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 41.569064336457906 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 85.31322644096085 - type: cos_sim_ap value: 72.14518894837381 - type: cos_sim_f1 value: 66.67489813557229 - type: cos_sim_precision value: 62.65954977953121 - type: cos_sim_recall value: 71.2401055408971 - type: dot_accuracy value: 85.31322644096085 - type: dot_ap value: 72.14521480685293 - type: dot_f1 value: 66.67489813557229 - type: dot_precision value: 62.65954977953121 - type: dot_recall value: 71.2401055408971 - type: euclidean_accuracy value: 85.31322644096085 - type: euclidean_ap value: 72.14520820485349 - type: euclidean_f1 value: 66.67489813557229 - type: euclidean_precision value: 62.65954977953121 - type: euclidean_recall value: 71.2401055408971 - type: manhattan_accuracy value: 85.21785778148656 - type: manhattan_ap value: 72.01177147657364 - type: manhattan_f1 value: 66.62594673833374 - type: manhattan_precision value: 62.0336669699727 - type: manhattan_recall value: 71.95250659630607 - type: max_accuracy value: 85.31322644096085 - type: max_ap value: 72.14521480685293 - type: max_f1 value: 66.67489813557229 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 89.12756626693057 - type: cos_sim_ap value: 86.05430786440826 - type: cos_sim_f1 value: 78.27759692216631 - type: cos_sim_precision value: 75.33466248931929 - type: cos_sim_recall value: 81.45980905451185 - type: dot_accuracy value: 89.12950673341872 - type: dot_ap value: 86.05431161145492 - type: dot_f1 value: 78.27759692216631 - type: dot_precision value: 75.33466248931929 - type: dot_recall value: 81.45980905451185 - type: euclidean_accuracy value: 89.12756626693057 - type: euclidean_ap value: 86.05431303247397 - type: euclidean_f1 value: 78.27759692216631 - type: euclidean_precision value: 75.33466248931929 - type: euclidean_recall value: 81.45980905451185 - type: manhattan_accuracy value: 89.04994760740482 - type: manhattan_ap value: 86.00860610892074 - type: manhattan_f1 value: 78.1846776005392 - type: manhattan_precision value: 76.10438839480975 - type: manhattan_recall value: 80.3818909762858 - type: max_accuracy value: 89.12950673341872 - type: max_ap value: 86.05431303247397 - type: max_f1 value: 
78.27759692216631 ---

\"Jina

The text embedding set trained by Jina AI.

## Quick Start The easiest way to get started is to use Jina AI's Embedding API. ## Intended Usage & Model Info This is an English, monolingual **embedding model** supporting an **8192-token sequence length**. It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of ALiBi to allow longer sequence lengths. The backbone is pretrained on the C4 dataset. The model is further trained on Jina AI's collection of more than 400 million sentence pairs and hard negatives. These pairs were obtained from various domains and were carefully selected through a thorough cleaning process. The embedding model was trained with a 512-token sequence length, but extrapolates to 8k (or even longer) thanks to ALiBi. This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG and LLM-based generative search, etc. This model has 33 million parameters, which enables lightning-fast and memory-efficient inference, while still delivering impressive performance. Additionally, we provide the following embedding models: - a 33-million-parameter English model **(you are here)**; - a 137-million-parameter English model; - a 161-million-parameter Chinese-English bilingual model; - a 161-million-parameter German-English bilingual model; - a Spanish-English bilingual model (coming soon). ## Data & Parameters See the Jina Embeddings V2 technical report. ## Usage **
Please apply mean pooling when integrating the model.**

### Why mean pooling? Mean pooling takes all token embeddings from the model output and averages them at the sentence/paragraph level. It has proven to be the most effective way to produce high-quality sentence embeddings. We offer an `encode` function to handle this. However, if you would like to do it without using the default `encode` function:
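A minimal mean-pooling sketch with plain transformers is shown below. The checkpoint id `jinaai/jina-embeddings-v2-small-en` is an assumption (the 33-million-parameter model this card appears to describe), and `trust_remote_code=True` is required per the troubleshooting note further down:

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average token embeddings over the sequence, ignoring padding positions.
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# Assumed checkpoint id -- substitute the model this card actually describes.
model_id = "jinaai/jina-embeddings-v2-small-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

sentences = ["How is the weather today?", "What is the current weather like today?"]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

embeddings = F.normalize(mean_pooling(outputs, inputs["attention_mask"]), p=2, dim=1)
print(embeddings.shape)  # (num_sentences, embedding_dim)
```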

You can use Jina Embedding models directly from the transformers package. If you only want to handle shorter sequences, such as 2k, pass the `max_length` parameter to the `encode` function: The latest sentence-transformers also supports Jina embeddings: ## Alternatives to Using Transformers Package 1. _Managed SaaS_: Get started with a free key on Jina AI's Embedding API. 2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploying them on AWS SageMaker. ## RAG Performance According to the latest blog post from LlamaIndex, > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out. ## Plans 1. Bilingual embedding models supporting more European & Asian languages, including Spanish, French, Italian and Japanese. 2. Multimodal embedding models to enable multimodal RAG applications. 3. High-performance rerankers. ## Troubleshooting **Loading of Model Code failed** If you forget to pass the `trust_remote_code=True` flag when calling `from_pretrained` or initializing the model via the `SentenceTransformer` class, you will receive an error that the model weights could not be initialized. This is caused by transformers falling back to creating a default BERT model instead of a jina-embedding model. ## Contact Join our Discord community and chat with other community members about ideas. ## Citation If you find Jina Embeddings useful in your research, please cite the following paper:", + "model_explanation_gemini": "Generates sentence embeddings for tasks like similarity, classification, retrieval, and clustering in English.\n\nFeatures: \n- Sentence embedding generation \n- Supports similarity, classification, retrieval, and clustering tasks \n- Optimized for English language \n- Evaluated on MTEB benchmarks \n\nComparison: \nPerforms competitively on MTEB benchmarks for tasks like classification (e.g., 82.9% accuracy on AmazonPolarity) and retrieval (e.g., 39.18 MAP@1000 on Ar" +} \ No newline at end of file diff --git a/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json b/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json new file mode 100644 index 0000000000000000000000000000000000000000..32af3ef49c69525410db48f0521a899a48c60513 --- /dev/null +++ b/model_data_json/ku-nlp_deberta-v2-tiny-japanese.json @@ -0,0 +1,22 @@ +{ + "model_id": "ku-nlp/deberta-v2-tiny-japanese", + "downloads": 80838, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "deberta-v2", + "fill-mask", + "deberta", + "ja", + "dataset:wikipedia", + "dataset:cc100", + "dataset:oscar", + "license:cc-by-sa-4.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: ja license: cc-by-sa-4.0 library_name: transformers tags: - deberta - deberta-v2 - fill-mask datasets: - wikipedia - cc100 - oscar metrics: - accuracy mask_token: \"[MASK]\" widget: - text: \"京都 大学 で 自然 言語 処理 を [MASK] する 。\" --- # Model Card for Japanese DeBERTa V2 tiny ## Model description This is a Japanese DeBERTa V2 tiny model pre-trained on Japanese Wikipedia, the Japanese portion of CC-100, and the Japanese portion of OSCAR. ## How to use You can use this model for masked language modeling as follows (see the sketch below): You can also fine-tune this model on downstream tasks. ## Tokenization The input text should be segmented into words by Juman++ in advance. Juman++ 2.0.0-rc3 was used for pre-training. Each word is tokenized into subwords by sentencepiece.
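To make the masked language modeling usage above concrete, here is a minimal fill-mask sketch. The model id comes from this card; per the tokenization note, the input must already be segmented into words by Juman++ (in practice via a binding such as pyknp), and the example sentence below is pre-segmented, matching the card's widget text:

```python
from transformers import pipeline

# The input sentence is pre-segmented into words (spaces between words),
# as the Juman++ requirement above demands.
fill_mask = pipeline("fill-mask", model="ku-nlp/deberta-v2-tiny-japanese")

text = "京都 大学 で 自然 言語 処理 を [MASK] する 。"
for candidate in fill_mask(text, top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```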
## Training data We used the following corpora for pre-training: - Japanese Wikipedia (as of 20221020, 3.2GB, 27M sentences, 1.3M documents) - Japanese portion of CC-100 (85GB, 619M sentences, 66M documents) - Japanese portion of OSCAR (54GB, 326M sentences, 25M documents) Note that we filtered out documents annotated with \"header\", \"footer\", or \"noisy\" tags in OSCAR. Also note that Japanese Wikipedia was duplicated 10 times to make the total size of the corpus comparable to that of CC-100 and OSCAR. As a result, the total size of the training data is 171GB. ## Training procedure We first segmented texts in the corpora into words using Juman++. Then, we built a sentencepiece model with 32000 tokens including words (JumanDIC) and subwords induced by the unigram language model of sentencepiece. We tokenized the segmented corpora into subwords using the sentencepiece model and trained the Japanese DeBERTa model using transformers library. The training took 33 hours using 8 NVIDIA A100-SXM4-40GB GPUs. The following hyperparameters were used during pre-training: - learning_rate: 1e-3 - per_device_train_batch_size: 128 - distributed_type: multi-GPU - num_devices: 8 - gradient_accumulation_steps: 6 - total_train_batch_size: 6,144 - max_seq_length: 512 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06 - lr_scheduler_type: linear schedule with warmup - training_steps: 100,000 - warmup_steps: 10,000 The accuracy of the trained model on the masked language modeling task was 0.593. The evaluation set consists of 5,000 randomly sampled documents from each of the training corpora. ## Acknowledgments This work was supported by Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) through General Collaboration Project no. jh221004, \"Developing a Platform for Constructing and Sharing of Large-Scale Japanese Language Models\". For training models, we used the mdx: a platform for the data-driven future.", + "model_explanation_gemini": "A Japanese DeBERTa V2 tiny model pre-trained for masked language modeling on Japanese Wikipedia, CC-100, and OSCAR datasets, with input text requiring Juman++ word segmentation.\n\nFeatures: \n- Language: Japanese \n- Architecture: DeBERTa V2 tiny \n- Task: Masked language modeling \n- Training data: Japanese Wikipedia, CC-100, OSCAR (171GB total) \n- Tokenization: Juman++ word segmentation + SentencePiece subwords \n- License" +} \ No newline at end of file diff --git a/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json b/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json new file mode 100644 index 0000000000000000000000000000000000000000..a11aaba8a45d0e367153baa1739aa806af24b5af --- /dev/null +++ b/model_data_json/laion_CLIP-ViT-L-14-laion2B-s32B-b82K.json @@ -0,0 +1,19 @@ +{ + "model_id": "laion/CLIP-ViT-L-14-laion2B-s32B-b82K", + "downloads": 71845, + "tags": [ + "open_clip", + "pytorch", + "tensorboard", + "safetensors", + "clip", + "zero-shot-image-classification", + "arxiv:2110.09456", + "arxiv:2111.09883", + "arxiv:1910.04867", + "license:mit", + "region:us" + ], + "description": "--- license: mit widget: - src: >- candidate_labels: playing music, playing sports example_title: Cat & Dog library_name: open_clip pipeline_tag: zero-shot-image-classification --- # Model Card for CLIP ViT-L/14 - LAION-2B # Table of Contents 1. Model Details 2. Uses 3. Training Details 4. Evaluation 5. Acknowledgements 6. Citation 7. 
How To Get Started With the Model # Model Details ## Model Description A CLIP ViT-L/14 model trained with the LAION-2B English subset of LAION-5B using OpenCLIP. Model training ('babysitting') was done by Ross Wightman on the JUWELS Booster supercomputer. See acknowledgements below. # Uses As per the original OpenAI CLIP model card, this model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such models. The OpenAI CLIP paper includes a discussion of potential downstream impacts to provide an example for this sort of analysis. Additionally, the LAION-5B blog and upcoming paper include additional discussion as it relates specifically to the training dataset. ## Direct Use Zero-shot image classification, image and text retrieval, among others. ## Downstream Use Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, among others. ## Out-of-Scope Use As per the OpenAI models, **Any** deployed use case of the model - whether commercial or not - is currently out of scope. Non-deployed use cases, such as image search in a constrained environment, are also not recommended unless there is thorough in-domain testing of the model with a specific, fixed class taxonomy. This is because our safety assessment demonstrated a high need for task-specific testing, especially given the variability of CLIP’s performance with different class taxonomies. This makes untested and unconstrained deployment of the model in any use case currently potentially harmful. Certain use cases which would fall under the domain of surveillance and facial recognition are always out-of-scope regardless of the performance of the model. This is because the use of artificial intelligence for tasks such as these can be premature currently, given the lack of testing norms and checks to ensure its fair use. Since the model has not been purposefully trained in or evaluated on any languages other than English, its use should be limited to English language use cases. Further to the above notice, the LAION-5B dataset used in training of these models has additional considerations; see below. # Training Details ## Training Data This model was trained with the 2 billion sample English subset of LAION-5B. **IMPORTANT NOTE:** The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale datasets crawled from the publicly available internet. Our recommendation is therefore to use the dataset for research purposes. Be aware that this large-scale dataset is uncurated. Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer. Therefore, please use the demo links with caution and at your own risk. It is possible to extract a “safe” subset by filtering out samples based on the safety tags (using a customized trained NSFW classifier that we built). While this strongly reduces the chance of encountering potentially harmful content when viewing, we cannot entirely exclude the possibility of harmful content still being present in safe mode, so the warning holds there as well.
We think that providing the dataset openly to broad research and other interested communities will allow for transparent investigation of the benefits that come along with training large-scale models, as well as pitfalls and dangers that may stay unreported or unnoticed when working with closed large datasets that remain restricted to a small community. While providing our dataset openly, we do not recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress. ## Training Procedure The model was trained on 384 A100 GPUs using 200M sample 'virtual' epochs where dataset shards were sampled with replacement. The model was trained with 160 virtual epochs for a total of 32B samples seen. The first 68 epochs were trained with float16 AMP, global batch size 79K (208 per GPU). Training initially ran to epoch 75, where the loss spiked and training failed with NaN. Romain Beaumont was training H/14 and g/14 models at the same time on the Stability cluster and hit similar instabilities. Collectively we tried restarts with: * a different dataset shuffle seed * a different LR * gradient clipping * modifications to the architecture * norm modifications (stable norm for final, post-embed norm for text transformer) as per suggestions from Phil Wang * extra attention block norms à la Normformer * scaled cosine attention à la Swin-V2 None of the above ended up working. Most blew up within the same epoch as the original, with the exception of the architecture mods. * The Normformer mods significantly altered the network such that resuming did not quickly converge to previous performance; this was abandoned but might be worth trying from the start. * Scaled cosine attention initially looked promising and lasted until epoch 90, before the loss suddenly increased and appeared to remain 'stuck'. In the end, restarting at epoch 69 with fp32 precision solved all instabilities and training continued from there with global batch size 86k (224 per GPU). On A100 GPUs, fp32 had a minimal impact on the throughput once tf32 matmuls were enabled in PyTorch; it was approximately 10% slower than float16 AMP. Romain similarly changed the precision but ended up using bf16 AMP to resolve issues. ### Slurm Script # Evaluation Evaluation done with code in the LAION CLIP Benchmark suite. ## Testing Data, Factors & Metrics ### Testing Data The testing is performed with VTAB+ (a combination of VTAB with additional robustness datasets) for classification, and COCO and Flickr for retrieval. **TODO** - more detail ## Results The model achieves a 75.3 zero-shot top-1 accuracy on ImageNet-1k. An initial round of benchmarks has been performed on a wider range of datasets, currently viewable at **TODO** - create table for just this model's metrics. # Acknowledgements Acknowledging the Gauss Centre for Supercomputing e.V. for funding this part of the work by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS Booster at Jülich Supercomputing Centre (JSC). # Citation **BibTeX:** LAION-5B OpenAI CLIP paper OpenCLIP software # How to Get Started with the Model Use the code below to get started with the model.
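Pending the getting-started snippets flagged as a TODO below, here is a hedged zero-shot classification sketch with OpenCLIP. It assumes this checkpoint loads through the `hf-hub:` prefix, as with other LAION OpenCLIP releases, and `cat.jpg` is a placeholder for any local image:

```python
import torch
from PIL import Image
import open_clip

# Assumed to load via the hf-hub: prefix, as with other LAION OpenCLIP models.
repo = "hf-hub:laion/CLIP-ViT-L-14-laion2B-s32B-b82K"
model, _, preprocess = open_clip.create_model_and_transforms(repo)
tokenizer = open_clip.get_tokenizer(repo)
model.eval()

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)  # placeholder image path
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize, then compare via cosine similarity scaled into a softmax.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # per-label probabilities for the candidate captions
```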
** TODO ** - Hugging Face transformers, OpenCLIP, and timm getting started snippets", + "model_explanation_gemini": "Performs zero-shot image classification and text-image retrieval by matching images with text prompts without task-specific training.\n\n**Model Features:**\n- Architecture: CLIP ViT-L/14 (Vision Transformer, Large/14 patch size)\n- Training Data: LAION-2B (English subset of LAION-5B)\n- Capabilities: Zero-shot image classification, image/text retrieval, fine-tuning for downstream tasks\n- Training: Done via OpenCLIP on JUWELS Booster supercomputer" +} \ No newline at end of file diff --git a/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json b/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json new file mode 100644 index 0000000000000000000000000000000000000000..fed4e91b0d043a2714cb8a85fdbfdf4fc7b07fb7 --- /dev/null +++ b/model_data_json/laion_CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup.json @@ -0,0 +1,18 @@ +{ + "model_id": "laion/CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup", + "downloads": 81153, + "tags": [ + "open_clip", + "tensorboard", + "safetensors", + "zero-shot-image-classification", + "clip", + "arxiv:2201.03545", + "arxiv:2210.08402", + "arxiv:1910.04867", + "license:mit", + "region:us" + ], + "description": "--- tags: - zero-shot-image-classification - clip license: mit library_name: open_clip pipeline_tag: zero-shot-image-classification --- # Model card for CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup # Table of Contents 1. Model Details 2. Uses 3. Training Details 4. Evaluation 5. Acknowledgements 6. Citation # Model Details ## Model Description A series of CLIP ConvNeXt-Large (w/ extra text depth, vision MLP head) models trained on the LAION-2B (english) subset of LAION-5B using OpenCLIP. The models utilize: * the timm ConvNeXt-Large model as the image tower * an MLP head in the vision tower instead of the single projection of other CLIP models * a text tower with the same width but 4 more layers of depth than the ViT-L / RN50x16 models (depth 16, embed dim 768). This 320x320 resolution model is a soup (weight average) of 3 fine-tunes of CLIP-convnext_large_d.laion2B-s26B-b102K-augreg at a higher resolution. It is an average of 3 fine-tunes from the final checkpoint of the original 256x256 training run w/ an additional ~2-3B samples for each fine-tune and a lower learning rate. Each fine-tune used a different learning rate (1e-4, 6e-5, 5e-5) and a different number of samples (3.2B, 2B, 2.5B). At 320x320, the ConvNeXt-Large-D is significantly more efficient than the L/14 model at 336x336 that OpenAI fine-tuned; the L/14-336 model has 2.5x the GMACs, 2.8x the activations, and 1.22x the parameters. | Model | Dataset | Resolution | AugReg | Top-1 ImageNet Zero-Shot (%) | | ----- | ------- | ---------- | ------------ | --------- | | convnext_large_d.laion2b_s26b_b102k-augreg | LAION-2B | 256x256 | RRC (0.33, 1.0), RE (0.35), SD (0.1), D(0.1) | 75.9 | | convnext_large_d_320.laion2b_s29b_b131k-ft | LAION-2B | 320x320 | RRC (0.5, 1.0), RE (0.4), SD (0.1), D(0.0) | 76.6 | | convnext_large_d_320.laion2b_s29b_b131k-ft-soup | LAION-2B | 320x320 | RRC (0.5, 1.0), RE (0.4), SD (0.1), D(0.0) | 76.9 | RRC = Random Resize Crop (crop pcts), RE = Random Erasing (prob), SD = Stochastic Depth (prob) -- image tower only, D = Dropout (prob) -- image tower head only. LAION-A = LAION Aesthetic, an ~900M sample subset of LAION-2B with pHash dedupe and aesthetic score filtering.
Model training was done by Ross Wightman on the stability.ai cluster. # Uses As per the original OpenAI CLIP model card, this model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such models. The OpenAI CLIP paper includes a discussion of potential downstream impacts to provide an example for this sort of analysis. Additionally, the LAION-5B blog and upcoming paper include additional discussion as it relates specifically to the training dataset. ## Direct Use Zero-shot image classification, image and text retrieval, among others. ## Downstream Use Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, among others. ## Out-of-Scope Use As per the OpenAI models, **Any** deployed use case of the model - whether commercial or not - is currently out of scope. Non-deployed use cases, such as image search in a constrained environment, are also not recommended unless there is thorough in-domain testing of the model with a specific, fixed class taxonomy. This is because our safety assessment demonstrated a high need for task-specific testing, especially given the variability of CLIP’s performance with different class taxonomies. This makes untested and unconstrained deployment of the model in any use case currently potentially harmful. Certain use cases which would fall under the domain of surveillance and facial recognition are always out-of-scope regardless of the performance of the model. This is because the use of artificial intelligence for tasks such as these can be premature currently, given the lack of testing norms and checks to ensure its fair use. Since the model has not been purposefully trained in or evaluated on any languages other than English, its use should be limited to English language use cases. Further to the above notice, the LAION-5B dataset used in training of these models has additional considerations; see below. # Training Details ## Training Data This model was trained with LAION-2B -- a 2 billion sample English subset of LAION-5B. **IMPORTANT NOTE:** The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale datasets crawled from the publicly available internet. Our recommendation is therefore to use the dataset for research purposes. Be aware that this large-scale dataset is uncurated. Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer. Therefore, please use the demo links with caution and at your own risk. It is possible to extract a “safe” subset by filtering out samples based on the safety tags (using a customized trained NSFW classifier that we built). While this strongly reduces the chance of encountering potentially harmful content when viewing, we cannot entirely exclude the possibility of harmful content still being present in safe mode, so the warning holds there as well.
We think that providing the dataset openly to broad research and other interested communities will allow for transparent investigation of the benefits that come along with training large-scale models, as well as pitfalls and dangers that may stay unreported or unnoticed when working with closed large datasets that remain restricted to a small community. While providing our dataset openly, we do not recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress. ## Training Procedure All 320x320 model fine-tunes were trained with a global batch size of 131072 for 10-16 checkpoint intervals of 203.7M samples, for a total of ~2-3B samples seen per fine-tune. For the 320x320 models, a slurm script w/ srun below was used on 64 8-GPU (A100 40GB) nodes (Stability). # Evaluation Evaluation done with code in the LAION CLIP Benchmark suite. ## Testing Data, Factors & Metrics ### Testing Data The testing is performed with VTAB+ (a combination of VTAB with additional robustness datasets) for classification, and COCO and Flickr for retrieval. ## Results The models achieve between 75.9 and 76.9 top-1 zero-shot accuracy on ImageNet-1k. Zero-shot curve of the original from-scratch 256x256 training: An initial round of benchmarks has been performed on a wider range of datasets, to be viewable at # Acknowledgements Acknowledging stability.ai for the compute used to train this model. # Citation **BibTeX:** LAION-5B OpenCLIP software OpenAI CLIP paper", + "model_explanation_gemini": "A CLIP-based model for zero-shot image classification and retrieval, using a ConvNeXt-Large architecture with enhanced vision and text towers, trained on LAION-2B and fine-tuned at 320x320 resolution for improved efficiency and performance. \n\n**Features:** \n- **Architecture:** ConvNeXt-Large (vision tower) with MLP head, deeper text tower (16 layers, 768 embed dim). \n- **Training Data:** LAION-2B (English" +} \ No newline at end of file diff --git a/model_data_json/laion_clap-htsat-fused.json b/model_data_json/laion_clap-htsat-fused.json new file mode 100644 index 0000000000000000000000000000000000000000..8267f50cf1cfc4098c9660195c1c4852ed4c2f70 --- /dev/null +++ b/model_data_json/laion_clap-htsat-fused.json @@ -0,0 +1,17 @@ +{ + "model_id": "laion/clap-htsat-fused", + "downloads": 81868, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "clap", + "feature-extraction", + "arxiv:2211.06687", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 --- # Model card for CLAP Model card for CLAP: Contrastive Language-Audio Pretraining # Table of Contents 0. TL;DR 1. Model Details 2. Usage 3. Uses 4. Citation # TL;DR The abstract of the paper states that: > Contrastive learning has shown remarkable success in the field of multimodal representation learning. In this paper, we propose a pipeline of contrastive language-audio pretraining to develop an audio representation by combining audio data with natural language descriptions. To accomplish this target, we first release LAION-Audio-630K, a large collection of 633,526 audio-text pairs from different data sources. Second, we construct a contrastive language-audio pretraining model by considering different audio encoders and text encoders.
We incorporate the feature fusion mechanism and keyword-to-caption augmentation into the model design to further enable the model to process audio inputs of variable lengths and enhance the performance. Third, we perform comprehensive experiments to evaluate our model across three tasks: text-to-audio retrieval, zero-shot audio classification, and supervised audio classification. The results demonstrate that our model achieves superior performance in text-to-audio retrieval task. In audio classification tasks, the model achieves state-of-the-art performance in the zero-shot setting and is able to obtain performance comparable to models' results in the non-zero-shot setting. LAION-Audio-630K and the proposed model are both available to the public. # Usage You can use this model for zero-shot audio classification or for extracting audio and/or textual features. # Uses ## Perform zero-shot audio classification ### Using ## Run the model: You can also get the audio and text embeddings using ### Run the model on CPU: ### Run the model on GPU: # Citation If you are using this model for your work, please consider citing the original paper:", + "model_explanation_gemini": "\"Contrastive Language-Audio Pretraining (CLAP) model for text-to-audio retrieval, zero-shot audio classification, and supervised audio classification, leveraging fused audio-text features and variable-length input processing.\"\n\n**Features**: \n1. Processes audio-text pairs for multimodal representation learning. \n2. Supports variable-length audio inputs via feature fusion. \n3. Enables zero-shot audio classification and text-to-audio retrieval. \n4. Outperforms others in retrieval and zero-shot tasks, matching" +} \ No newline at end of file diff --git a/model_data_json/laion_clap-htsat-unfused.json b/model_data_json/laion_clap-htsat-unfused.json new file mode 100644 index 0000000000000000000000000000000000000000..937aa81811f1028c3eccd2b830199967007e62cc --- /dev/null +++ b/model_data_json/laion_clap-htsat-unfused.json @@ -0,0 +1,16 @@ +{ + "model_id": "laion/clap-htsat-unfused", + "downloads": 72130, + "tags": [ + "transformers", + "pytorch", + "clap", + "feature-extraction", + "arxiv:2211.06687", + "license:apache-2.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 --- # Model card for CLAP Model card for CLAP: Contrastive Language-Audio Pretraining # Table of Contents 0. TL;DR 1. Model Details 2. Usage 3. Uses 4. Citation # TL;DR The abstract of the paper states that: > Contrastive learning has shown remarkable success in the field of multimodal representation learning. In this paper, we propose a pipeline of contrastive language-audio pretraining to develop an audio representation by combining audio data with natural language descriptions. To accomplish this target, we first release LAION-Audio-630K, a large collection of 633,526 audio-text pairs from different data sources. Second, we construct a contrastive language-audio pretraining model by considering different audio encoders and text encoders. We incorporate the feature fusion mechanism and keyword-to-caption augmentation into the model design to further enable the model to process audio inputs of variable lengths and enhance the performance. Third, we perform comprehensive experiments to evaluate our model across three tasks: text-to-audio retrieval, zero-shot audio classification, and supervised audio classification.
The results demonstrate that our model achieves superior performance in text-to-audio retrieval task. In audio classification tasks, the model achieves state-of-the-art performance in the zero-shot setting and is able to obtain performance comparable to models' results in the non-zero-shot setting. LAION-Audio-630K and the proposed model are both available to the public. # Usage You can use this model for zero-shot audio classification or for extracting audio and/or textual features. # Uses ## Perform zero-shot audio classification ### Using ## Run the model: You can also get the audio and text embeddings using ### Run the model on CPU: ### Run the model on GPU: # Citation If you are using this model for your work, please consider citing the original paper:", + "model_explanation_gemini": "\"Contrastive Language-Audio Pretraining (CLAP) model for text-to-audio retrieval, zero-shot audio classification, and supervised audio classification, trained on 633,526 audio-text pairs with feature fusion and keyword augmentation.\"\n\nFeatures: \n- Processes variable-length audio inputs \n- Generates audio/text embeddings \n- Supports zero-shot classification \n- Compatible with CPU/GPU inference \n\nComparison: \nOutperforms other models in text-to-audio retrieval and achieves state-of-the-art zero" +} \ No newline at end of file diff --git a/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json b/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json new file mode 100644 index 0000000000000000000000000000000000000000..ce1e3c9752f83a63f0d541846b1851c35f89ec60 --- /dev/null +++ b/model_data_json/llm-book_bert-base-japanese-v3-ner-wikipedia-dataset.json @@ -0,0 +1,18 @@ +{ + "model_id": "llm-book/bert-base-japanese-v3-ner-wikipedia-dataset", + "downloads": 78162, + "tags": [ + "transformers", + "pytorch", + "bert", + "token-classification", + "ja", + "dataset:llm-book/ner-wikipedia-dataset", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - ja license: apache-2.0 library_name: transformers datasets: - llm-book/ner-wikipedia-dataset pipeline_tag: token-classification metrics: - seqeval - precision - recall - f1 --- # llm-book/bert-base-japanese-v3-ner-wikipedia-dataset This is the named entity recognition model introduced in Chapter 6 of 大規模言語モデル入門 (Introduction to Large Language Models). It was built by fine-tuning cl-tohoku/bert-base-japanese-v3 on llm-book/ner-wikipedia-dataset. ## Related Links * GitHub repository * Colab notebook * Dataset * 大規模言語モデル入門 (Amazon.co.jp) * 大規模言語モデル入門 (gihyo.jp) ## Usage ## License Apache License 2.0", + "model_explanation_gemini": "Performs Japanese named entity recognition (NER) by fine-tuning 'bert-base-japanese-v3' on a Wikipedia-derived dataset.\n\nModel Features: \n- Language: Japanese \n- Base Model: cl-tohoku/bert-base-japanese-v3 \n- Fine-tuning Dataset: llm-book/ner-wikipedia-dataset \n- Task: Token-classification (NER) \n- Metrics: seqeval, precision, recall, f1 \n- License: Apache 2.0 \n\nComparison:" +} \ No newline at end of file diff --git a/model_data_json/llmrails_ember-v1.json b/model_data_json/llmrails_ember-v1.json new file mode 100644 index 0000000000000000000000000000000000000000..9f53718bc066d95e811b44cbb62b226441172e81 --- /dev/null +++ b/model_data_json/llmrails_ember-v1.json @@ -0,0 +1,26 @@ +{ + "model_id": "llmrails/ember-v1", + "downloads": 71420, + "tags": [ + "sentence-transformers", + "pytorch", + "safetensors", + "bert", + "feature-extraction", + "mteb", + "sentence-similarity", + "transformers", + "en", + "arxiv:2205.12035", +
"arxiv:2209.11055", + "doi:10.57967/hf/2919", + "license:mit", + "model-index", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - mteb - sentence-transformers - feature-extraction - sentence-similarity - transformers language: en license: mit model-index: - name: ember_v1 results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 76.05970149253731 - type: ap value: 38.76045348512767 - type: f1 value: 69.8824007294685 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 91.977 - type: ap value: 88.63507587170176 - type: f1 value: 91.9524133311038 - task: type: Classification dataset: type: mteb/amazon_reviews_multi name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 47.938 - type: f1 value: 47.58273047536129 - task: type: Retrieval dataset: type: arguana name: MTEB ArguAna config: default split: test revision: None metrics: - type: map_at_1 value: 41.252 - type: map_at_10 value: 56.567 - type: map_at_100 value: 57.07600000000001 - type: map_at_1000 value: 57.08 - type: map_at_3 value: 52.394 - type: map_at_5 value: 55.055 - type: mrr_at_1 value: 42.39 - type: mrr_at_10 value: 57.001999999999995 - type: mrr_at_100 value: 57.531 - type: mrr_at_1000 value: 57.535000000000004 - type: mrr_at_3 value: 52.845 - type: mrr_at_5 value: 55.47299999999999 - type: ndcg_at_1 value: 41.252 - type: ndcg_at_10 value: 64.563 - type: ndcg_at_100 value: 66.667 - type: ndcg_at_1000 value: 66.77 - type: ndcg_at_3 value: 56.120000000000005 - type: ndcg_at_5 value: 60.889 - type: precision_at_1 value: 41.252 - type: precision_at_10 value: 8.982999999999999 - type: precision_at_100 value: 0.989 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 22.309 - type: precision_at_5 value: 15.690000000000001 - type: recall_at_1 value: 41.252 - type: recall_at_10 value: 89.82900000000001 - type: recall_at_100 value: 98.86200000000001 - type: recall_at_1000 value: 99.644 - type: recall_at_3 value: 66.927 - type: recall_at_5 value: 78.45 - task: type: Clustering dataset: type: mteb/arxiv-clustering-p2p name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 48.5799968717232 - task: type: Clustering dataset: type: mteb/arxiv-clustering-s2s name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 43.142844164856136 - task: type: Reranking dataset: type: mteb/askubuntudupquestions-reranking name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map value: 64.45997990276463 - type: mrr value: 77.85560392208592 - task: type: STS dataset: type: mteb/biosses-sts name: MTEB BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 86.38299310075898 - type: cos_sim_spearman value: 85.81038898286454 - type: euclidean_pearson value: 84.28002556389774 - type: euclidean_spearman value: 
85.80315990248238 - type: manhattan_pearson value: 83.9755390675032 - type: manhattan_spearman value: 85.30435335611396 - task: type: Classification dataset: type: mteb/banking77 name: MTEB Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 87.89935064935065 - type: f1 value: 87.87886687103833 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-p2p name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: - type: v_measure value: 38.84335510371379 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-s2s name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 36.377963093857005 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 32.557 - type: map_at_10 value: 44.501000000000005 - type: map_at_100 value: 46.11 - type: map_at_1000 value: 46.232 - type: map_at_3 value: 40.711000000000006 - type: map_at_5 value: 42.937 - type: mrr_at_1 value: 40.916000000000004 - type: mrr_at_10 value: 51.317 - type: mrr_at_100 value: 52.003 - type: mrr_at_1000 value: 52.044999999999995 - type: mrr_at_3 value: 48.569 - type: mrr_at_5 value: 50.322 - type: ndcg_at_1 value: 40.916000000000004 - type: ndcg_at_10 value: 51.353 - type: ndcg_at_100 value: 56.762 - type: ndcg_at_1000 value: 58.555 - type: ndcg_at_3 value: 46.064 - type: ndcg_at_5 value: 48.677 - type: precision_at_1 value: 40.916000000000004 - type: precision_at_10 value: 9.927999999999999 - type: precision_at_100 value: 1.592 - type: precision_at_1000 value: 0.20600000000000002 - type: precision_at_3 value: 22.078999999999997 - type: precision_at_5 value: 16.08 - type: recall_at_1 value: 32.557 - type: recall_at_10 value: 63.942 - type: recall_at_100 value: 86.436 - type: recall_at_1000 value: 97.547 - type: recall_at_3 value: 48.367 - type: recall_at_5 value: 55.818 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 32.106 - type: map_at_10 value: 42.55 - type: map_at_100 value: 43.818 - type: map_at_1000 value: 43.952999999999996 - type: map_at_3 value: 39.421 - type: map_at_5 value: 41.276 - type: mrr_at_1 value: 39.936 - type: mrr_at_10 value: 48.484 - type: mrr_at_100 value: 49.123 - type: mrr_at_1000 value: 49.163000000000004 - type: mrr_at_3 value: 46.221000000000004 - type: mrr_at_5 value: 47.603 - type: ndcg_at_1 value: 39.936 - type: ndcg_at_10 value: 48.25 - type: ndcg_at_100 value: 52.674 - type: ndcg_at_1000 value: 54.638 - type: ndcg_at_3 value: 44.05 - type: ndcg_at_5 value: 46.125 - type: precision_at_1 value: 39.936 - type: precision_at_10 value: 9.096 - type: precision_at_100 value: 1.473 - type: precision_at_1000 value: 0.19499999999999998 - type: precision_at_3 value: 21.295 - type: precision_at_5 value: 15.121 - type: recall_at_1 value: 32.106 - type: recall_at_10 value: 58.107 - type: recall_at_100 value: 76.873 - type: recall_at_1000 value: 89.079 - type: recall_at_3 value: 45.505 - type: recall_at_5 value: 51.479 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 41.513 - type: map_at_10 value: 54.571999999999996 - type: 
map_at_100 value: 55.579 - type: map_at_1000 value: 55.626 - type: map_at_3 value: 51.127 - type: map_at_5 value: 53.151 - type: mrr_at_1 value: 47.398 - type: mrr_at_10 value: 57.82000000000001 - type: mrr_at_100 value: 58.457 - type: mrr_at_1000 value: 58.479000000000006 - type: mrr_at_3 value: 55.32899999999999 - type: mrr_at_5 value: 56.89999999999999 - type: ndcg_at_1 value: 47.398 - type: ndcg_at_10 value: 60.599000000000004 - type: ndcg_at_100 value: 64.366 - type: ndcg_at_1000 value: 65.333 - type: ndcg_at_3 value: 54.98 - type: ndcg_at_5 value: 57.874 - type: precision_at_1 value: 47.398 - type: precision_at_10 value: 9.806 - type: precision_at_100 value: 1.2590000000000001 - type: precision_at_1000 value: 0.13799999999999998 - type: precision_at_3 value: 24.619 - type: precision_at_5 value: 16.878 - type: recall_at_1 value: 41.513 - type: recall_at_10 value: 74.91799999999999 - type: recall_at_100 value: 90.96 - type: recall_at_1000 value: 97.923 - type: recall_at_3 value: 60.013000000000005 - type: recall_at_5 value: 67.245 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.319 - type: map_at_10 value: 35.766999999999996 - type: map_at_100 value: 36.765 - type: map_at_1000 value: 36.829 - type: map_at_3 value: 32.888 - type: map_at_5 value: 34.538999999999994 - type: mrr_at_1 value: 28.249000000000002 - type: mrr_at_10 value: 37.766 - type: mrr_at_100 value: 38.62 - type: mrr_at_1000 value: 38.667 - type: mrr_at_3 value: 35.009 - type: mrr_at_5 value: 36.608000000000004 - type: ndcg_at_1 value: 28.249000000000002 - type: ndcg_at_10 value: 41.215 - type: ndcg_at_100 value: 46.274 - type: ndcg_at_1000 value: 48.007 - type: ndcg_at_3 value: 35.557 - type: ndcg_at_5 value: 38.344 - type: precision_at_1 value: 28.249000000000002 - type: precision_at_10 value: 6.429 - type: precision_at_100 value: 0.9480000000000001 - type: precision_at_1000 value: 0.11399999999999999 - type: precision_at_3 value: 15.179 - type: precision_at_5 value: 10.734 - type: recall_at_1 value: 26.319 - type: recall_at_10 value: 56.157999999999994 - type: recall_at_100 value: 79.65 - type: recall_at_1000 value: 92.73 - type: recall_at_3 value: 40.738 - type: recall_at_5 value: 47.418 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 18.485 - type: map_at_10 value: 27.400999999999996 - type: map_at_100 value: 28.665000000000003 - type: map_at_1000 value: 28.79 - type: map_at_3 value: 24.634 - type: map_at_5 value: 26.313 - type: mrr_at_1 value: 23.134 - type: mrr_at_10 value: 32.332 - type: mrr_at_100 value: 33.318 - type: mrr_at_1000 value: 33.384 - type: mrr_at_3 value: 29.664 - type: mrr_at_5 value: 31.262 - type: ndcg_at_1 value: 23.134 - type: ndcg_at_10 value: 33.016 - type: ndcg_at_100 value: 38.763 - type: ndcg_at_1000 value: 41.619 - type: ndcg_at_3 value: 28.017999999999997 - type: ndcg_at_5 value: 30.576999999999998 - type: precision_at_1 value: 23.134 - type: precision_at_10 value: 6.069999999999999 - type: precision_at_100 value: 1.027 - type: precision_at_1000 value: 0.14200000000000002 - type: precision_at_3 value: 13.599 - type: precision_at_5 value: 9.975000000000001 - type: recall_at_1 value: 18.485 - type: recall_at_10 value: 45.39 - type: recall_at_100 value: 69.876 - type: recall_at_1000 value: 90.023 - type: recall_at_3 value: 31.587 - type: 
recall_at_5 value: 38.164 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 30.676 - type: map_at_10 value: 41.785 - type: map_at_100 value: 43.169000000000004 - type: map_at_1000 value: 43.272 - type: map_at_3 value: 38.462 - type: map_at_5 value: 40.32 - type: mrr_at_1 value: 37.729 - type: mrr_at_10 value: 47.433 - type: mrr_at_100 value: 48.303000000000004 - type: mrr_at_1000 value: 48.337 - type: mrr_at_3 value: 45.011 - type: mrr_at_5 value: 46.455 - type: ndcg_at_1 value: 37.729 - type: ndcg_at_10 value: 47.921 - type: ndcg_at_100 value: 53.477 - type: ndcg_at_1000 value: 55.300000000000004 - type: ndcg_at_3 value: 42.695 - type: ndcg_at_5 value: 45.175 - type: precision_at_1 value: 37.729 - type: precision_at_10 value: 8.652999999999999 - type: precision_at_100 value: 1.336 - type: precision_at_1000 value: 0.168 - type: precision_at_3 value: 20.18 - type: precision_at_5 value: 14.302000000000001 - type: recall_at_1 value: 30.676 - type: recall_at_10 value: 60.441 - type: recall_at_100 value: 83.37 - type: recall_at_1000 value: 95.092 - type: recall_at_3 value: 45.964 - type: recall_at_5 value: 52.319 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.978 - type: map_at_10 value: 35.926 - type: map_at_100 value: 37.341 - type: map_at_1000 value: 37.445 - type: map_at_3 value: 32.748 - type: map_at_5 value: 34.207 - type: mrr_at_1 value: 31.163999999999998 - type: mrr_at_10 value: 41.394 - type: mrr_at_100 value: 42.321 - type: mrr_at_1000 value: 42.368 - type: mrr_at_3 value: 38.964999999999996 - type: mrr_at_5 value: 40.135 - type: ndcg_at_1 value: 31.163999999999998 - type: ndcg_at_10 value: 42.191 - type: ndcg_at_100 value: 48.083999999999996 - type: ndcg_at_1000 value: 50.21 - type: ndcg_at_3 value: 36.979 - type: ndcg_at_5 value: 38.823 - type: precision_at_1 value: 31.163999999999998 - type: precision_at_10 value: 7.968 - type: precision_at_100 value: 1.2550000000000001 - type: precision_at_1000 value: 0.16199999999999998 - type: precision_at_3 value: 18.075 - type: precision_at_5 value: 12.626000000000001 - type: recall_at_1 value: 24.978 - type: recall_at_10 value: 55.410000000000004 - type: recall_at_100 value: 80.562 - type: recall_at_1000 value: 94.77600000000001 - type: recall_at_3 value: 40.359 - type: recall_at_5 value: 45.577 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.812166666666666 - type: map_at_10 value: 36.706916666666665 - type: map_at_100 value: 37.94016666666666 - type: map_at_1000 value: 38.05358333333333 - type: map_at_3 value: 33.72408333333334 - type: map_at_5 value: 35.36508333333333 - type: mrr_at_1 value: 31.91516666666667 - type: mrr_at_10 value: 41.09716666666666 - type: mrr_at_100 value: 41.931916666666666 - type: mrr_at_1000 value: 41.98458333333333 - type: mrr_at_3 value: 38.60183333333333 - type: mrr_at_5 value: 40.031916666666675 - type: ndcg_at_1 value: 31.91516666666667 - type: ndcg_at_10 value: 42.38725 - type: ndcg_at_100 value: 47.56291666666667 - type: ndcg_at_1000 value: 49.716499999999996 - type: ndcg_at_3 value: 37.36491666666667 - type: ndcg_at_5 value: 39.692166666666665 - type: precision_at_1 value: 31.91516666666667 - type: precision_at_10 value: 7.476749999999999 - 
type: precision_at_100 value: 1.1869166666666668 - type: precision_at_1000 value: 0.157 - type: precision_at_3 value: 17.275249999999996 - type: precision_at_5 value: 12.25825 - type: recall_at_1 value: 26.812166666666666 - type: recall_at_10 value: 54.82933333333333 - type: recall_at_100 value: 77.36508333333333 - type: recall_at_1000 value: 92.13366666666667 - type: recall_at_3 value: 40.83508333333334 - type: recall_at_5 value: 46.85083333333334 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 25.352999999999998 - type: map_at_10 value: 33.025999999999996 - type: map_at_100 value: 33.882 - type: map_at_1000 value: 33.983999999999995 - type: map_at_3 value: 30.995 - type: map_at_5 value: 32.113 - type: mrr_at_1 value: 28.834 - type: mrr_at_10 value: 36.14 - type: mrr_at_100 value: 36.815 - type: mrr_at_1000 value: 36.893 - type: mrr_at_3 value: 34.305 - type: mrr_at_5 value: 35.263 - type: ndcg_at_1 value: 28.834 - type: ndcg_at_10 value: 37.26 - type: ndcg_at_100 value: 41.723 - type: ndcg_at_1000 value: 44.314 - type: ndcg_at_3 value: 33.584 - type: ndcg_at_5 value: 35.302 - type: precision_at_1 value: 28.834 - type: precision_at_10 value: 5.736 - type: precision_at_100 value: 0.876 - type: precision_at_1000 value: 0.117 - type: precision_at_3 value: 14.468 - type: precision_at_5 value: 9.847 - type: recall_at_1 value: 25.352999999999998 - type: recall_at_10 value: 47.155 - type: recall_at_100 value: 68.024 - type: recall_at_1000 value: 87.26899999999999 - type: recall_at_3 value: 37.074 - type: recall_at_5 value: 41.352 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 17.845 - type: map_at_10 value: 25.556 - type: map_at_100 value: 26.787 - type: map_at_1000 value: 26.913999999999998 - type: map_at_3 value: 23.075000000000003 - type: map_at_5 value: 24.308 - type: mrr_at_1 value: 21.714 - type: mrr_at_10 value: 29.543999999999997 - type: mrr_at_100 value: 30.543 - type: mrr_at_1000 value: 30.618000000000002 - type: mrr_at_3 value: 27.174 - type: mrr_at_5 value: 28.409000000000002 - type: ndcg_at_1 value: 21.714 - type: ndcg_at_10 value: 30.562 - type: ndcg_at_100 value: 36.27 - type: ndcg_at_1000 value: 39.033 - type: ndcg_at_3 value: 26.006 - type: ndcg_at_5 value: 27.843 - type: precision_at_1 value: 21.714 - type: precision_at_10 value: 5.657 - type: precision_at_100 value: 1 - type: precision_at_1000 value: 0.14100000000000001 - type: precision_at_3 value: 12.4 - type: precision_at_5 value: 8.863999999999999 - type: recall_at_1 value: 17.845 - type: recall_at_10 value: 41.72 - type: recall_at_100 value: 67.06400000000001 - type: recall_at_1000 value: 86.515 - type: recall_at_3 value: 28.78 - type: recall_at_5 value: 33.629999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 26.695 - type: map_at_10 value: 36.205999999999996 - type: map_at_100 value: 37.346000000000004 - type: map_at_1000 value: 37.447 - type: map_at_3 value: 32.84 - type: map_at_5 value: 34.733000000000004 - type: mrr_at_1 value: 31.343 - type: mrr_at_10 value: 40.335 - type: mrr_at_100 value: 41.162 - type: mrr_at_1000 value: 41.221000000000004 - type: mrr_at_3 value: 37.329 - type: mrr_at_5 value: 39.068999999999996 - type: ndcg_at_1 value: 31.343 - 
type: ndcg_at_10 value: 41.996 - type: ndcg_at_100 value: 47.096 - type: ndcg_at_1000 value: 49.4 - type: ndcg_at_3 value: 35.902 - type: ndcg_at_5 value: 38.848 - type: precision_at_1 value: 31.343 - type: precision_at_10 value: 7.146 - type: precision_at_100 value: 1.098 - type: precision_at_1000 value: 0.14100000000000001 - type: precision_at_3 value: 16.014 - type: precision_at_5 value: 11.735 - type: recall_at_1 value: 26.695 - type: recall_at_10 value: 55.525000000000006 - type: recall_at_100 value: 77.376 - type: recall_at_1000 value: 93.476 - type: recall_at_3 value: 39.439 - type: recall_at_5 value: 46.501 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 24.196 - type: map_at_10 value: 33.516 - type: map_at_100 value: 35.202 - type: map_at_1000 value: 35.426 - type: map_at_3 value: 30.561 - type: map_at_5 value: 31.961000000000002 - type: mrr_at_1 value: 29.644 - type: mrr_at_10 value: 38.769 - type: mrr_at_100 value: 39.843 - type: mrr_at_1000 value: 39.888 - type: mrr_at_3 value: 36.132999999999996 - type: mrr_at_5 value: 37.467 - type: ndcg_at_1 value: 29.644 - type: ndcg_at_10 value: 39.584 - type: ndcg_at_100 value: 45.964 - type: ndcg_at_1000 value: 48.27 - type: ndcg_at_3 value: 34.577999999999996 - type: ndcg_at_5 value: 36.498000000000005 - type: precision_at_1 value: 29.644 - type: precision_at_10 value: 7.668 - type: precision_at_100 value: 1.545 - type: precision_at_1000 value: 0.242 - type: precision_at_3 value: 16.271 - type: precision_at_5 value: 11.620999999999999 - type: recall_at_1 value: 24.196 - type: recall_at_10 value: 51.171 - type: recall_at_100 value: 79.212 - type: recall_at_1000 value: 92.976 - type: recall_at_3 value: 36.797999999999995 - type: recall_at_5 value: 42.006 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 21.023 - type: map_at_10 value: 29.677 - type: map_at_100 value: 30.618000000000002 - type: map_at_1000 value: 30.725 - type: map_at_3 value: 27.227 - type: map_at_5 value: 28.523 - type: mrr_at_1 value: 22.921 - type: mrr_at_10 value: 31.832 - type: mrr_at_100 value: 32.675 - type: mrr_at_1000 value: 32.751999999999995 - type: mrr_at_3 value: 29.513 - type: mrr_at_5 value: 30.89 - type: ndcg_at_1 value: 22.921 - type: ndcg_at_10 value: 34.699999999999996 - type: ndcg_at_100 value: 39.302 - type: ndcg_at_1000 value: 41.919000000000004 - type: ndcg_at_3 value: 29.965999999999998 - type: ndcg_at_5 value: 32.22 - type: precision_at_1 value: 22.921 - type: precision_at_10 value: 5.564 - type: precision_at_100 value: 0.8340000000000001 - type: precision_at_1000 value: 0.11800000000000001 - type: precision_at_3 value: 13.123999999999999 - type: precision_at_5 value: 9.316 - type: recall_at_1 value: 21.023 - type: recall_at_10 value: 48.015 - type: recall_at_100 value: 68.978 - type: recall_at_1000 value: 88.198 - type: recall_at_3 value: 35.397 - type: recall_at_5 value: 40.701 - task: type: Retrieval dataset: type: climate-fever name: MTEB ClimateFEVER config: default split: test revision: None metrics: - type: map_at_1 value: 11.198 - type: map_at_10 value: 19.336000000000002 - type: map_at_100 value: 21.382 - type: map_at_1000 value: 21.581 - type: map_at_3 value: 15.992 - type: map_at_5 value: 17.613 - type: mrr_at_1 value: 25.080999999999996 - type: mrr_at_10 value: 36.032 - type: mrr_at_100 
value: 37.1 - type: mrr_at_1000 value: 37.145 - type: mrr_at_3 value: 32.595 - type: mrr_at_5 value: 34.553 - type: ndcg_at_1 value: 25.080999999999996 - type: ndcg_at_10 value: 27.290999999999997 - type: ndcg_at_100 value: 35.31 - type: ndcg_at_1000 value: 38.885 - type: ndcg_at_3 value: 21.895999999999997 - type: ndcg_at_5 value: 23.669999999999998 - type: precision_at_1 value: 25.080999999999996 - type: precision_at_10 value: 8.645 - type: precision_at_100 value: 1.7209999999999999 - type: precision_at_1000 value: 0.23900000000000002 - type: precision_at_3 value: 16.287 - type: precision_at_5 value: 12.625 - type: recall_at_1 value: 11.198 - type: recall_at_10 value: 33.355000000000004 - type: recall_at_100 value: 60.912 - type: recall_at_1000 value: 80.89 - type: recall_at_3 value: 20.055 - type: recall_at_5 value: 25.14 - task: type: Retrieval dataset: type: dbpedia-entity name: MTEB DBPedia config: default split: test revision: None metrics: - type: map_at_1 value: 9.228 - type: map_at_10 value: 20.018 - type: map_at_100 value: 28.388999999999996 - type: map_at_1000 value: 30.073 - type: map_at_3 value: 14.366999999999999 - type: map_at_5 value: 16.705000000000002 - type: mrr_at_1 value: 69 - type: mrr_at_10 value: 77.058 - type: mrr_at_100 value: 77.374 - type: mrr_at_1000 value: 77.384 - type: mrr_at_3 value: 75.708 - type: mrr_at_5 value: 76.608 - type: ndcg_at_1 value: 57.49999999999999 - type: ndcg_at_10 value: 41.792 - type: ndcg_at_100 value: 47.374 - type: ndcg_at_1000 value: 55.13 - type: ndcg_at_3 value: 46.353 - type: ndcg_at_5 value: 43.702000000000005 - type: precision_at_1 value: 69 - type: precision_at_10 value: 32.85 - type: precision_at_100 value: 10.708 - type: precision_at_1000 value: 2.024 - type: precision_at_3 value: 49.5 - type: precision_at_5 value: 42.05 - type: recall_at_1 value: 9.228 - type: recall_at_10 value: 25.635 - type: recall_at_100 value: 54.894 - type: recall_at_1000 value: 79.38 - type: recall_at_3 value: 15.68 - type: recall_at_5 value: 19.142 - task: type: Classification dataset: type: mteb/emotion name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 52.035 - type: f1 value: 46.85325505614071 - task: type: Retrieval dataset: type: fever name: MTEB FEVER config: default split: test revision: None metrics: - type: map_at_1 value: 70.132 - type: map_at_10 value: 79.527 - type: map_at_100 value: 79.81200000000001 - type: map_at_1000 value: 79.828 - type: map_at_3 value: 78.191 - type: map_at_5 value: 79.092 - type: mrr_at_1 value: 75.563 - type: mrr_at_10 value: 83.80199999999999 - type: mrr_at_100 value: 83.93 - type: mrr_at_1000 value: 83.933 - type: mrr_at_3 value: 82.818 - type: mrr_at_5 value: 83.505 - type: ndcg_at_1 value: 75.563 - type: ndcg_at_10 value: 83.692 - type: ndcg_at_100 value: 84.706 - type: ndcg_at_1000 value: 85.001 - type: ndcg_at_3 value: 81.51 - type: ndcg_at_5 value: 82.832 - type: precision_at_1 value: 75.563 - type: precision_at_10 value: 10.245 - type: precision_at_100 value: 1.0959999999999999 - type: precision_at_1000 value: 0.11399999999999999 - type: precision_at_3 value: 31.518 - type: precision_at_5 value: 19.772000000000002 - type: recall_at_1 value: 70.132 - type: recall_at_10 value: 92.204 - type: recall_at_100 value: 96.261 - type: recall_at_1000 value: 98.17399999999999 - type: recall_at_3 value: 86.288 - type: recall_at_5 value: 89.63799999999999 - task: type: Retrieval dataset: type: fiqa name: MTEB FiQA2018 config: 
default split: test revision: None metrics: - type: map_at_1 value: 22.269 - type: map_at_10 value: 36.042 - type: map_at_100 value: 37.988 - type: map_at_1000 value: 38.162 - type: map_at_3 value: 31.691000000000003 - type: map_at_5 value: 33.988 - type: mrr_at_1 value: 44.907000000000004 - type: mrr_at_10 value: 53.348 - type: mrr_at_100 value: 54.033 - type: mrr_at_1000 value: 54.064 - type: mrr_at_3 value: 50.977 - type: mrr_at_5 value: 52.112 - type: ndcg_at_1 value: 44.907000000000004 - type: ndcg_at_10 value: 44.302 - type: ndcg_at_100 value: 51.054 - type: ndcg_at_1000 value: 53.822 - type: ndcg_at_3 value: 40.615 - type: ndcg_at_5 value: 41.455999999999996 - type: precision_at_1 value: 44.907000000000004 - type: precision_at_10 value: 12.176 - type: precision_at_100 value: 1.931 - type: precision_at_1000 value: 0.243 - type: precision_at_3 value: 27.16 - type: precision_at_5 value: 19.567999999999998 - type: recall_at_1 value: 22.269 - type: recall_at_10 value: 51.188 - type: recall_at_100 value: 75.924 - type: recall_at_1000 value: 92.525 - type: recall_at_3 value: 36.643 - type: recall_at_5 value: 42.27 - task: type: Retrieval dataset: type: hotpotqa name: MTEB HotpotQA config: default split: test revision: None metrics: - type: map_at_1 value: 40.412 - type: map_at_10 value: 66.376 - type: map_at_100 value: 67.217 - type: map_at_1000 value: 67.271 - type: map_at_3 value: 62.741 - type: map_at_5 value: 65.069 - type: mrr_at_1 value: 80.824 - type: mrr_at_10 value: 86.53 - type: mrr_at_100 value: 86.67399999999999 - type: mrr_at_1000 value: 86.678 - type: mrr_at_3 value: 85.676 - type: mrr_at_5 value: 86.256 - type: ndcg_at_1 value: 80.824 - type: ndcg_at_10 value: 74.332 - type: ndcg_at_100 value: 77.154 - type: ndcg_at_1000 value: 78.12400000000001 - type: ndcg_at_3 value: 69.353 - type: ndcg_at_5 value: 72.234 - type: precision_at_1 value: 80.824 - type: precision_at_10 value: 15.652 - type: precision_at_100 value: 1.7840000000000003 - type: precision_at_1000 value: 0.191 - type: precision_at_3 value: 44.911 - type: precision_at_5 value: 29.221000000000004 - type: recall_at_1 value: 40.412 - type: recall_at_10 value: 78.25800000000001 - type: recall_at_100 value: 89.196 - type: recall_at_1000 value: 95.544 - type: recall_at_3 value: 67.367 - type: recall_at_5 value: 73.05199999999999 - task: type: Classification dataset: type: mteb/imdb name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 92.78880000000001 - type: ap value: 89.39251741048801 - type: f1 value: 92.78019950076781 - task: type: Retrieval dataset: type: msmarco name: MTEB MSMARCO config: default split: dev revision: None metrics: - type: map_at_1 value: 22.888 - type: map_at_10 value: 35.146 - type: map_at_100 value: 36.325 - type: map_at_1000 value: 36.372 - type: map_at_3 value: 31.3 - type: map_at_5 value: 33.533 - type: mrr_at_1 value: 23.480999999999998 - type: mrr_at_10 value: 35.777 - type: mrr_at_100 value: 36.887 - type: mrr_at_1000 value: 36.928 - type: mrr_at_3 value: 31.989 - type: mrr_at_5 value: 34.202 - type: ndcg_at_1 value: 23.496 - type: ndcg_at_10 value: 42.028999999999996 - type: ndcg_at_100 value: 47.629 - type: ndcg_at_1000 value: 48.785000000000004 - type: ndcg_at_3 value: 34.227000000000004 - type: ndcg_at_5 value: 38.207 - type: precision_at_1 value: 23.496 - type: precision_at_10 value: 6.596 - type: precision_at_100 value: 0.9400000000000001 - type: precision_at_1000 value: 0.104 - type: 
precision_at_3 value: 14.513000000000002 - type: precision_at_5 value: 10.711 - type: recall_at_1 value: 22.888 - type: recall_at_10 value: 63.129999999999995 - type: recall_at_100 value: 88.90299999999999 - type: recall_at_1000 value: 97.69 - type: recall_at_3 value: 42.014 - type: recall_at_5 value: 51.554 - task: type: Classification dataset: type: mteb/mtop_domain name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 94.59188326493388 - type: f1 value: 94.36568950290486 - task: type: Classification dataset: type: mteb/mtop_intent name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 79.25672594619242 - type: f1 value: 59.52405059722216 - task: type: Classification dataset: type: mteb/amazon_massive_intent name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 77.4142568930733 - type: f1 value: 75.23044196543388 - task: type: Classification dataset: type: mteb/amazon_massive_scenario name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 80.44720914593141 - type: f1 value: 80.41049641537015 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-p2p name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 31.960921474993775 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-s2s name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 30.88042240204361 - task: type: Reranking dataset: type: mteb/mind_small name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 32.27071371606404 - type: mrr value: 33.541450459533856 - task: type: Retrieval dataset: type: nfcorpus name: MTEB NFCorpus config: default split: test revision: None metrics: - type: map_at_1 value: 6.551 - type: map_at_10 value: 14.359 - type: map_at_100 value: 18.157 - type: map_at_1000 value: 19.659 - type: map_at_3 value: 10.613999999999999 - type: map_at_5 value: 12.296 - type: mrr_at_1 value: 47.368 - type: mrr_at_10 value: 56.689 - type: mrr_at_100 value: 57.24399999999999 - type: mrr_at_1000 value: 57.284 - type: mrr_at_3 value: 54.489 - type: mrr_at_5 value: 55.928999999999995 - type: ndcg_at_1 value: 45.511 - type: ndcg_at_10 value: 36.911 - type: ndcg_at_100 value: 34.241 - type: ndcg_at_1000 value: 43.064 - type: ndcg_at_3 value: 42.348 - type: ndcg_at_5 value: 39.884 - type: precision_at_1 value: 46.749 - type: precision_at_10 value: 27.028000000000002 - type: precision_at_100 value: 8.52 - type: precision_at_1000 value: 2.154 - type: precision_at_3 value: 39.525 - type: precision_at_5 value: 34.18 - type: recall_at_1 value: 6.551 - type: recall_at_10 value: 18.602 - type: recall_at_100 value: 34.882999999999996 - type: recall_at_1000 value: 66.049 - type: recall_at_3 value: 11.872 - type: recall_at_5 value: 14.74 - task: type: Retrieval dataset: type: nq name: MTEB NQ config: default split: test revision: None metrics: - type: map_at_1 value: 27.828999999999997 - type: map_at_10 value: 43.606 - type: map_at_100 value: 44.656 - type: map_at_1000 value: 
44.690000000000005 - type: map_at_3 value: 39.015 - type: map_at_5 value: 41.625 - type: mrr_at_1 value: 31.518 - type: mrr_at_10 value: 46.047 - type: mrr_at_100 value: 46.846 - type: mrr_at_1000 value: 46.867999999999995 - type: mrr_at_3 value: 42.154 - type: mrr_at_5 value: 44.468999999999994 - type: ndcg_at_1 value: 31.518 - type: ndcg_at_10 value: 51.768 - type: ndcg_at_100 value: 56.184999999999995 - type: ndcg_at_1000 value: 56.92 - type: ndcg_at_3 value: 43.059999999999995 - type: ndcg_at_5 value: 47.481 - type: precision_at_1 value: 31.518 - type: precision_at_10 value: 8.824 - type: precision_at_100 value: 1.131 - type: precision_at_1000 value: 0.12 - type: precision_at_3 value: 19.969 - type: precision_at_5 value: 14.502 - type: recall_at_1 value: 27.828999999999997 - type: recall_at_10 value: 74.244 - type: recall_at_100 value: 93.325 - type: recall_at_1000 value: 98.71799999999999 - type: recall_at_3 value: 51.601 - type: recall_at_5 value: 61.841 - task: type: Retrieval dataset: type: quora name: MTEB QuoraRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 71.54 - type: map_at_10 value: 85.509 - type: map_at_100 value: 86.137 - type: map_at_1000 value: 86.151 - type: map_at_3 value: 82.624 - type: map_at_5 value: 84.425 - type: mrr_at_1 value: 82.45 - type: mrr_at_10 value: 88.344 - type: mrr_at_100 value: 88.437 - type: mrr_at_1000 value: 88.437 - type: mrr_at_3 value: 87.417 - type: mrr_at_5 value: 88.066 - type: ndcg_at_1 value: 82.45 - type: ndcg_at_10 value: 89.092 - type: ndcg_at_100 value: 90.252 - type: ndcg_at_1000 value: 90.321 - type: ndcg_at_3 value: 86.404 - type: ndcg_at_5 value: 87.883 - type: precision_at_1 value: 82.45 - type: precision_at_10 value: 13.496 - type: precision_at_100 value: 1.536 - type: precision_at_1000 value: 0.157 - type: precision_at_3 value: 37.833 - type: precision_at_5 value: 24.79 - type: recall_at_1 value: 71.54 - type: recall_at_10 value: 95.846 - type: recall_at_100 value: 99.715 - type: recall_at_1000 value: 99.979 - type: recall_at_3 value: 88.01299999999999 - type: recall_at_5 value: 92.32000000000001 - task: type: Clustering dataset: type: mteb/reddit-clustering name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 57.60557586253866 - task: type: Clustering dataset: type: mteb/reddit-clustering-p2p name: MTEB RedditClusteringP2P config: default split: test revision: 282350215ef01743dc01b456c7f5241fa8937f16 metrics: - type: v_measure value: 64.0287172242051 - task: type: Retrieval dataset: type: scidocs name: MTEB SCIDOCS config: default split: test revision: None metrics: - type: map_at_1 value: 3.9849999999999994 - type: map_at_10 value: 11.397 - type: map_at_100 value: 13.985 - type: map_at_1000 value: 14.391000000000002 - type: map_at_3 value: 7.66 - type: map_at_5 value: 9.46 - type: mrr_at_1 value: 19.8 - type: mrr_at_10 value: 31.958 - type: mrr_at_100 value: 33.373999999999995 - type: mrr_at_1000 value: 33.411 - type: mrr_at_3 value: 28.316999999999997 - type: mrr_at_5 value: 30.297 - type: ndcg_at_1 value: 19.8 - type: ndcg_at_10 value: 19.580000000000002 - type: ndcg_at_100 value: 29.555999999999997 - type: ndcg_at_1000 value: 35.882 - type: ndcg_at_3 value: 17.544 - type: ndcg_at_5 value: 15.815999999999999 - type: precision_at_1 value: 19.8 - type: precision_at_10 value: 10.61 - type: precision_at_100 value: 2.501 - type: precision_at_1000 value: 0.40099999999999997 - type: precision_at_3 value: 
16.900000000000002 - type: precision_at_5 value: 14.44 - type: recall_at_1 value: 3.9849999999999994 - type: recall_at_10 value: 21.497 - type: recall_at_100 value: 50.727999999999994 - type: recall_at_1000 value: 81.27499999999999 - type: recall_at_3 value: 10.263 - type: recall_at_5 value: 14.643 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 85.0087509585503 - type: cos_sim_spearman value: 81.74697270664319 - type: euclidean_pearson value: 81.80424382731947 - type: euclidean_spearman value: 81.29794251968431 - type: manhattan_pearson value: 81.81524666226125 - type: manhattan_spearman value: 81.29475370198963 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 86.44442736429552 - type: cos_sim_spearman value: 78.51011398910948 - type: euclidean_pearson value: 83.36181801196723 - type: euclidean_spearman value: 79.47272621331535 - type: manhattan_pearson value: 83.3660113483837 - type: manhattan_spearman value: 79.47695922566032 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 85.82923943323635 - type: cos_sim_spearman value: 86.62037823380983 - type: euclidean_pearson value: 83.56369548403958 - type: euclidean_spearman value: 84.2176755481191 - type: manhattan_pearson value: 83.55460702084464 - type: manhattan_spearman value: 84.18617930921467 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 84.09071068110103 - type: cos_sim_spearman value: 83.05697553913335 - type: euclidean_pearson value: 81.1377457216497 - type: euclidean_spearman value: 81.74714169016676 - type: manhattan_pearson value: 81.0893424142723 - type: manhattan_spearman value: 81.7058918219677 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 87.61132157220429 - type: cos_sim_spearman value: 88.38581627185445 - type: euclidean_pearson value: 86.14904510913374 - type: euclidean_spearman value: 86.5452758925542 - type: manhattan_pearson value: 86.1484025377679 - type: manhattan_spearman value: 86.55483841566252 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 85.46195145161064 - type: cos_sim_spearman value: 86.82409112251158 - type: euclidean_pearson value: 84.75479672288957 - type: euclidean_spearman value: 85.41144307151548 - type: manhattan_pearson value: 84.70914329694165 - type: manhattan_spearman value: 85.38477943384089 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 88.06351289930238 - type: cos_sim_spearman value: 87.90311138579116 - type: euclidean_pearson value: 86.17651467063077 - type: euclidean_spearman value: 84.89447802019073 - type: manhattan_pearson value: 86.3267677479595 - type: manhattan_spearman value: 85.00472295103874 - task: type: STS dataset: type: 
mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 67.78311975978767 - type: cos_sim_spearman value: 66.76465685245887 - type: euclidean_pearson value: 67.21687806595443 - type: euclidean_spearman value: 65.05776733534435 - type: manhattan_pearson value: 67.14008143635883 - type: manhattan_spearman value: 65.25247076149701 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 86.7403488889418 - type: cos_sim_spearman value: 87.76870289783061 - type: euclidean_pearson value: 84.83171077794671 - type: euclidean_spearman value: 85.50579695091902 - type: manhattan_pearson value: 84.83074260180555 - type: manhattan_spearman value: 85.47589026938667 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 87.56234016237356 - type: mrr value: 96.26124238869338 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 59.660999999999994 - type: map_at_10 value: 69.105 - type: map_at_100 value: 69.78 - type: map_at_1000 value: 69.80199999999999 - type: map_at_3 value: 65.991 - type: map_at_5 value: 68.02 - type: mrr_at_1 value: 62.666999999999994 - type: mrr_at_10 value: 70.259 - type: mrr_at_100 value: 70.776 - type: mrr_at_1000 value: 70.796 - type: mrr_at_3 value: 67.889 - type: mrr_at_5 value: 69.52199999999999 - type: ndcg_at_1 value: 62.666999999999994 - type: ndcg_at_10 value: 73.425 - type: ndcg_at_100 value: 75.955 - type: ndcg_at_1000 value: 76.459 - type: ndcg_at_3 value: 68.345 - type: ndcg_at_5 value: 71.319 - type: precision_at_1 value: 62.666999999999994 - type: precision_at_10 value: 9.667 - type: precision_at_100 value: 1.09 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_3 value: 26.333000000000002 - type: precision_at_5 value: 17.732999999999997 - type: recall_at_1 value: 59.660999999999994 - type: recall_at_10 value: 85.422 - type: recall_at_100 value: 96.167 - type: recall_at_1000 value: 100 - type: recall_at_3 value: 72.044 - type: recall_at_5 value: 79.428 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.86435643564356 - type: cos_sim_ap value: 96.83057412333741 - type: cos_sim_f1 value: 93.04215337734891 - type: cos_sim_precision value: 94.53044375644994 - type: cos_sim_recall value: 91.60000000000001 - type: dot_accuracy value: 99.7910891089109 - type: dot_ap value: 94.10681982106397 - type: dot_f1 value: 89.34881373043918 - type: dot_precision value: 90.21406727828746 - type: dot_recall value: 88.5 - type: euclidean_accuracy value: 99.85544554455446 - type: euclidean_ap value: 96.78545104478602 - type: euclidean_f1 value: 92.65143992055613 - type: euclidean_precision value: 92.01183431952663 - type: euclidean_recall value: 93.30000000000001 - type: manhattan_accuracy value: 99.85841584158416 - type: manhattan_ap value: 96.80748903307823 - type: manhattan_f1 value: 92.78247884519662 - type: manhattan_precision value: 92.36868186323092 - type: manhattan_recall value: 93.2 - type: 
max_accuracy value: 99.86435643564356 - type: max_ap value: 96.83057412333741 - type: max_f1 value: 93.04215337734891 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 65.53971025855282 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 33.97791591490788 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 55.852215301355066 - type: mrr value: 56.85527809608691 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 31.21442519856758 - type: cos_sim_spearman value: 30.822536216936825 - type: dot_pearson value: 28.661325528121807 - type: dot_spearman value: 28.1435226478879 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.183 - type: map_at_10 value: 1.526 - type: map_at_100 value: 7.915 - type: map_at_1000 value: 19.009 - type: map_at_3 value: 0.541 - type: map_at_5 value: 0.8659999999999999 - type: mrr_at_1 value: 68 - type: mrr_at_10 value: 81.186 - type: mrr_at_100 value: 81.186 - type: mrr_at_1000 value: 81.186 - type: mrr_at_3 value: 80 - type: mrr_at_5 value: 80.9 - type: ndcg_at_1 value: 64 - type: ndcg_at_10 value: 64.13799999999999 - type: ndcg_at_100 value: 47.632000000000005 - type: ndcg_at_1000 value: 43.037 - type: ndcg_at_3 value: 67.542 - type: ndcg_at_5 value: 67.496 - type: precision_at_1 value: 68 - type: precision_at_10 value: 67.80000000000001 - type: precision_at_100 value: 48.980000000000004 - type: precision_at_1000 value: 19.036 - type: precision_at_3 value: 72 - type: precision_at_5 value: 71.2 - type: recall_at_1 value: 0.183 - type: recall_at_10 value: 1.799 - type: recall_at_100 value: 11.652999999999999 - type: recall_at_1000 value: 40.086 - type: recall_at_3 value: 0.5930000000000001 - type: recall_at_5 value: 0.983 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 2.29 - type: map_at_10 value: 9.489 - type: map_at_100 value: 15.051 - type: map_at_1000 value: 16.561999999999998 - type: map_at_3 value: 5.137 - type: map_at_5 value: 6.7989999999999995 - type: mrr_at_1 value: 28.571 - type: mrr_at_10 value: 45.699 - type: mrr_at_100 value: 46.461000000000006 - type: mrr_at_1000 value: 46.461000000000006 - type: mrr_at_3 value: 41.837 - type: mrr_at_5 value: 43.163000000000004 - type: ndcg_at_1 value: 23.469 - type: ndcg_at_10 value: 23.544999999999998 - type: ndcg_at_100 value: 34.572 - type: ndcg_at_1000 value: 46.035 - type: ndcg_at_3 value: 27.200000000000003 - type: ndcg_at_5 value: 25.266 - type: precision_at_1 value: 28.571 - type: precision_at_10 value: 22.041 - type: precision_at_100 value: 7.3469999999999995 - type: precision_at_1000 value: 1.484 - type: precision_at_3 value: 29.932 - type: precision_at_5 value: 26.531 - type: recall_at_1 value: 2.29 - type: recall_at_10 value: 15.895999999999999 - type: 
recall_at_100 value: 45.518 - type: recall_at_1000 value: 80.731 - type: recall_at_3 value: 6.433 - type: recall_at_5 value: 9.484 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 71.4178 - type: ap value: 14.575240629602373 - type: f1 value: 55.02449563229096 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 60.00282965478212 - type: f1 value: 60.34413028768773 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 50.409448342549936 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 87.62591643321214 - type: cos_sim_ap value: 79.28766491329633 - type: cos_sim_f1 value: 71.98772064466617 - type: cos_sim_precision value: 69.8609731876862 - type: cos_sim_recall value: 74.24802110817942 - type: dot_accuracy value: 84.75293556654945 - type: dot_ap value: 69.72705761174353 - type: dot_f1 value: 65.08692852543464 - type: dot_precision value: 63.57232704402516 - type: dot_recall value: 66.6754617414248 - type: euclidean_accuracy value: 87.44710019669786 - type: euclidean_ap value: 79.11021477292638 - type: euclidean_f1 value: 71.5052389470994 - type: euclidean_precision value: 69.32606541129832 - type: euclidean_recall value: 73.82585751978891 - type: manhattan_accuracy value: 87.42325803182929 - type: manhattan_ap value: 79.05094494327616 - type: manhattan_f1 value: 71.36333985649055 - type: manhattan_precision value: 70.58064516129032 - type: manhattan_recall value: 72.16358839050132 - type: max_accuracy value: 87.62591643321214 - type: max_ap value: 79.28766491329633 - type: max_f1 value: 71.98772064466617 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 88.85202002561415 - type: cos_sim_ap value: 85.9835303311168 - type: cos_sim_f1 value: 78.25741142443962 - type: cos_sim_precision value: 73.76635768811342 - type: cos_sim_recall value: 83.3307668617185 - type: dot_accuracy value: 88.20584468506229 - type: dot_ap value: 83.591632302697 - type: dot_f1 value: 76.81739705396173 - type: dot_precision value: 73.45275728837373 - type: dot_recall value: 80.50508161379734 - type: euclidean_accuracy value: 88.64633057787093 - type: euclidean_ap value: 85.25705123182283 - type: euclidean_f1 value: 77.18535726329199 - type: euclidean_precision value: 75.17699437997226 - type: euclidean_recall value: 79.30397289805975 - type: manhattan_accuracy value: 88.63274731245392 - type: manhattan_ap value: 85.2376825633018 - type: manhattan_f1 value: 77.15810785937788 - type: manhattan_precision value: 73.92255061014319 - type: manhattan_recall value: 80.68986757006468 - type: max_accuracy value: 88.85202002561415 - type: max_ap value: 85.9835303311168 - type: max_f1 value: 78.25741142443962 ---

# ember-v1

This model has been trained on an extensive corpus of text pairs that encompass a broad spectrum of domains, including finance, science, medicine, law, and various others. During the training process, we incorporated techniques derived from the RetroMAE and SetFit research papers. ### Plans - The research paper will be published soon. - v2 of the model is currently in development and will feature an extended maximum sequence length of 4,000 tokens. ## Usage Use with transformers: Use with sentence-transformers: ## Massive Text Embedding Benchmark (MTEB) Evaluation Our model achieves state-of-the-art performance on the MTEB leaderboard | Model Name | Dimension | Sequence Length | Average (56) | |:-----------------------------------------------------------------------:|:---------:|:---:|:------------:| | ember-v1 | 1024 | 512 | **63.54** | | bge-large-en-v1.5 | 1024 | 512 | 63.23 | | bge-base-en-v1.5 | 768 | 512 | 63.05 | | text-embedding-ada-002 | 1536 | 8191 | 60.99 | ### Limitations This model exclusively caters to English texts, and any lengthy texts will be truncated to a maximum of 512 tokens. ## License MIT ## Citation", + "model_explanation_gemini": "Generates sentence embeddings for tasks like classification, retrieval, clustering, and similarity scoring across various English datasets.\n\nFeatures: \n- Sentence embedding generation \n- Text classification \n- Sentence similarity measurement \n- Information retrieval \n- Clustering \n- Reranking \n\nComparison: \nOutperforms other models on specific benchmarks like MTEB AmazonPolarityClassification (91.98% accuracy) and MTEB Banking77Classification (87.9% accuracy), while showing competitive results in retrieval and similarity tasks." +} \ No newline at end of file diff --git a/model_data_json/lmms-lab_llava-onevision-qwen2-7b-ov.json b/model_data_json/lmms-lab_llava-onevision-qwen2-7b-ov.json new file mode 100644 index 0000000000000000000000000000000000000000..8291c379067b56cafcd108bde65ccacb25dccbee --- /dev/null +++ b/model_data_json/lmms-lab_llava-onevision-qwen2-7b-ov.json @@ -0,0 +1,23 @@ +{ + "model_id": "lmms-lab/llava-onevision-qwen2-7b-ov", + "downloads": 80043, + "tags": [ + "transformers", + "safetensors", + "llava", + "text-generation", + "multimodal", + "conversational", + "en", + "zh", + "dataset:lmms-lab/LLaVA-OneVision-Data", + "arxiv:2408.03326", + "license:apache-2.0", + "model-index", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- datasets: - lmms-lab/LLaVA-OneVision-Data language: - en - zh library_name: transformers license: apache-2.0 metrics: - accuracy tags: - multimodal model-index: - name: llava-onevision-qwen-7b-ov results: - task: type: multimodal dataset: name: AI2D type: ai2d metrics: - type: accuracy value: 81.4 name: accuracy verified: true - task: type: multimodal dataset: name: ChartQA type: chartqa metrics: - type: accuracy value: 80.0 name: accuracy verified: true - task: type: multimodal dataset: name: DocVQA type: docvqa metrics: - type: accuracy value: 90.2 name: accuracy verified: true - task: type: multimodal dataset: name: InfoVQA type: infovqa metrics: - type: accuracy value: 70.7 name: accuracy verified: true - task: type: multimodal dataset: name: MathVerse type: mathverse metrics: - type: accuracy value: 26.2 name: accuracy verified: true - task: type: multimodal dataset: name: MathVista type: mathvista metrics: - type: accuracy value: 63.2 name: accuracy verified: true - task: type: multimodal dataset: name: MMBench type: mmbench metrics: -
type: accuracy value: 80.8 name: accuracy verified: true - task: type: multimodal dataset: name: MME-Perception type: mme-perception metrics: - type: score value: 1580 name: score verified: true - task: type: multimodal dataset: name: MME-Cognition type: mme-cognition metrics: - type: score value: 418 name: score verified: true - task: type: multimodal dataset: name: MMMU type: mmmu metrics: - type: accuracy value: 48.8 name: accuracy verified: true - task: type: multimodal dataset: name: MMVet type: mmvet metrics: - type: accuracy value: 57.5 name: accuracy verified: true - task: type: multimodal dataset: name: MMStar type: mmstar metrics: - type: accuracy value: 61.7 name: accuracy verified: true - task: type: multimodal dataset: name: Seed-Bench type: seed-bench metrics: - type: accuracy value: 75.4 name: accuracy verified: true - task: type: multimodal dataset: name: Science-QA type: science-qa metrics: - type: accuracy value: 96.0 name: accuracy verified: true - task: type: multimodal dataset: name: ImageDC type: imagedc metrics: - type: accuracy value: 88.9 name: accuracy verified: true - task: type: multimodal dataset: name: MMLBench type: mmlbench metrics: - type: accuracy value: 77.1 name: accuracy verified: true - task: type: multimodal dataset: name: RealWorldQA type: realworldqa metrics: - type: accuracy value: 66.3 name: accuracy verified: true - task: type: multimodal dataset: name: Vibe-Eval type: vibe-eval metrics: - type: accuracy value: 51.7 name: accuracy verified: true - task: type: multimodal dataset: name: LLaVA-W type: llava-w metrics: - type: accuracy value: 90.7 name: accuracy verified: true - task: type: multimodal dataset: name: LLaVA-Wilder type: l-wilder metrics: - type: accuracy value: 67.8 name: accuracy verified: true - task: type: multimodal dataset: name: ActNet-QA type: actnet-qa metrics: - type: accuracy value: 56.6 name: accuracy verified: true - task: type: multimodal dataset: name: EgoSchema type: egoschema metrics: - type: accuracy value: 60.1 name: accuracy verified: true - task: type: multimodal dataset: name: MLVU type: mlvu metrics: - type: accuracy value: 64.7 name: accuracy verified: true - task: type: multimodal dataset: name: MVBench type: mvbench metrics: - type: accuracy value: 56.7 name: accuracy verified: true - task: type: multimodal dataset: name: NextQA type: nextqa metrics: - type: accuracy value: 79.4 name: accuracy verified: true - task: type: multimodal dataset: name: PercepTest type: percepTest metrics: - type: accuracy value: 49.7 name: accuracy verified: true - task: type: multimodal dataset: name: SeedBench type: seedbench metrics: - type: accuracy value: 56.9 name: accuracy verified: true - task: type: multimodal dataset: name: VideoChatGPT type: videochatgpt metrics: - type: score value: 3.49 name: score verified: true - task: type: multimodal dataset: name: VideoDC type: videodc metrics: - type: score value: 3.75 name: score verified: true - task: type: multimodal dataset: name: VideoMME type: videomme metrics: - type: accuracy value: 58.2 name: accuracy verified: true --- # LLaVA-OneVision !banner Play with the model on the LLaVA OneVision Chat. ## Table of Contents 1. Model Summary 2. Use 3. Limitations 4. Training 5. License 6. Citation ## Model Summary The LLaVA-OneVision models are 0.5/7/72B parameter models trained on LLaVA-OneVision, based on Qwen2 language model with a context window of 32K tokens. 
- **Repository:** LLaVA-VL/LLaVA-NeXT - **Project Website:** llava-onevision.lmms-lab.com - **Paper:** LLaVA-OneVision - **Point of Contact:** Bo Li - **Languages:** English, Chinese ## Use ### Intended use The model was trained on the LLaVA-OneVision Dataset and has the ability to interact with single images, multi-image inputs, and videos. **Feel free to share your generations in the Community tab!** ### Generation We provide a simple generation example for using our model. For more details, you can refer to GitHub. # Training ## Model - **Architecture:** SO400M + Qwen2 - **Pretraining Stage:** LCS-558K, 1 epoch, projector - **Mid Stage:** A mixture of 4.7M high-quality synthetic data, 1 epoch, full model - **Final-Image Stage:** A mixture of 3.6M single-image data, 1 epoch, full model - **OneVision Stage:** A mixture of 1.6M single-image/multi-image/video data, 1 epoch, full model - **Precision:** bfloat16 ## Hardware & Software - **GPUs:** 256 * Nvidia Tesla A100 (for whole model series training) - **Orchestration:** Huggingface Trainer - **Neural networks:** PyTorch # Citation", + "model_explanation_gemini": "A multimodal model excelling in visual question answering and document understanding, achieving high accuracy across diverse datasets like AI2D (81.4%), DocVQA (90.2%), and Science-QA (96.0%). \n\n**Features**: \n- **Multimodal capabilities** (text + vision) \n- **High accuracy** on document/visual QA tasks \n- **Broad dataset coverage** (AI2D, ChartQA, MathVista, etc.) \n- **Bilingual" +} \ No newline at end of file diff --git a/model_data_json/lmstudio-community_gemma-3-12b-it-GGUF.json b/model_data_json/lmstudio-community_gemma-3-12b-it-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..25b690f51538a7c5c7607be77d5406d65f01ab91 --- /dev/null +++ b/model_data_json/lmstudio-community_gemma-3-12b-it-GGUF.json @@ -0,0 +1,16 @@ +{ + "model_id": "lmstudio-community/gemma-3-12b-it-GGUF", + "downloads": 78493, + "tags": [ + "gguf", + "image-text-to-text", + "base_model:google/gemma-3-12b-it", + "base_model:quantized:google/gemma-3-12b-it", + "license:gemma", + "endpoints_compatible", + "region:us", + "conversational" + ], + "description": "--- quantized_by: bartowski pipeline_tag: image-text-to-text extra_gated_prompt: >- To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click below. Requests are processed immediately. extra_gated_button_content: Acknowledge license license: gemma extra_gated_heading: Access Gemma on Hugging Face base_model: google/gemma-3-12b-it --- ## 💫 Community Model> gemma 3 12b it by Google *👾 LM Studio Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on Discord*. **Model creator:** google
**Original model**: gemma-3-12b-it
**GGUF quantization:** provided by bartowski based on release b4877
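Since the usage snippets in these scraped descriptions are stripped, a hedged loading sketch may help. It uses the third-party llama-cpp-python bindings and a hypothetical local quant file name, neither of which is mentioned in the card itself:

```python
# Hedged sketch: loading a GGUF quant of gemma-3-12b-it with the
# llama-cpp-python bindings. The bindings and the local file name are
# assumptions; the card itself only requires the llama.cpp runtime.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-3-12b-it-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=8192,  # the model supports up to 128k tokens; smaller saves memory
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```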
Requires llama.cpp runtime v1.19.0 ## Technical Details Supports a context length of 128k tokens, with a max output of 8192. Multimodal supporting images normalized to 896 x 896 resolution. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Requires latest (currently beta) llama.cpp runtime. ## Special thanks 🙏 Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible. ## Disclaimers LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.", + "model_explanation_gemini": "\"Quantized version of Google's Gemma 3 12B model for multimodal text generation and image understanding tasks with 128k token context support.\"\n\nFeatures: \n- 12B parameter GGUF-quantized model \n- Multimodal (text + image) support (896x896 resolution) \n- 128k token context length \n- 8192 token max output \n- Optimized for llama.cpp runtime (v1.19.0+) \n- Tasks: QA," +} \ No newline at end of file diff --git a/model_data_json/lpiccinelli_unidepth-v2-vitl14.json b/model_data_json/lpiccinelli_unidepth-v2-vitl14.json new file mode 100644 index 0000000000000000000000000000000000000000..f998ec84550e4f420028b0a4855faf4747e94feb --- /dev/null +++ b/model_data_json/lpiccinelli_unidepth-v2-vitl14.json @@ -0,0 +1,15 @@ +{ + "model_id": "lpiccinelli/unidepth-v2-vitl14", + "downloads": 78452, + "tags": [ + "UniDepth", + "pytorch", + "safetensors", + "model_hub_mixin", + "monocular-metric-depth-estimation", + "pytorch_model_hub_mixin", + "region:us" + ], + "description": "--- library_name: UniDepth tags: - model_hub_mixin - monocular-metric-depth-estimation - pytorch_model_hub_mixin --- This model has been pushed to the Hub using the PytorchModelHubMixin integration: - Library: - Docs: [More Information Needed]", + "model_explanation_gemini": "Estimates monocular metric depth from single images using a vision transformer architecture.\n\nFeatures: \n- Monocular depth estimation \n- Metric depth prediction \n- ViT-L/14 backbone \n\nComparison: \nNo explicit comparative analysis provided in the description." 
+} \ No newline at end of file diff --git a/model_data_json/m-a-p_MERT-v1-95M.json b/model_data_json/m-a-p_MERT-v1-95M.json new file mode 100644 index 0000000000000000000000000000000000000000..3a40e69a8ce3ee8e08ac95355d1c177f72678fda --- /dev/null +++ b/model_data_json/m-a-p_MERT-v1-95M.json @@ -0,0 +1,18 @@ +{ + "model_id": "m-a-p/MERT-v1-95M", + "downloads": 74062, + "tags": [ + "transformers", + "pytorch", + "mert_model", + "feature-extraction", + "music", + "audio-classification", + "custom_code", + "arxiv:2306.00107", + "license:cc-by-nc-4.0", + "region:us" + ], + "description": "--- license: cc-by-nc-4.0 inference: false tags: - music pipeline_tag: audio-classification --- # Introduction to our series work The development log of our Music Audio Pre-training (m-a-p) model family: - 02/06/2023: arXiv pre-print and training code released. - 17/03/2023: we release two advanced music understanding models, MERT-v1-95M and MERT-v1-330M, trained with a new paradigm and dataset. They outperform the previous models and can better generalize to more tasks. - 14/03/2023: we retrained the MERT-v0 model with an open-source-only music dataset, MERT-v0-public - 29/12/2022: a music understanding model MERT-v0 trained with the **MLM** paradigm, which performs better at downstream tasks. - 29/10/2022: a pre-trained MIR model music2vec trained with the **BYOL** paradigm. Here is a table for quick model selection: | Name | Pre-train Paradigm | Training Data (hour) | Pre-train Context (second) | Model Size | Transformer Layer-Dimension | Feature Rate | Sample Rate | Release Date | | ------------------------------------------------------------ | ------------------ | -------------------- | ---------------------------- | ---------- | --------------------------- | ------------ | ----------- | ------------ | | MERT-v1-330M | MLM | 160K | 5 | 330M | 24-1024 | 75 Hz | 24K Hz | 17/03/2023 | | MERT-v1-95M | MLM | 20K | 5 | 95M | 12-768 | 75 Hz | 24K Hz | 17/03/2023 | | MERT-v0-public | MLM | 900 | 5 | 95M | 12-768 | 50 Hz | 16K Hz | 14/03/2023 | | MERT-v0 | MLM | 1000 | 5 | 95 M | 12-768 | 50 Hz | 16K Hz | 29/12/2022 | | music2vec-v1 | BYOL | 1000 | 30 | 95 M | 12-768 | 50 Hz | 16K Hz | 30/10/2022 | ## Explanation The m-a-p models share a similar model architecture; the most notable difference is the pre-training paradigm used. Beyond that, there are several technical configuration nuances to know before use: - **Model Size**: the number of parameters that would be loaded into memory. Please select the appropriate size for your hardware. - **Transformer Layer-Dimension**: The number of transformer layers and the corresponding feature dimensions that can be output by our model. This is called out because features extracted by **different layers could have varying performance depending on the task**. - **Feature Rate**: Given a 1-second audio input, the number of features output by the model. - **Sample Rate**: The frequency of audio that the model is trained with. # Introduction to MERT-v1 Compared to MERT-v0, we introduce multiple new things in the MERT-v1 pre-training: - Change the pseudo labels to 8 codebooks from EnCodec, which potentially have higher quality and empower our model to support music generation. - MLM prediction with in-batch noise mixture. - Train with a higher audio frequency (24K Hz). - Train with more audio data (up to 160 thousand hours). - More available model sizes: 95M and 330M. More details will be given in our forthcoming paper.
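The card's own usage section was stripped in this dump, so the following is only a hedged sketch of layer-wise feature extraction. The trust_remote_code loading path, the Wav2Vec2FeatureExtractor processor, and the input file name are assumptions; only the 24 kHz sample rate and the 12-layer/768-dim output shape come from the table above:

```python
# Hedged sketch: extracting layer-wise MERT-v1-95M features.
# Loading via AutoModel with trust_remote_code and using
# Wav2Vec2FeatureExtractor are assumptions, not taken from the card.
import torch
import torchaudio
from transformers import AutoModel, Wav2Vec2FeatureExtractor

model = AutoModel.from_pretrained("m-a-p/MERT-v1-95M", trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(
    "m-a-p/MERT-v1-95M", trust_remote_code=True
)

wav, sr = torchaudio.load("song.wav")  # hypothetical input file
wav = torchaudio.functional.resample(wav, sr, 24_000).mean(dim=0)  # mono, 24 kHz

inputs = processor(wav.numpy(), sampling_rate=24_000, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# One (batch, time, 768) tensor per layer; layer choice is task-dependent.
features = torch.stack(out.hidden_states)
print(features.shape)
```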
# Model Usage # Citation", + "model_explanation_gemini": "\"MERT-v1-95M is a 95M-parameter music understanding model pre-trained with MLM, designed for audio classification and outperforms earlier versions with improved generalization and higher-quality features.\"\n\n**Model Features**: \n- Pre-training paradigm: MLM (Masked Language Modeling) \n- Training data: 20K hours \n- Model size: 95M parameters \n- Architecture: 12-layer Transformer (768-dim) \n- Feature rate: 75 Hz \n- Sample rate" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-2-13b-chat-hf.json b/model_data_json/meta-llama_Llama-2-13b-chat-hf.json new file mode 100644 index 0000000000000000000000000000000000000000..196825e5ab9ac46d57fe2c4fb6b9674fa6d29129 --- /dev/null +++ b/model_data_json/meta-llama_Llama-2-13b-chat-hf.json @@ -0,0 +1,24 @@ +{ + "model_id": "meta-llama/Llama-2-13b-chat-hf", + "downloads": 131738, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "llama-2", + "conversational", + "en", + "arxiv:2307.09288", + "license:llama2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- extra_gated_heading: You need to share contact information with Meta to access this model extra_gated_prompt: >- ### LLAMA 2 COMMUNITY LICENSE AGREEMENT \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 2 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 2\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-libraries/llama-downloads/. \"Llama Materials\" means, collectively, Meta's proprietary Llama 2 and documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non- transferable and royalty-free limited license under Meta's intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make the Llama Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party. ii. 
If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof). 2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials. b. Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. 
You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at ai.meta.com/llama/use-policy. #### Prohibited Uses We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. 
Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 2 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 2 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: github.com/facebookresearch/llama * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit language: - en pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-2 license: llama2 --- # **Llama 2** Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 13B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom. ## Model Details *Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and accept our License before requesting access here.* Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. 
Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, are on par with some popular closed-source models like ChatGPT and PaLM. **Model Developers** Meta **Variations** Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. **Input** Models input text only. **Output** Models generate text only. **Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. ||Training Data|Params|Content Length|GQA|Tokens|LR| |---|---|---|---|---|---|---| |Llama 2|*A new mix of publicly available online data*|7B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|13B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|70B|4k|✔|2.0T|1.5 x 10^-4| *Llama 2 family of models.* Token counts refer to pretraining data only. All models are trained with a global batch-size of 4M tokens. Bigger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability. **Model Dates** Llama 2 was trained between January 2023 and July 2023. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Research Paper** \"Llama 2: Open Foundation and Fine-Tuned Chat Models\" ## Intended Use **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. To get the expected features and performance for the chat versions, a specific formatting needs to be followed, including the `[INST]` and `<<SYS>>` tags, `BOS` and `EOS` tokens, and the whitespaces and line breaks in between (we recommend calling `strip()` on inputs to avoid double-spaces); a minimal prompt-formatting sketch follows at the end of this section. See our reference code in github for details. **Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W). Estimated total emissions were 539 tCO2eq, 100% of which were offset by Meta’s sustainability program. ||Time (GPU hours)|Power Consumption (W)|Carbon Emitted (tCO2eq)| |---|---|---|---| |Llama 2 7B|184320|400|31.22| |Llama 2 13B|368640|400|62.44| |Llama 2 70B|1720320|400|291.42| |Total|3311616||539.00| **CO2 emissions during pretraining.** Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others.
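The prompt format referenced above is easiest to see in code. The snippet below is a minimal, unofficial sketch (not Meta's reference implementation) of the commonly documented Llama 2 chat layout using the Hugging Face transformers API; the model ID matches this repository, but the system and user strings are illustrative, and access to the gated checkpoint is assumed.

```python
# Unofficial sketch of the Llama 2 chat prompt format described above.
# Assumes license access to the gated checkpoint has already been granted.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

system = "You are a helpful, respectful and honest assistant."  # illustrative
user = "Explain grouped-query attention in two sentences."      # illustrative

# [INST] ... [/INST] wraps a user turn; <<SYS>> ... <</SYS>> wraps the system
# prompt inside the first turn. The tokenizer prepends the BOS token itself,
# and strip() avoids the double-space issue noted in the card.
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user.strip()} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```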
## Training Data **Overview** Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023. ## Evaluation Results In this section, we report the results for the Llama 1 and Llama 2 models on standard academic benchmarks. For all the evaluations, we use our internal evaluations library. |Model|Size|Code|Commonsense Reasoning|World Knowledge|Reading Comprehension|Math|MMLU|BBH|AGI Eval| |---|---|---|---|---|---|---|---|---|---| |Llama 1|7B|14.1|60.8|46.2|58.5|6.95|35.1|30.3|23.9| |Llama 1|13B|18.9|66.1|52.6|62.3|10.9|46.9|37.0|33.9| |Llama 1|33B|26.0|70.0|58.4|67.6|21.4|57.8|39.8|41.7| |Llama 1|65B|30.7|70.7|60.5|68.6|30.8|63.4|43.5|47.6| |Llama 2|7B|16.8|63.9|48.9|61.3|14.6|45.3|32.6|29.3| |Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1| |Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**| **Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP (a sketch of the standard pass@k estimator follows at the end of this section). *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at top 1. |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama 1|7B|27.42|23.00| |Llama 1|13B|41.74|23.08| |Llama 1|33B|44.19|22.57| |Llama 1|65B|48.71|21.77| |Llama 2|7B|33.29|**21.25**| |Llama 2|13B|41.86|26.10| |Llama 2|70B|**50.18**|24.60| **Evaluation of pretrained LLMs on automatic safety benchmarks.** For TruthfulQA, we present the percentage of generations that are both truthful and informative (the higher the better). For ToxiGen, we present the percentage of toxic generations (the smaller the better). |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama-2-Chat|7B|57.04|**0.00**| |Llama-2-Chat|13B|62.18|**0.00**| |Llama-2-Chat|70B|**64.14**|0.01| **Evaluation of fine-tuned LLMs on different safety datasets.** Same metric definitions as above. ## Ethical Considerations and Limitations Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model.
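The pass@1 column above is produced by sampling completions per problem and unit-testing them. As context for those numbers, here is a small sketch of the standard unbiased pass@k estimator commonly used for HumanEval and MBPP; this code is not part of the card, and the n, c, and k values are illustrative.

```python
# Standard unbiased pass@k estimator for code benchmarks: given n generated
# samples per problem, c of which pass the unit tests, estimate the chance
# that at least one of k randomly drawn samples is correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """pass@k = 1 - C(n - c, k) / C(n, k)"""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: every k-subset has a pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k = 1 the estimator reduces to the plain fraction of passing samples:
assert abs(pass_at_k(10, 3, 1) - 0.3) < 1e-12
```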
Please see the Responsible Use Guide available at ## Reporting Issues Please report any software “bug,” or other problems with the models through one of the following means: - Reporting issues with the model: github.com/facebookresearch/llama - Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback - Reporting bugs and security concerns: facebook.com/whitehat/info ## Llama Model Index |Model|Llama2|Llama2-hf|Llama2-chat|Llama2-chat-hf| |---|---|---|---|---| |7B| Link | Link | Link | Link| |13B| Link | Link | Link | Link| |70B| Link | Link | Link | Link|", + "model_explanation_gemini": "A 13B-parameter Llama 2-based model fine-tuned for chat applications, requiring Meta's license agreement for access.\n\n**Features:** \n- Large language model (13B parameters) \n- Optimized for chat/dialog use cases \n- Requires Meta's community license agreement \n- Restricts redistribution/commercial use (700M user threshold) \n- Prohibits using outputs to improve competing models \n\n**Comparison:** \nThis model is part of the Llama 2 family (" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-2-7b-chat-hf.json b/model_data_json/meta-llama_Llama-2-7b-chat-hf.json new file mode 100644 index 0000000000000000000000000000000000000000..09ab0b663f00bc7e5839cd08587309e567aa5327 --- /dev/null +++ b/model_data_json/meta-llama_Llama-2-7b-chat-hf.json @@ -0,0 +1,24 @@ +{ + "model_id": "meta-llama/Llama-2-7b-chat-hf", + "downloads": 1110112, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "llama-2", + "conversational", + "en", + "arxiv:2307.09288", + "license:llama2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- extra_gated_heading: You need to share contact information with Meta to access this model extra_gated_prompt: >- ### LLAMA 2 COMMUNITY LICENSE AGREEMENT \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 2 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 2\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-libraries/llama-downloads/. \"Llama Materials\" means, collectively, Meta's proprietary Llama 2 and documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. 
Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta's intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make the Llama Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof). 2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials. b.
Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at ai.meta.com/llama/use-policy. #### Prohibited Uses We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. 
Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 2 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 2 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: github.com/facebookresearch/llama * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit language: - en pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-2 license: llama2 --- # **Llama 2** Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. 
This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom. ## Model Details *Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and accept our License before requesting access here.* Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, are on par with some popular closed-source models like ChatGPT and PaLM. **Model Developers** Meta **Variations** Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. **Input** Models input text only. **Output** Models generate text only. **Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. ||Training Data|Params|Content Length|GQA|Tokens|LR| |---|---|---|---|---|---|---| |Llama 2|*A new mix of publicly available online data*|7B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|13B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|70B|4k|✔|2.0T|1.5 x 10^-4| *Llama 2 family of models.* Token counts refer to pretraining data only. All models are trained with a global batch-size of 4M tokens. Bigger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability. **Model Dates** Llama 2 was trained between January 2023 and July 2023. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Research Paper** \"Llama 2: Open Foundation and Fine-Tuned Chat Models\" ## Intended Use **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. To get the expected features and performance for the chat versions, a specific formatting needs to be followed, including the `[INST]` and `<<SYS>>` tags, `BOS` and `EOS` tokens, and the whitespaces and line breaks in between (we recommend calling `strip()` on inputs to avoid double-spaces). See our reference code in github for details. **Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W).
Estimated total emissions were 539 tCO2eq, 100% of which were offset by Meta’s sustainability program. ||Time (GPU hours)|Power Consumption (W)|Carbon Emitted (tCO2eq)| |---|---|---|---| |Llama 2 7B|184320|400|31.22| |Llama 2 13B|368640|400|62.44| |Llama 2 70B|1720320|400|291.42| |Total|3311616||539.00| **CO2 emissions during pretraining.** Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023. ## Evaluation Results In this section, we report the results for the Llama 1 and Llama 2 models on standard academic benchmarks. For all the evaluations, we use our internal evaluations library. |Model|Size|Code|Commonsense Reasoning|World Knowledge|Reading Comprehension|Math|MMLU|BBH|AGI Eval| |---|---|---|---|---|---|---|---|---|---| |Llama 1|7B|14.1|60.8|46.2|58.5|6.95|35.1|30.3|23.9| |Llama 1|13B|18.9|66.1|52.6|62.3|10.9|46.9|37.0|33.9| |Llama 1|33B|26.0|70.0|58.4|67.6|21.4|57.8|39.8|41.7| |Llama 1|65B|30.7|70.7|60.5|68.6|30.8|63.4|43.5|47.6| |Llama 2|7B|16.8|63.9|48.9|61.3|14.6|45.3|32.6|29.3| |Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1| |Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**| **Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA. We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at top 1. |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama 1|7B|27.42|23.00| |Llama 1|13B|41.74|23.08| |Llama 1|33B|44.19|22.57| |Llama 1|65B|48.71|21.77| |Llama 2|7B|33.29|**21.25**| |Llama 2|13B|41.86|26.10| |Llama 2|70B|**50.18**|24.60| **Evaluation of pretrained LLMs on automatic safety benchmarks.** For TruthfulQA, we present the percentage of generations that are both truthful and informative (the higher the better). For ToxiGen, we present the percentage of toxic generations (the smaller the better). |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama-2-Chat|7B|57.04|**0.00**| |Llama-2-Chat|13B|62.18|**0.00**| |Llama-2-Chat|70B|**64.14**|0.01| **Evaluation of fine-tuned LLMs on different safety datasets.** Same metric definitions as above. ## Ethical Considerations and Limitations Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios.
For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model. Please see the Responsible Use Guide available at ## Reporting Issues Please report any software “bug,” or other problems with the models through one of the following means: - Reporting issues with the model: github.com/facebookresearch/llama - Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback - Reporting bugs and security concerns: facebook.com/whitehat/info ## Llama Model Index |Model|Llama2|Llama2-hf|Llama2-chat|Llama2-chat-hf| |---|---|---|---|---| |7B| Link | Link | Link | Link| |13B| Link | Link | Link | Link| |70B| Link | Link | Link | Link|", + "model_explanation_gemini": "A 7B-parameter chat model fine-tuned for dialogue use cases, requiring Meta's license agreement for access.\n\nFeatures: \n- Fine-tuned for chat/dialogue applications \n- 7 billion parameter size \n- Requires acceptance of Meta's LLAMA 2 Community License Agreement \n- Includes model weights and inference code \n- Restrictions on redistribution and commercial use \n\nComparison: \nThis model is part of Meta's Llama-2 series, with the 7B-chat variant being" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-2-7b-hf.json b/model_data_json/meta-llama_Llama-2-7b-hf.json new file mode 100644 index 0000000000000000000000000000000000000000..dd2fbd5385a2cee8dde6d43e7316c20e718675f0 --- /dev/null +++ b/model_data_json/meta-llama_Llama-2-7b-hf.json @@ -0,0 +1,23 @@ +{ + "model_id": "meta-llama/Llama-2-7b-hf", + "downloads": 899278, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "llama-2", + "en", + "arxiv:2307.09288", + "license:llama2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- extra_gated_heading: You need to share contact information with Meta to access this model extra_gated_prompt: >- ### LLAMA 2 COMMUNITY LICENSE AGREEMENT \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 2 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 2\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-libraries/llama-downloads/. \"Llama Materials\" means, collectively, Meta's proprietary Llama 2 and documentation (and any portion thereof) made available under this Agreement. 
\"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non- transferable and royalty-free limited license under Meta's intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make the Llama Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof). 2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials. b. Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at ai.meta.com/llama/use-policy. #### Prohibited Uses We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. 
Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 2 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 2 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. 
Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: github.com/facebookresearch/llama * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit language: - en pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-2 license: llama2 --- # **Llama 2** Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B pretrained model, converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom. ## Model Details *Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and accept our License before requesting access here.* Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, are on par with some popular closed-source models like ChatGPT and PaLM. **Model Developers** Meta **Variations** Llama 2 comes in a range of parameter sizes — 7B, 13B, and 70B — as well as pretrained and fine-tuned variations. **Input** Models input text only. **Output** Models generate text only. **Model Architecture** Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety. ||Training Data|Params|Content Length|GQA|Tokens|LR| |---|---|---|---|---|---|---| |Llama 2|*A new mix of publicly available online data*|7B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|13B|4k|✗|2.0T|3.0 x 10^-4| |Llama 2|*A new mix of publicly available online data*|70B|4k|✔|2.0T|1.5 x 10^-4| *Llama 2 family of models.* Token counts refer to pretraining data only. All models are trained with a global batch-size of 4M tokens. Bigger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability. **Model Dates** Llama 2 was trained between January 2023 and July 2023. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.
**License** A custom commercial license is available at: **Research Paper** \"Llama 2: Open Foundation and Fine-Tuned Chat Models\" ## Intended Use **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. To get the expected features and performance for the chat versions, a specific formatting needs to be followed, including the `[INST]` and `<<SYS>>` tags, `BOS` and `EOS` tokens, and the whitespaces and line breaks in between (we recommend calling `strip()` on inputs to avoid double-spaces). See our reference code in github for details. **Out-of-scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Llama 2. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research Super Cluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 3.3M GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W). Estimated total emissions were 539 tCO2eq, 100% of which were offset by Meta’s sustainability program. ||Time (GPU hours)|Power Consumption (W)|Carbon Emitted (tCO2eq)| |---|---|---|---| |Llama 2 7B|184320|400|31.22| |Llama 2 13B|368640|400|62.44| |Llama 2 70B|1720320|400|291.42| |Total|3311616||539.00| **CO2 emissions during pretraining.** Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 2 was pretrained on 2 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023. ## Evaluation Results In this section, we report the results for the Llama 1 and Llama 2 models on standard academic benchmarks. For all the evaluations, we use our internal evaluations library. |Model|Size|Code|Commonsense Reasoning|World Knowledge|Reading Comprehension|Math|MMLU|BBH|AGI Eval| |---|---|---|---|---|---|---|---|---|---| |Llama 1|7B|14.1|60.8|46.2|58.5|6.95|35.1|30.3|23.9| |Llama 1|13B|18.9|66.1|52.6|62.3|10.9|46.9|37.0|33.9| |Llama 1|33B|26.0|70.0|58.4|67.6|21.4|57.8|39.8|41.7| |Llama 1|65B|30.7|70.7|60.5|68.6|30.8|63.4|43.5|47.6| |Llama 2|7B|16.8|63.9|48.9|61.3|14.6|45.3|32.6|29.3| |Llama 2|13B|24.5|66.9|55.4|65.8|28.7|54.8|39.4|39.1| |Llama 2|70B|**37.5**|**71.9**|**63.6**|**69.4**|**35.2**|**68.9**|**51.2**|**54.2**| **Overall performance on grouped academic benchmarks.** *Code:* We report the average pass@1 scores of our models on HumanEval and MBPP. *Commonsense Reasoning:* We report the average of PIQA, SIQA, HellaSwag, WinoGrande, ARC easy and challenge, OpenBookQA, and CommonsenseQA.
We report 7-shot results for CommonSenseQA and 0-shot results for all other benchmarks. *World Knowledge:* We evaluate the 5-shot performance on NaturalQuestions and TriviaQA and report the average. *Reading Comprehension:* For reading comprehension, we report the 0-shot average on SQuAD, QuAC, and BoolQ. *MATH:* We report the average of the GSM8K (8 shot) and MATH (4 shot) benchmarks at top 1. |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama 1|7B|27.42|23.00| |Llama 1|13B|41.74|23.08| |Llama 1|33B|44.19|22.57| |Llama 1|65B|48.71|21.77| |Llama 2|7B|33.29|**21.25**| |Llama 2|13B|41.86|26.10| |Llama 2|70B|**50.18**|24.60| **Evaluation of pretrained LLMs on automatic safety benchmarks.** For TruthfulQA, we present the percentage of generations that are both truthful and informative (the higher the better). For ToxiGen, we present the percentage of toxic generations (the smaller the better). |||TruthfulQA|ToxiGen| |---|---|---|---| |Llama-2-Chat|7B|57.04|**0.00**| |Llama-2-Chat|13B|62.18|**0.00**| |Llama-2-Chat|70B|**64.14**|0.01| **Evaluation of fine-tuned LLMs on different safety datasets.** Same metric definitions as above. ## Ethical Considerations and Limitations Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model. Please see the Responsible Use Guide available at ## Reporting Issues Please report any software “bug,” or other problems with the models through one of the following means: - Reporting issues with the model: github.com/facebookresearch/llama - Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback - Reporting bugs and security concerns: facebook.com/whitehat/info ## Llama Model Index |Model|Llama2|Llama2-hf|Llama2-chat|Llama2-chat-hf| |---|---|---|---|---| |7B| Link | Link | Link | Link| |13B| Link | Link | Link | Link| |70B| Link | Link | Link | Link|", + "model_explanation_gemini": "A 7B-parameter foundational large language model (Llama 2) licensed by Meta for broad use, reproduction, and modification under specific terms, with restrictions on redistribution and commercial scaling.\n\n**Features**: \n- Large-scale language model (7 billion parameters) \n- Permits use, reproduction, distribution, and derivative works \n- Requires adherence to Meta’s license agreement and acceptable use policy \n- Prohibits using outputs to improve competing models (excluding Llama 2 derivatives)" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.1-70B-Instruct.json b/model_data_json/meta-llama_Llama-3.1-70B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..4927d6e7d854f0d752581c7a81c14f7bb845aebb --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.1-70B-Instruct.json @@ -0,0 +1,33 @@ +{ + "model_id": "meta-llama/Llama-3.1-70B-Instruct", + "downloads": 1114152, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", +
"base_model:meta-llama/Llama-3.1-70B", + "base_model:finetune:meta-llama/Llama-3.1-70B", + "license:llama3.1", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers base_model: meta-llama/Meta-Llama-3.1-70B new_version: meta-llama/Llama-3.3-70B-Instruct license: llama3.1 pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 extra_gated_prompt: \"### LLAMA 3.1 COMMUNITY LICENSE AGREEMENT\\nLlama 3.1 Version\\ \\ Release Date: July 23, 2024\\n\\\"Agreement\\\" means the terms and conditions for\\ \\ use, reproduction, distribution and modification of the Llama Materials set forth\\ \\ herein.\\n\\\"Documentation\\\" means the specifications, manuals and documentation\\ \\ accompanying Llama 3.1 distributed by Meta at \\\"Licensee\\\" or \\\"you\\\" means you, or your employer or any other person or entity\\ \\ (if you are entering into this Agreement on such person or entity’s behalf), of\\ \\ the age required under applicable laws, rules or regulations to provide legal\\ \\ consent and that has legal authority to bind your employer or such other person\\ \\ or entity if you are entering in this Agreement on their behalf.\\n\\\"Llama 3.1\\\"\\ \\ means the foundational large language models and software and algorithms, including\\ \\ machine-learning model code, trained model weights, inference-enabling code, training-enabling\\ \\ code, fine-tuning enabling code and other elements of the foregoing distributed\\ \\ by Meta at Materials\\\" means,\\ \\ collectively, Meta’s proprietary Llama 3.1 and Documentation (and any portion\\ \\ thereof) made available under this Agreement.\\n\\\"Meta\\\" or \\\"we\\\" means Meta Platforms\\ \\ Ireland Limited (if you are located in or, if you are an entity, your principal\\ \\ place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you\\ \\ are located outside of the EEA or Switzerland).\\n \\n1. License Rights and Redistribution.\\n\\ a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable\\ \\ and royalty-free limited license under Meta’s intellectual property or other rights\\ \\ owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy,\\ \\ create derivative works of, and make modifications to the Llama Materials.\\nb.\\ \\ Redistribution and Use.\\ni. If you distribute or make available the Llama Materials\\ \\ (or any derivative works thereof), or a product or service (including another\\ \\ AI model) that contains any of them, you shall (A) provide a copy of this Agreement\\ \\ with any such Llama Materials; and (B) prominently display “Built with Llama”\\ \\ on a related website, user interface, blogpost, about page, or product documentation.\\ \\ If you use the Llama Materials or any outputs or results of the Llama Materials\\ \\ to create, train, fine tune, or otherwise improve an AI model, which is distributed\\ \\ or made available, you shall also include “Llama” at the beginning of any such\\ \\ AI model name.\\nii. If you receive Llama Materials, or any derivative works thereof,\\ \\ from a Licensee as part of an integrated end user product, then Section 2 of\\ \\ this Agreement will not apply to you.\\niii. 
You must retain in all copies of the\\ \\ Llama Materials that you distribute the following attribution notice within a\\ \\ “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed\\ \\ under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights\\ \\ Reserved.”\\niv. Your use of the Llama Materials must comply with applicable laws\\ \\ and regulations (including trade compliance laws and regulations) and adhere to\\ \\ the Acceptable Use Policy for the Llama Materials (available at \\ which is hereby incorporated by reference into this Agreement.\\n2. Additional\\ \\ Commercial Terms. If, on the Llama 3.1 version release date, the monthly active\\ \\ users of the products or services made available by or for Licensee, or Licensee’s\\ \\ affiliates, is greater than 700 million monthly active users in the preceding\\ \\ calendar month, you must request a license from Meta, which Meta may grant to\\ \\ you in its sole discretion, and you are not authorized to exercise any of the\\ \\ rights under this Agreement unless or until Meta otherwise expressly grants you\\ \\ such rights.\\n3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE\\ \\ LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS”\\ \\ BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY\\ \\ KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES\\ \\ OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.\\ \\ YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING\\ \\ THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA\\ \\ MATERIALS AND ANY OUTPUT AND RESULTS.\\n4. Limitation of Liability. IN NO EVENT\\ \\ WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN\\ \\ CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS\\ \\ AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL,\\ \\ EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED\\ \\ OF THE POSSIBILITY OF ANY OF THE FOREGOING.\\n5. Intellectual Property.\\na. No\\ \\ trademark licenses are granted under this Agreement, and in connection with the\\ \\ Llama Materials, neither Meta nor Licensee may use any name or mark owned by or\\ \\ associated with the other or any of its affiliates, except as required for reasonable\\ \\ and customary use in describing and redistributing the Llama Materials or as set\\ \\ forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the\\ \\ “Mark”) solely as required to comply with the last sentence of Section 1.b.i.\\ \\ You will comply with Meta’s brand guidelines (currently accessible at \\ ). All goodwill arising out of your use of the Mark will inure to the benefit\\ \\ of Meta.\\nb. Subject to Meta’s ownership of Llama Materials and derivatives made\\ \\ by or for Meta, with respect to any derivative works and modifications of the\\ \\ Llama Materials that are made by you, as between you and Meta, you are and will\\ \\ be the owner of such derivative works and modifications.\\nc. 
If you institute\\ \\ litigation or other proceedings against Meta or any entity (including a cross-claim\\ \\ or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs\\ \\ or results, or any portion of any of the foregoing, constitutes infringement of\\ \\ intellectual property or other rights owned or licensable by you, then any licenses\\ \\ granted to you under this Agreement shall terminate as of the date such litigation\\ \\ or claim is filed or instituted. You will indemnify and hold harmless Meta from\\ \\ and against any claim by any third party arising out of or related to your use\\ \\ or distribution of the Llama Materials.\\n6. Term and Termination. The term of\\ \\ this Agreement will commence upon your acceptance of this Agreement or access\\ \\ to the Llama Materials and will continue in full force and effect until terminated\\ \\ in accordance with the terms and conditions herein. Meta may terminate this Agreement\\ \\ if you are in breach of any term or condition of this Agreement. Upon termination\\ \\ of this Agreement, you shall delete and cease use of the Llama Materials. Sections\\ \\ 3, 4 and 7 shall survive the termination of this Agreement.\\n7. Governing Law\\ \\ and Jurisdiction. This Agreement will be governed and construed under the laws\\ \\ of the State of California without regard to choice of law principles, and the\\ \\ UN Convention on Contracts for the International Sale of Goods does not apply\\ \\ to this Agreement. The courts of California shall have exclusive jurisdiction\\ \\ of any dispute arising out of this Agreement.\\n### Llama 3.1 Acceptable Use Policy\\n\\ Meta is committed to promoting safe and fair use of its tools and features, including\\ \\ Llama 3.1. If you access or use Llama 3.1, you agree to this Acceptable Use Policy\\ \\ (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses\\nWe want everyone to use Llama 3.1 safely and responsibly.\\ \\ You agree you will not use, or allow others to use, Llama 3.1 to:\\n 1. Violate\\ \\ the law or others’ rights, including to:\\n 1. Engage in, promote, generate,\\ \\ contribute to, encourage, plan, incite, or further illegal or unlawful activity\\ \\ or content, such as:\\n 1. Violence or terrorism\\n 2. Exploitation\\ \\ or harm to children, including the solicitation, creation, acquisition, or dissemination\\ \\ of child exploitative content or failure to report Child Sexual Abuse Material\\n\\ \\ 3. Human trafficking, exploitation, and sexual violence\\n 4. The\\ \\ illegal distribution of information or materials to minors, including obscene\\ \\ materials, or failure to employ legally required age-gating in connection with\\ \\ such information or materials.\\n 5. Sexual solicitation\\n 6. Any\\ \\ other criminal activity\\n 3. Engage in, promote, incite, or facilitate the\\ \\ harassment, abuse, threatening, or bullying of individuals or groups of individuals\\n\\ \\ 4. Engage in, promote, incite, or facilitate discrimination or other unlawful\\ \\ or harmful conduct in the provision of employment, employment benefits, credit,\\ \\ housing, other economic benefits, or other essential goods and services\\n 5.\\ \\ Engage in the unauthorized or unlicensed practice of any profession including,\\ \\ but not limited to, financial, legal, medical/health, or related professional\\ \\ practices\\n 6. 
Collect, process, disclose, generate, or infer health, demographic,\\ \\ or other sensitive personal or private information about individuals without rights\\ \\ and consents required by applicable laws\\n 7. Engage in or facilitate any action\\ \\ or generate any content that infringes, misappropriates, or otherwise violates\\ \\ any third-party rights, including the outputs or results of any products or services\\ \\ using the Llama Materials\\n 8. Create, generate, or facilitate the creation\\ \\ of malicious code, malware, computer viruses or do anything else that could disable,\\ \\ overburden, interfere with or impair the proper working, integrity, operation\\ \\ or appearance of a website or computer system\\n2. Engage in, promote, incite,\\ \\ facilitate, or assist in the planning or development of activities that present\\ \\ a risk of death or bodily harm to individuals, including use of Llama 3.1 related\\ \\ to the following:\\n 1. Military, warfare, nuclear industries or applications,\\ \\ espionage, use for materials or activities that are subject to the International\\ \\ Traffic Arms Regulations (ITAR) maintained by the United States Department of\\ \\ State\\n 2. Guns and illegal weapons (including weapon development)\\n 3.\\ \\ Illegal drugs and regulated/controlled substances\\n 4. Operation of critical\\ \\ infrastructure, transportation technologies, or heavy machinery\\n 5. Self-harm\\ \\ or harm to others, including suicide, cutting, and eating disorders\\n 6. Any\\ \\ content intended to incite or promote violence, abuse, or any infliction of bodily\\ \\ harm to an individual\\n3. Intentionally deceive or mislead others, including use\\ \\ of Llama 3.1 related to the following:\\n 1. Generating, promoting, or furthering\\ \\ fraud or the creation or promotion of disinformation\\n 2. Generating, promoting,\\ \\ or furthering defamatory content, including the creation of defamatory statements,\\ \\ images, or other content\\n 3. Generating, promoting, or further distributing\\ \\ spam\\n 4. Impersonating another individual without consent, authorization,\\ \\ or legal right\\n 5. Representing that the use of Llama 3.1 or outputs are human-generated\\n\\ \\ 6. Generating or facilitating false online engagement, including fake reviews\\ \\ and other means of fake online engagement\\n4. Fail to appropriately disclose to\\ \\ end users any known dangers of your AI system\\nPlease report any violation of\\ \\ this Policy, software “bug,” or other problems that could lead to a violation\\ \\ of this Policy through one of the following means:\\n * Reporting issues with\\ \\ the model: \\ * Reporting risky content generated by the model:\\n developers.facebook.com/llama_output_feedback\\n\\ \\ * Reporting bugs and security concerns: facebook.com/whitehat/info\\n * Reporting\\ \\ violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com\" extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location ? By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy : checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. 
extra_gated_button_content: Submit --- ## Model Information The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including through synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card**. **Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-70B-Instruct: one for use with transformers and one for use with the original codebase. ### Use with transformers Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your transformers installation via `pip install --upgrade transformers`. See the sketch below for usage with Transformers. ### Tool use with transformers LLaMA-3.1 supports multiple tool use formats. You can see a full guide to prompt formatting here. Tool use is also supported through chat templates in Transformers: you show the model a simple tool, generate text from that input as normal, and if the model generates a tool call, you add it to the chat, then call the tool and append the result with the `tool` role, and finally call `generate()` again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling - for more information, see the LLaMA prompt format docs and the Transformers tool use documentation. ### Use with bitsandbytes The model checkpoints can be used with `bitsandbytes` and `transformers` for further memory optimisations; to load in 4-bit, simply pass `load_in_4bit=True` (see the quantization sketch below). ### Use with the original codebase Please follow the instructions in the repository.
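A minimal sketch of the conversational inference described in the transformers section above, assuming `transformers >= 4.43` and sufficient GPU memory for the 70B weights; the prompt and generation settings are illustrative:

```python
# Minimal sketch: conversational inference with the transformers pipeline.
# Assumes transformers >= 4.43 and enough GPU memory for the 70B checkpoint.
import torch
from transformers import pipeline

model_id = "meta-llama/Meta-Llama-3.1-70B-Instruct"

generator = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain grouped-query attention in one sentence."},
]

outputs = generator(messages, max_new_tokens=128)
# The pipeline returns the whole chat; the last message is the model's reply.
print(outputs[0]["generated_text"][-1])
```

And a minimal sketch of the 4-bit loading mentioned in the bitsandbytes section, using a `BitsAndBytesConfig` as a config-based variant of the `load_in_4bit=True` flag:

```python
# Minimal sketch: 4-bit loading with bitsandbytes (lower memory, some quality cost).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3.1-70B-Instruct"

quant_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```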
To download Original checkpoints, see the example command below leveraging `huggingface-cli`: ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
|:---|:---|:---|:---|:---|
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023. ## Benchmark scores In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. ### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---:|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
|:---|:---|:---:|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 46.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---|:---:|:---:|:---:|
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety As part of our responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta’s Llama models have been responsibly deployed can be found on our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology’s power, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide; refer to it to learn more. #### Llama 3.1 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload required to deploy safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.1 systems **Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment, as well as to mitigating safety and security risks inherent to the system and to any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### New capabilities Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs, and possible integrations by developers with third-party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all generative AI use cases.
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third-party services they use, so that they are aware of the safety and security limitations of this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third-party safeguards. **Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing fine-tuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common-use-case evaluations measure the safety risks of systems for the most commonly built applications, including chatbots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which dedicated benchmarks were crafted, including long context, multilingual, tool calls, coding, and memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2. Child Safety** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks, and to inform on any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development.
For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content, while taking into account market-specific nuances and experiences. **3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cyber security whitepaper to learn more. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open-sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources, including an output reporting mechanism and bug bounty program, to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual text-generation model designed for instruction-following tasks, supporting several languages including English, German, French, and more. \n\n**Features:** \n- Multilingual support (en, de, fr, it, pt, hi, es, th) \n- Text-generation capability \n- Instruction-following fine-tuning \n- Based on Meta's Llama 3.1 architecture (70B parameters) \n- Licensed under the Llama 3.1 Community License \n\n**Comparison:**" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.1-8B-Instruct.json b/model_data_json/meta-llama_Llama-3.1-8B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..2a603669588e78858babfbf130b3caebbeec829c --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.1-8B-Instruct.json @@ -0,0 +1,33 @@ +{ + "model_id": "meta-llama/Llama-3.1-8B-Instruct", + "downloads": 5237076, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "base_model:meta-llama/Llama-3.1-8B", + "base_model:finetune:meta-llama/Llama-3.1-8B", + "license:llama3.1", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th license: llama3.1 base_model: meta-llama/Meta-Llama-3.1-8B pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 extra_gated_prompt: \"### LLAMA 3.1 COMMUNITY LICENSE AGREEMENT\\nLlama 3.1 Version\\ \\ Release Date: July 23, 2024\\n\\\"Agreement\\\" means the terms and conditions for\\ \\ use, reproduction, distribution and modification of the Llama Materials set forth\\ \\ herein.\\n\\\"Documentation\\\" means the specifications, manuals and documentation\\ \\ accompanying Llama 3.1 distributed by Meta at \\\"Licensee\\\" or \\\"you\\\" means you, or your employer or any other person or entity\\ \\ (if you are entering into this Agreement on such person or entity’s behalf), of\\ \\ the age required under applicable laws, rules or regulations to provide legal\\ \\ consent and that has legal authority to bind your employer or such other person\\ \\ or entity if you are entering in this Agreement on their behalf.\\n\\\"Llama 3.1\\\"\\ \\ means the foundational large language models and software and algorithms, including\\ \\ machine-learning model code, trained model weights, inference-enabling code, training-enabling\\ \\ code, fine-tuning enabling code and other elements of the foregoing distributed\\ \\ by Meta at Materials\\\" means,\\ \\ collectively, Meta’s proprietary Llama 3.1 and Documentation (and any portion\\ \\ thereof) made available under this Agreement.\\n\\\"Meta\\\" or \\\"we\\\" means Meta Platforms\\ \\ Ireland Limited (if you are located in or, if you are an entity, your principal\\ \\ place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you\\ \\ are located outside of the EEA or Switzerland).\\n \\n1. License Rights and Redistribution.\\n\\ a. Grant of Rights. 
You are granted a non-exclusive, worldwide, non-transferable\\ \\ and royalty-free limited license under Meta’s intellectual property or other rights\\ \\ owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy,\\ \\ create derivative works of, and make modifications to the Llama Materials.\\nb.\\ \\ Redistribution and Use.\\ni. If you distribute or make available the Llama Materials\\ \\ (or any derivative works thereof), or a product or service (including another\\ \\ AI model) that contains any of them, you shall (A) provide a copy of this Agreement\\ \\ with any such Llama Materials; and (B) prominently display “Built with Llama”\\ \\ on a related website, user interface, blogpost, about page, or product documentation.\\ \\ If you use the Llama Materials or any outputs or results of the Llama Materials\\ \\ to create, train, fine tune, or otherwise improve an AI model, which is distributed\\ \\ or made available, you shall also include “Llama” at the beginning of any such\\ \\ AI model name.\\nii. If you receive Llama Materials, or any derivative works thereof,\\ \\ from a Licensee as part of an integrated end user product, then Section 2 of\\ \\ this Agreement will not apply to you.\\niii. You must retain in all copies of the\\ \\ Llama Materials that you distribute the following attribution notice within a\\ \\ “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed\\ \\ under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights\\ \\ Reserved.”\\niv. Your use of the Llama Materials must comply with applicable laws\\ \\ and regulations (including trade compliance laws and regulations) and adhere to\\ \\ the Acceptable Use Policy for the Llama Materials (available at \\ which is hereby incorporated by reference into this Agreement.\\n2. Additional\\ \\ Commercial Terms. If, on the Llama 3.1 version release date, the monthly active\\ \\ users of the products or services made available by or for Licensee, or Licensee’s\\ \\ affiliates, is greater than 700 million monthly active users in the preceding\\ \\ calendar month, you must request a license from Meta, which Meta may grant to\\ \\ you in its sole discretion, and you are not authorized to exercise any of the\\ \\ rights under this Agreement unless or until Meta otherwise expressly grants you\\ \\ such rights.\\n3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE\\ \\ LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS”\\ \\ BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY\\ \\ KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES\\ \\ OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.\\ \\ YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING\\ \\ THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA\\ \\ MATERIALS AND ANY OUTPUT AND RESULTS.\\n4. Limitation of Liability. IN NO EVENT\\ \\ WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN\\ \\ CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS\\ \\ AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL,\\ \\ EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED\\ \\ OF THE POSSIBILITY OF ANY OF THE FOREGOING.\\n5. Intellectual Property.\\na. 
No\\ \\ trademark licenses are granted under this Agreement, and in connection with the\\ \\ Llama Materials, neither Meta nor Licensee may use any name or mark owned by or\\ \\ associated with the other or any of its affiliates, except as required for reasonable\\ \\ and customary use in describing and redistributing the Llama Materials or as set\\ \\ forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the\\ \\ “Mark”) solely as required to comply with the last sentence of Section 1.b.i.\\ \\ You will comply with Meta’s brand guidelines (currently accessible at \\ ). All goodwill arising out of your use of the Mark will inure to the benefit\\ \\ of Meta.\\nb. Subject to Meta’s ownership of Llama Materials and derivatives made\\ \\ by or for Meta, with respect to any derivative works and modifications of the\\ \\ Llama Materials that are made by you, as between you and Meta, you are and will\\ \\ be the owner of such derivative works and modifications.\\nc. If you institute\\ \\ litigation or other proceedings against Meta or any entity (including a cross-claim\\ \\ or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs\\ \\ or results, or any portion of any of the foregoing, constitutes infringement of\\ \\ intellectual property or other rights owned or licensable by you, then any licenses\\ \\ granted to you under this Agreement shall terminate as of the date such litigation\\ \\ or claim is filed or instituted. You will indemnify and hold harmless Meta from\\ \\ and against any claim by any third party arising out of or related to your use\\ \\ or distribution of the Llama Materials.\\n6. Term and Termination. The term of\\ \\ this Agreement will commence upon your acceptance of this Agreement or access\\ \\ to the Llama Materials and will continue in full force and effect until terminated\\ \\ in accordance with the terms and conditions herein. Meta may terminate this Agreement\\ \\ if you are in breach of any term or condition of this Agreement. Upon termination\\ \\ of this Agreement, you shall delete and cease use of the Llama Materials. Sections\\ \\ 3, 4 and 7 shall survive the termination of this Agreement.\\n7. Governing Law\\ \\ and Jurisdiction. This Agreement will be governed and construed under the laws\\ \\ of the State of California without regard to choice of law principles, and the\\ \\ UN Convention on Contracts for the International Sale of Goods does not apply\\ \\ to this Agreement. The courts of California shall have exclusive jurisdiction\\ \\ of any dispute arising out of this Agreement.\\n### Llama 3.1 Acceptable Use Policy\\n\\ Meta is committed to promoting safe and fair use of its tools and features, including\\ \\ Llama 3.1. If you access or use Llama 3.1, you agree to this Acceptable Use Policy\\ \\ (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses\\nWe want everyone to use Llama 3.1 safely and responsibly.\\ \\ You agree you will not use, or allow others to use, Llama 3.1 to:\\n 1. Violate\\ \\ the law or others’ rights, including to:\\n 1. Engage in, promote, generate,\\ \\ contribute to, encourage, plan, incite, or further illegal or unlawful activity\\ \\ or content, such as:\\n 1. Violence or terrorism\\n 2. Exploitation\\ \\ or harm to children, including the solicitation, creation, acquisition, or dissemination\\ \\ of child exploitative content or failure to report Child Sexual Abuse Material\\n\\ \\ 3. Human trafficking, exploitation, and sexual violence\\n 4. 
The\\ \\ illegal distribution of information or materials to minors, including obscene\\ \\ materials, or failure to employ legally required age-gating in connection with\\ \\ such information or materials.\\n 5. Sexual solicitation\\n 6. Any\\ \\ other criminal activity\\n 3. Engage in, promote, incite, or facilitate the\\ \\ harassment, abuse, threatening, or bullying of individuals or groups of individuals\\n\\ \\ 4. Engage in, promote, incite, or facilitate discrimination or other unlawful\\ \\ or harmful conduct in the provision of employment, employment benefits, credit,\\ \\ housing, other economic benefits, or other essential goods and services\\n 5.\\ \\ Engage in the unauthorized or unlicensed practice of any profession including,\\ \\ but not limited to, financial, legal, medical/health, or related professional\\ \\ practices\\n 6. Collect, process, disclose, generate, or infer health, demographic,\\ \\ or other sensitive personal or private information about individuals without rights\\ \\ and consents required by applicable laws\\n 7. Engage in or facilitate any action\\ \\ or generate any content that infringes, misappropriates, or otherwise violates\\ \\ any third-party rights, including the outputs or results of any products or services\\ \\ using the Llama Materials\\n 8. Create, generate, or facilitate the creation\\ \\ of malicious code, malware, computer viruses or do anything else that could disable,\\ \\ overburden, interfere with or impair the proper working, integrity, operation\\ \\ or appearance of a website or computer system\\n2. Engage in, promote, incite,\\ \\ facilitate, or assist in the planning or development of activities that present\\ \\ a risk of death or bodily harm to individuals, including use of Llama 3.1 related\\ \\ to the following:\\n 1. Military, warfare, nuclear industries or applications,\\ \\ espionage, use for materials or activities that are subject to the International\\ \\ Traffic Arms Regulations (ITAR) maintained by the United States Department of\\ \\ State\\n 2. Guns and illegal weapons (including weapon development)\\n 3.\\ \\ Illegal drugs and regulated/controlled substances\\n 4. Operation of critical\\ \\ infrastructure, transportation technologies, or heavy machinery\\n 5. Self-harm\\ \\ or harm to others, including suicide, cutting, and eating disorders\\n 6. Any\\ \\ content intended to incite or promote violence, abuse, or any infliction of bodily\\ \\ harm to an individual\\n3. Intentionally deceive or mislead others, including use\\ \\ of Llama 3.1 related to the following:\\n 1. Generating, promoting, or furthering\\ \\ fraud or the creation or promotion of disinformation\\n 2. Generating, promoting,\\ \\ or furthering defamatory content, including the creation of defamatory statements,\\ \\ images, or other content\\n 3. Generating, promoting, or further distributing\\ \\ spam\\n 4. Impersonating another individual without consent, authorization,\\ \\ or legal right\\n 5. Representing that the use of Llama 3.1 or outputs are human-generated\\n\\ \\ 6. Generating or facilitating false online engagement, including fake reviews\\ \\ and other means of fake online engagement\\n4. 
Fail to appropriately disclose to\\ \\ end users any known dangers of your AI system\\nPlease report any violation of\\ \\ this Policy, software “bug,” or other problems that could lead to a violation\\ \\ of this Policy through one of the following means:\\n * Reporting issues with\\ \\ the model: \\ * Reporting risky content generated by the model:\\n developers.facebook.com/llama_output_feedback\\n\\ \\ * Reporting bugs and security concerns: facebook.com/whitehat/info\\n * Reporting\\ \\ violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com\" extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location ? By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy : checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including through synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card**. **Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-8B-Instruct: one for use with transformers and one for use with the original codebase. ### Use with transformers Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your transformers installation via `pip install --upgrade transformers`. Note: you can also find detailed recipes on how to use the model locally, with assisted generation, quantisation, and more, in the accompanying recipes repository. ### Tool use with transformers LLaMA-3.1 supports multiple tool use formats. You can see a full guide to prompt formatting here. Tool use is also supported through chat templates in Transformers (see the sketch below): you show the model a simple tool and generate text from that input as normal. If the model generates a tool call, you add it to the chat, then call the tool and append the result with the `tool` role, and finally call `generate()` again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling - for more information, see the LLaMA prompt format docs and the Transformers tool use documentation.
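A minimal sketch of the tool-calling flow described above, assuming `transformers >= 4.43`; the `get_current_temperature` function, its docstring schema, and the hard-coded result are illustrative stand-ins:

```python
# Minimal sketch of the chat-template tool-calling flow described above.
# The tool, its docstring, and the returned value are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

def get_current_temperature(location: str) -> float:
    """Get the current temperature at a location.

    Args:
        location: The location to get the temperature for.
    """
    return 22.0  # dummy value for the sketch

messages = [{"role": "user", "content": "What is the temperature in Paris?"}]

# 1) Render the prompt (token IDs) with the tool schema included.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature], add_generation_prompt=True
)

# 2) After generation, if the model emitted a tool call, append it to the chat...
messages.append({
    "role": "assistant",
    "tool_calls": [{
        "type": "function",
        "function": {"name": "get_current_temperature",
                     "arguments": {"location": "Paris"}},
    }],
})

# 3) ...then call the tool yourself and append the result with the "tool" role.
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})

# 4) Re-render and generate again so the model can use the tool result.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature], add_generation_prompt=True
)
```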
### Use with the original codebase Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging `huggingface-cli`: ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
|:---|:---|:---|:---|:---|
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023. ## Benchmark scores In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. ### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---:|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
|:---|:---|:---:|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 46.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---|:---:|:---:|:---:|
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety As part of our responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta’s Llama models have been responsibly deployed can be found on our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology’s power, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide; refer to it to learn more. #### Llama 3.1 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload required to deploy safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.1 systems **Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment, as well as to mitigating safety and security risks inherent to the system and to any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### New capabilities Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs, and possible integrations by developers with third-party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all generative AI use cases.
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third-party services they use, so that they are aware of the safety and security limitations of this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third-party safeguards. **Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing fine-tuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common-use-case evaluations measure the safety risks of systems for the most commonly built applications, including chatbots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which dedicated benchmarks were crafted, including long context, multilingual, tool calls, coding, and memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2. Child Safety** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks, and to inform on any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development.
For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content, while taking into account market-specific nuances and experiences. **3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cyber security whitepaper to learn more. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open-sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources, including an output reporting mechanism and bug bounty program, to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual text-generation model based on Meta's Llama 3.1 architecture, designed for instruction-following tasks across several languages. \n\n**Features:** \n- Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, Thai \n- License: Llama 3.1 Community License \n- Base Model: `meta-llama/Meta-Llama-3.1-8B` \n- Task: Text generation \n- Framework: PyTorch" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.1-8B.json b/model_data_json/meta-llama_Llama-3.1-8B.json new file mode 100644 index 0000000000000000000000000000000000000000..a62e84245a2aa284150006d8d2e1064a7b064d26 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.1-8B.json @@ -0,0 +1,30 @@ +{ + "model_id": "meta-llama/Llama-3.1-8B", + "downloads": 1087264, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "license:llama3.1", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.1 extra_gated_prompt: >- ### LLAMA 3.1 COMMUNITY LICENSE AGREEMENT Llama 3.1 Version Release Date: July 23, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 3.1 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 3.1\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Llama 3.1 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. 
If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). 
All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.1 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.1. If you access or use Llama 3.1, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.1 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.1 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 3. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 4. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 5. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 6. 
Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 7. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 8. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.1 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.1 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 3.1 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. 
extra_gated_button_content: Submit library_name: transformers ---

## Model Information

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

**Model developer**: Meta

**Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
| :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | | |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

**Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date:** July 23, 2024.

**Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License:** A custom commercial license, the Llama 3.1 Community License, is available at:

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned, text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

**Note**: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner.

## How to use

This repository contains two versions of Meta's Llama-3.1-8B, for use with transformers and with the original codebase.

### Use with transformers

Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers. A minimal sketch follows at the end of this section.

### Use with

Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging :
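As a minimal sketch of the transformers pipeline path mentioned above (this is not the omitted checkpoint-download command; it assumes a GPU with sufficient memory and an authenticated Hugging Face login with access to the gated repo):

```python
# Minimal sketch: text completion with the base (non-instruct) model via transformers.
# Assumes transformers >= 4.43.0 and access to the gated meta-llama repo.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B",
    torch_dtype=torch.bfloat16,  # halves memory relative to fp32 on supported hardware
    device_map="auto",           # spreads weights across available devices
)
print(pipe("The key to life is", max_new_tokens=50)[0]["generated_text"])
```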
## Hardware and Software

**Training Factors** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.

**Training utilized a cumulative of 39.3M GPU hours** of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :---- | :---: | :---: | :---: | :---: |
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
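As a rough, illustrative cross-check of these figures (this is not Meta's exact methodology, which is linked below and also folds in power usage effectiveness), energy scales as GPU hours times peak device power:

```python
# Illustrative only: back-of-envelope training energy from the table above.
gpu_hours = {"8B": 1.46e6, "70B": 7.0e6, "405B": 30.84e6}
tdp_kw = 0.700  # peak power per H100-80GB device (700 W)

for name, hours in gpu_hours.items():
    energy_mwh = hours * tdp_kw / 1000  # kWh -> MWh
    print(f"Llama 3.1 {name}: ~{energy_mwh:,.0f} MWh before PUE adjustment")
```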
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmark scores

In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library.

### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
| :---- | :---- | :---: | :---- | :---: | :---: | :---: | :---: | :---: |
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
| :---- | :---- | :---: | :---- | :---: | :---: | :---: | :---: | :---: |
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 46.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
| :---- | :---- | :---- | :---: | :---: | :---: |
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety

As part of our Responsible release approach, we followed a three-pronged strategy for managing trust & safety risks:

* Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama.
* Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm.
* Provide protections for the community to help prevent the misuse of our models.

### Responsible deployment

Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta’s Llama models have been responsibly deployed can be found on our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology's power, by aligning our model safety for the generic use cases addressing a standard set of harms. Developers are then in the driver's seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide; refer to the Responsible Use Guide to learn more.

#### Llama 3.1 instruct

Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the developer workload of deploying safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper.

**Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control.

**Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines.

#### Llama 3.1 systems

**Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as to mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. A sketch of one such input/output filtering pattern follows.
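One possible shape of that pattern is a minimal moderation wrapper around Llama Guard 3 via transformers; the repo id and the assumption that the model's chat template renders the moderation prompt and yields a safe/unsafe verdict should be verified against the Llama Guard 3 model card:

```python
# Minimal sketch of input/output filtering with Llama Guard 3 (assumed usage;
# check the Llama Guard 3 model card for the authoritative snippet).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Llama-Guard-3-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(
    guard_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    # The chat template is expected to render the moderation prompt for the conversation.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)
    out = guard.generate(input_ids=input_ids, max_new_tokens=32)
    return tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True)

verdict = moderate([{"role": "user", "content": "How do I tie a bowline knot?"}])
print(verdict)  # e.g. "safe", or "unsafe" followed by hazard categories
```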
#### New capabilities

Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs, and possible integrations by developers with third-party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all Generative AI use cases.

**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third-party services they use, so that they are aware of the safety and security limitations of this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third-party safeguards.

**Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing fine-tuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide.

### Evaluations

We evaluated Llama models for common use cases as well as specific capabilities. Common use case evaluations measure the safety risks of systems for the most commonly built applications, including chatbots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. Prompt Guard and Code Shield are also available if relevant to the application.

Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which we crafted dedicated benchmarks, including long context, multilingual use, tool calls, coding, and memorization.

**Red teaming**: For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets.

### Critical and other risks

We specifically focused our efforts on mitigating the following critical risk areas:

**1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness**

To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons.

**2. Child Safety**

Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development.
For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances and experiences.

**3. Cyber attack enablement**

Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed.

Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention.

Our study of Llama-3.1-405B’s social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear phishing campaigns. Please read our Llama 3.1 Cybersecurity whitepaper to learn more.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI, and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open-sourced for the community to use and are widely distributed across ecosystem partners, including cloud service providers. We encourage community contributions to our GitHub repository.

We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate, and open innovation. The 20 finalists from the hundreds of applications can be found here.

Finally, we put in place a set of resources, including an output reporting mechanism and bug bounty program, to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

The core values of Llama 3.1 are openness, inclusivity, and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences, and perspectives. Llama 3.1 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress.

But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "\"meta-llama_Llama-3.1-8B is a multilingual (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) text-generation model released under the Llama 3.1 Community License, designed for foundational language tasks with commercial use restrictions for large-scale deployments.\"\n\n**Features**: \n- Multilingual text generation (8 languages) \n- Released under Llama 3.1 Community License \n- Commercial use capped at 700M monthly active" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.2-11B-Vision-Instruct.json b/model_data_json/meta-llama_Llama-3.2-11B-Vision-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..19eaf13d137656211497217b1c71ab8409417687 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.2-11B-Vision-Instruct.json @@ -0,0 +1,31 @@ +{ + "model_id": "meta-llama/Llama-3.2-11B-Vision-Instruct", + "downloads": 706191, + "tags": [ + "transformers", + "safetensors", + "mllama", + "image-text-to-text", + "facebook", + "meta", + "pytorch", + "llama", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "license:llama3.2", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: image-text-to-text tags: - facebook - meta - pytorch - llama - llama-3 widget: - example_title: rococo art messages: - role: user content: - type: text text: Which era does this piece belong to? Give details about the era. - type: image_url image_url: url: - role: assistant content: | The piece is a painting of a woman in a red dress, surrounded by gold and white ornate details. The woman is depicted in mid-air, with her arms outstretched and her legs bent at the knees. She is holding a bouquet of flowers in her right hand and a fruit in her left hand. The background of the painting is a light blue sky with pink clouds, and there are also some pink flowers and green leaves surrounding the woman. The overall atmosphere of the painting is one of joy and celebration, as if the woman is dancing or celebrating something. This piece belongs to the Rococo era, which was a style of art and architecture that emerged in Europe in the 18th century. The Rococo style is characterized by its use of pastel colors, curved lines, and ornate details. It was popularized during the reign of King Louis XV of France, who ruled from 1715 to 1774. - example_title: diagram understanding messages: - role: user content: - type: text text: Where do the droughts happen according to this diagram? - type: image_url image_url: url: - role: assistant content: | # Drought Regions in 2016 According to the diagram, the regions that experienced severe droughts in 2016 are: - **Eastern and Southern Africa**: This region was severely affected by droughts, with significant impacts on agriculture and water resources. - **Western and Southern Africa**: Droughts also occurred in this region, leading to food insecurity and other challenges. 
These regions were particularly vulnerable to droughts in 2016, highlighting the need for effective water management strategies and climate resilience measures. - example_title: document qa messages: - role: user content: - type: text text: How long does it take from invoice date to due date? Be short and concise. - type: image_url image_url: url: - role: assistant content: | To determine the time gap between the invoice date and the due date, we need to calculate the difference between these two dates. Invoice Date: February 11, 2019 Due Date: February 26, 2019 Step 1: Calculate the Difference in Days February 26 - February 11 = 15 days Therefore, the time gap between the invoice date and the due date is 15 days. license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. “Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. 
You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. 
You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 3. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 5. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 7. 
Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 9. Guns and illegal weapons (including weapon development) 10. Illegal drugs and regulated/controlled substances 11. Operation of critical infrastructure, transportation technologies, or heavy machinery 12. Self-harm or harm to others, including suicide, cutting, and eating disorders 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 16. Generating, promoting, or further distributing spam 17. Impersonating another individual without consent, authorization, or legal right 18. Representing that the use of Llama 3.2 or outputs are human-generated 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. 
Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit extra_gated_eu_disallowed: true ---

## Model Information

The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks.

**Model Developer**: Meta

**Model Architecture:** Llama 3.2-Vision is built on top of the Llama 3.1 text-only model, which is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. To support image recognition tasks, the Llama 3.2-Vision model uses a separately trained vision adapter that integrates with the pre-trained Llama 3.1 language model. The adapter consists of a series of cross-attention layers that feed image encoder representations into the core LLM (see the conceptual sketch after the table below).

| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Data volume | Knowledge cutoff |
| :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
| Llama 3.2-Vision | (Image, text) pairs | 11B (10.6) | Text + Image | Text | 128k | Yes | 6B (image, text) pairs | December 2023 |
| Llama 3.2-Vision | (Image, text) pairs | 90B (88.8) | Text + Image | Text | 128k | Yes | 6B (image, text) pairs | December 2023 |

**Supported Languages:** For text-only tasks, English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Note that for image+text applications, English is the only supported language. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly.

**Llama 3.2 Model Family:** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability.
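As a conceptual sketch of the cross-attention adapter described under Model Architecture above (illustration only, not Meta's implementation; the layer sizes and tanh-gating scheme here are assumptions): text hidden states attend to projected image-encoder features through a gated residual cross-attention block.

```python
import torch
import torch.nn as nn

# Conceptual sketch only (not Meta's implementation): a cross-attention block
# in which LM hidden states attend to image-encoder features.
class CrossAttentionAdapterLayer(nn.Module):
    def __init__(self, d_model: int = 4096, n_heads: int = 32):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.xattn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))  # zero-init: layer starts as identity

    def forward(self, text_states: torch.Tensor, image_states: torch.Tensor) -> torch.Tensor:
        # text_states: (batch, text_len, d_model); image_states: (batch, img_len, d_model)
        attended, _ = self.xattn(self.norm(text_states), image_states, image_states)
        return text_states + torch.tanh(self.gate) * attended  # gated residual injection

# e.g. fused = CrossAttentionAdapterLayer()(text_hidden, projected_image_features)
```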
**Model Release Date:** Sept 25, 2024

**Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety.

**License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement).

**Feedback:** Instructions on how to send questions or comments about the model and how to provide feedback can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.2-Vision in applications, please go here.

## Intended Use

**Intended Use Cases:** Llama 3.2-Vision is intended for commercial and research use. Instruction-tuned models are intended for visual recognition, image reasoning, captioning, and assistant-like chat with images, whereas pretrained models can be adapted for a variety of image reasoning tasks. Additionally, because of Llama 3.2-Vision’s ability to take images and text as inputs, additional use cases could include:

1. Visual Question Answering (VQA) and Visual Reasoning: Imagine a machine that looks at a picture and understands your questions about it.
2. Document Visual Question Answering (DocVQA): Imagine a computer understanding both the text and layout of a document, like a map or contract, and then answering questions about it directly from the image.
3. Image Captioning: Image captioning bridges the gap between vision and language, extracting details, understanding the scene, and then crafting a sentence or two that tells the story.
4. Image-Text Retrieval: Image-text retrieval is like a matchmaker for images and their descriptions. Similar to a search engine, but one that understands both pictures and words.
5. Visual Grounding: Visual grounding is like connecting the dots between what we see and say. It’s about understanding how language references specific parts of an image, allowing AI models to pinpoint objects or regions based on natural language descriptions.

The Llama 3.2 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.2 Community License allows for these use cases.

**Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

## How to use

This repository contains two versions of Llama-3.2-11B-Vision-Instruct, for use with transformers and with the original codebase.

### Use with transformers

Starting with transformers >= 4.45.0, you can run inference using conversational messages that may include an image you can query about. Make sure to update your transformers installation via pip install --upgrade transformers. A minimal sketch follows at the end of this section.

### Use with

Please follow the instructions in the repository. To download the original checkpoints, you can use as follows:
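As the minimal transformers sketch referenced above (this is not the omitted checkpoint-download command; the image URL is a placeholder, and the model class and processor usage follow the common transformers pattern for this model family):

```python
# Minimal sketch: image+text chat with the vision-instruct model via transformers >= 4.45.0.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/sample.jpg", stream=True).raw)  # placeholder URL
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
print(processor.decode(model.generate(**inputs, max_new_tokens=60)[0]))
```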
## Hardware and Software

**Training Factors:** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.

**Training Energy Use:** Training utilized a cumulative of **2.02M** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **584** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.

| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :---- | :---: | :---: | :---: | :---: |
| Llama 3.2-vision 11B | Stage 1 pretraining: 147K H100 hours; Stage 2 annealing: 98K H100 hours; SFT: 896 H100 hours; RLHF: 224 H100 hours | 700 | 71 | 0 |
| Llama 3.2-vision 90B | Stage 1 pretraining: 885K H100 hours; Stage 2 annealing: 885K H100 hours; SFT: 3072 H100 hours; RLHF: 2048 H100 hours | 700 | 513 | 0 |
| Total | 2.02M | | 584 | 0 |

The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.2-Vision was pretrained on 6B image and text pairs. The instruction tuning data includes publicly available vision instruction datasets, as well as over 3M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmarks - Image Reasoning

In this section, we report the results for Llama 3.2-Vision models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library.
## Training Data **Overview:** Llama 3.2-Vision was pretrained on 6B image and text pairs. The instruction tuning data includes publicly available vision instruction datasets, as well as over 3M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023\\. ## Benchmarks \\- Image Reasoning In this section, we report the results for Llama 3.2-Vision models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library. ### Base Pretrained Models | Category | Benchmark | \\# Shots | Metric | Llama 3.2 11B | Llama 3.2 90B | | ----- | ----- | ----- | ----- | ----- | ----- | | Image Understanding | VQAv2 (val) | 0 | Accuracy | 66.8 | 73.6 | | | Text VQA (val) | 0 | Relaxed accuracy | 73.1 | 73.5 | | | DocVQA (val, unseen) | 0 | ANLS | 62.3 | 70.7 | | Visual Reasoning | MMMU (val, 0-shot) | 0 | Micro average accuracy | 41.7 | 49.3 | | | ChartQA (test) | 0 | Accuracy | 39.4 | 54.2 | | | InfographicsQA (val, unseen) | 0 | ANLS | 43.2 | 56.8 | | | AI2 Diagram (test) | 0 | Accuracy | 62.4 | 75.3 | ### Instruction Tuned Models | Modality | Capability | Benchmark | \\# Shots | Metric | Llama 3.2 11B | Llama 3.2 90B | | ----- | :---: | ----- | :---: | :---: | ----- | ----- | | Image | College-level Problems and Mathematical Reasoning | MMMU (val, CoT) | 0 | Micro average accuracy | 50.7 | 60.3 | | | | MMMU-Pro, Standard (10 opts, test) | 0 | Accuracy | 33.0 | 45.2 | | | | MMMU-Pro, Vision (test) | 0 | Accuracy | 23.7 | 33.8 | | | | MathVista (testmini) | 0 | Accuracy | 51.5 | 57.3 | | | Charts and Diagram Understanding | ChartQA (test, CoT) | 0 | Relaxed accuracy | 83.4 | 85.5 | | | | AI2 Diagram (test) | 0 | Accuracy | 91.1 | 92.3 | | | | DocVQA (test) | 0 | ANLS | 88.4 | 90.1 | | | General Visual Question Answering | VQAv2 (test) | 0 | Accuracy | 75.2 | 78.1 | | Text | General | MMLU (CoT) | 0 | Macro\\_avg/acc | 73.0 | 86.0 | | | Math | MATH (CoT) | 0 | Final\\_em | 51.9 | 68.0 | | | Reasoning | GPQA | 0 | Accuracy | 32.8 | 46.7 | | | Multilingual | MGSM (CoT) | 0 | em | 68.9 | 86.9 | ## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: 1. Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. 2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. 3. Provide protections for the community to help prevent the misuse of our models. ### Responsible Deployment **Approach:** Llama is a foundational technology designed to be used in a variety of use cases. Examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the power of the technology, by aligning our model safety for the generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policy and deploying the models with the necessary safeguards in their Llama systems. Llama 3.2 was developed following the best practices outlined in our Responsible Use Guide, which you can refer to in order to learn more. #### Llama 3.2 Instruct **Objective:** Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems. We implemented the same set of safety mitigations as in Llama 3, and you can learn more about these in the Llama 3 paper. **Fine-Tuning Data:** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks.
We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone:** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.2 Systems **Safety as a System:** Large language models, including Llama 3.2, **are not designed to be deployed in isolation** but instead should be deployed as part of an overall AI system with additional safety guardrails as required. Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. ### New Capabilities and Use Cases **Technological Advancement:** Llama releases usually introduce new capabilities that require specific considerations in addition to the best practices that generally apply across all Generative AI use cases. For prior release capabilities also supported by Llama 3.2, see the Llama 3.1 Model Card, as the same considerations apply here as well. **Image Reasoning:** Llama 3.2-Vision models come with multimodal (text and image) input capabilities enabling image reasoning applications. As part of our responsible release process, we took dedicated measures, including evaluations and mitigations, to address the risk of the models uniquely identifying individuals in images. As with other LLM risks, models may not always be robust to adversarial prompts, and developers should evaluate identification and other applicable risks in the context of their applications, as well as consider deploying Llama Guard 3-11B-Vision as part of their system or other mitigations as appropriate to detect and mitigate such risks. ### Evaluations **Scaled Evaluations:** We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Purple Llama safeguards to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. **Red teaming:** We conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity.
The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks In addition to our safety work above, we took extra care in measuring and/or mitigating the following critical risk areas: **1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive Weapons):** For Llama 3.1, to assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. For Llama 3.2-Vision models, we conducted additional targeted evaluations and found it unlikely that Llama 3.2 presents an increase in scientific capabilities due to its added image understanding capability, as compared to Llama 3.1. **2\\. Child Safety:** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking into account market-specific nuances or experiences. **3\\. Cyber Attacks:** For Llama 3.1 405B, our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Because Llama 3.2’s vision capabilities are not generally germane to cyber uplift, we believe that the testing conducted for Llama 3.1 also applies to Llama 3.2.
The 20 finalists from the hundreds of applications can be found here. **Reporting:** Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations **Values:** The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. **Testing:** Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.2 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "Generates text responses based on both image inputs and text prompts, supporting multilingual tasks like art analysis, diagram interpretation, and document QA.\n\nFeatures: \n- Multilingual support (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) \n- Image-text-to-text capability \n- Handles tasks like era identification, diagram understanding, and document queries \n- Built on Meta's Llama-3.2 architecture (11B parameters) \n\nComparison: \nUnlike text-only LLMs" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.2-1B-Instruct.json b/model_data_json/meta-llama_Llama-3.2-1B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..dc03e21f6b053c37a81392aca913e68ef2a96d03 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.2-1B-Instruct.json @@ -0,0 +1,32 @@ +{ + "model_id": "meta-llama/Llama-3.2-1B-Instruct", + "downloads": 2482128, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "arxiv:2405.16406", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein.
“Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. 
If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. 
Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 3. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 5. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 7. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. 
Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 9. Guns and illegal weapons (including weapon development) 10. Illegal drugs and regulated/controlled substances 11. Operation of critical infrastructure, transportation technologies, or heavy machinery 12. Self-harm or harm to others, including suicide, cutting, and eating disorders 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 16. Generating, promoting, or further distributing spam 17. Impersonating another individual without consent, authorization, or legal right 18. Representing that the use of Llama 3.2 or outputs are human-generated 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. 
**Model Developer:** Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. | | Training Data | Params | Input modalities | Output modalities | Context Length | GQA | Shared Embeddings | Token count | Knowledge cutoff | | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | | Llama 3.2 (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 128k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | | Llama 3.2 Quantized (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 8k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | **Supported Languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 Model Family:** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. **License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Feedback:** Instructions on how to provide feedback or comments on the model can be found in the Llama Models README. For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go here. ## Intended Use **Intended Use Cases:** Llama 3.2 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat and agentic applications like knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting. Pretrained models can be adapted for a variety of additional natural language generation tasks. Similarly, quantized models can be adapted for a variety of on-device use-cases with limited compute resources. **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card. ## How to use This repository contains two versions of Llama-3.2-1B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with a recent transformers release, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via `pip install --upgrade transformers`. A hedged example is sketched below.
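As a minimal sketch of pipeline-based conversational inference (the system prompt, dtype, and generation settings are illustrative defaults, not prescriptions from the original card):

```python
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-1B-Instruct"

# Build a text-generation pipeline; bfloat16 and device_map="auto" are
# reasonable defaults on recent GPUs but can be adjusted.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style messages; the pipeline applies the model's chat template.
messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last message is the assistant reply
```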
Note: You can also find detailed recipes on how to use the model locally, with assisted generation, quantisation, and more in the linked recipes. ### Use with the original codebase Please follow the instructions in the repository; they include an example command for downloading the original checkpoints. ## Hardware and Software **Training Factors:** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use:** Training utilized a cumulative **916k** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **240** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. | | Training Time (GPU hours) | Logit Generation Time (GPU Hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | ----- | :---: | :---: | :---: | | Llama 3.2 1B | 370k | \\- | 700 | 107 | 0 | | Llama 3.2 3B | 460k | \\- | 700 | 133 | 0 | | Llama 3.2 1B SpinQuant | 1.7 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 3B SpinQuant | 2.4 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 1B QLora | 1.3k | 0 | 700 | 0.381 | 0 | | Llama 3.2 3B QLora | 1.6k | 0 | 700 | 0.461 | 0 | | Total | 833k | 86k | | 240 | 0 | \\*\\* The location-based CO2e emissions of Llama 3.2 1B SpinQuant and Llama 3.2 3B SpinQuant are less than 0.001 metric tonnes each. This is due to the minimal training GPU hours required. The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.2 was pretrained on up to 9 trillion tokens of data from publicly available sources. For the 1B and 3B Llama 3.2 models, we incorporated logits from the Llama 3.1 8B and 70B models into the pretraining stage of the model development, where outputs (logits) from these larger models were used as token-level targets; a minimal sketch of this kind of distillation objective follows below. Knowledge distillation was used after pruning to recover performance. In post-training we used a similar recipe as Llama 3.1 and produced final chat models by doing several rounds of alignment on top of the pre-trained model. Each round involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO). **Data Freshness:** The pretraining data has a cutoff of December 2023\\.
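As a minimal sketch of token-level logit distillation of the kind described above (the temperature, mixing weight, and exact loss form are illustrative assumptions, not Meta's published recipe):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against teacher logits with the usual
    cross-entropy on ground-truth next tokens.

    student_logits, teacher_logits: (batch, seq_len, vocab)
    labels: (batch, seq_len) token ids
    T: softening temperature; alpha: weight on the distillation term.
    """
    # Soft targets from the (frozen) teacher, softened by temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    # KL term, scaled by T^2 to keep gradient magnitudes comparable.
    kd = F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)
    # Standard next-token cross-entropy on hard labels.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * kd + (1.0 - alpha) * ce
```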
## Quantization ### Quantization Scheme We designed the current quantization scheme with PyTorch’s ExecuTorch inference framework and Arm CPU backend in mind, taking into account metrics including model quality, prefill/decoding speed, and memory footprint. Our quantization scheme involves three parts: - All linear layers in all transformer blocks are quantized to a 4-bit groupwise scheme (with a group size of 32) for weights and 8-bit per-token dynamic quantization for activations. - The classification layer is quantized to 8-bit per-channel for weights and 8-bit per-token dynamic quantization for activations. - Similarly to the classification layer, 8-bit per-channel quantization is used for the embedding layer. ### Quantization-Aware Training and LoRA The quantization-aware training (QAT) with low-rank adaptation (LoRA) models went through only post-training stages, using the same data as the full-precision models. To initialize QAT, we utilize BF16 Llama 3.2 model checkpoints obtained after supervised fine-tuning (SFT) and perform an additional full round of SFT training with QAT. We then freeze the backbone of the QAT model and perform another round of SFT with LoRA adaptors applied to all layers within the transformer block. Meanwhile, the LoRA adaptors' weights and activations are maintained in BF16. Because our approach is similar to QLoRA of Dettmers et al. (2023) (i.e., quantization followed by LoRA adapters), we refer to this method as QLoRA. Finally, we fine-tune the resulting model (both backbone and LoRA adaptors) using direct preference optimization (DPO). ### SpinQuant SpinQuant was applied together with generative post-training quantization (GPTQ). For the SpinQuant rotation matrix fine-tuning, we optimized for 100 iterations, using 800 samples with sequence length 2048 from the WikiText 2 dataset. For GPTQ, we used 128 samples from the same dataset with the same sequence length. A minimal sketch of the groupwise weight-quantization arithmetic follows below.
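As a minimal sketch of the 4-bit groupwise weight quantization described above (group size 32, symmetric scales; this is illustrative fake-quantization in plain PyTorch, not the ExecuTorch kernel, and the helper name is hypothetical):

```python
import torch

def fake_quantize_4bit_groupwise(weight: torch.Tensor, group_size: int = 32):
    """Quantize-dequantize a 2D weight to int4, with one scale per group
    of `group_size` consecutive input-channel elements per output row."""
    out_features, in_features = weight.shape
    assert in_features % group_size == 0
    w = weight.reshape(out_features, in_features // group_size, group_size)
    # Symmetric scale per group: map the max magnitude to the int4 limit 7.
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(w / scale), -8, 7)  # int4 range [-8, 7]
    return (q * scale).reshape(out_features, in_features)

# Example: quantization error on a random linear weight.
w = torch.randn(128, 256)
w_q = fake_quantize_4bit_groupwise(w)
print((w - w_q).abs().mean())  # small mean absolute error
```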
## Benchmarks \\- English Text In this section, we report the results for Llama 3.2 models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library. ### Base Pretrained Models | Category | Benchmark | \\# Shots | Metric | Llama 3.2 1B | Llama 3.2 3B | Llama 3.1 8B | | ----- | ----- | :---: | :---: | :---: | :---: | :---: | | General | MMLU | 5 | macro\\_avg/acc\\_char | 32.2 | 58 | 66.7 | | | AGIEval English | 3-5 | average/acc\\_char | 23.3 | 39.2 | 47.8 | | | ARC-Challenge | 25 | acc\\_char | 32.8 | 69.1 | 79.7 | | Reading comprehension | SQuAD | 1 | em | 49.2 | 67.7 | 77 | | | QuAC (F1) | 1 | f1 | 37.9 | 42.9 | 44.9 | | | DROP (F1) | 3 | f1 | 28.0 | 45.2 | 59.5 | | Long Context | Needle in Haystack | 0 | em | 96.8 | 1 | 1 | ### Instruction Tuned Models | Capability | | Benchmark | \\# Shots | Metric | Llama 3.2 1B bf16 | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B bf16 | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | ----- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | | MMLU | 5 | macro\\_avg/acc | 49.3 | 43.3 | 47.3 | 49.0 | 63.4 | 60.5 | 62 | 62.4 | 69.4 | | Re-writing | | Open-rewrite eval | 0 | micro\\_avg/rougeL | 41.6 | 39.2 | 40.9 | 41.2 | 40.1 | 40.3 | 40.8 | 40.7 | 40.9 | | Summarization | | TLDR9+ (test) | 1 | rougeL | 16.8 | 14.9 | 16.7 | 16.8 | 19.0 | 19.1 | 19.2 | 19.1 | 17.2 | | Instruction following | | IFEval | 0 | Avg(Prompt/Instruction acc Loose/Strict) | 59.5 | 51.5 | 58.4 | 55.6 | 77.4 | 73.9 | 73.5 | 75.9 | 80.4 | | Math | | GSM8K (CoT) | 8 | em\\_maj1@1 | 44.4 | 33.1 | 40.6 | 46.5 | 77.7 | 72.9 | 75.7 | 77.9 | 84.5 | | | | MATH (CoT) | 0 | final\\_em | 30.6 | 20.5 | 25.3 | 31.0 | 48.0 | 44.2 | 45.3 | 49.2 | 51.9 | | Reasoning | | ARC-C | 0 | acc | 59.4 | 54.3 | 57 | 60.7 | 78.6 | 75.6 | 77.6 | 77.6 | 83.4 | | | | GPQA | 0 | acc | 27.2 | 25.9 | 26.3 | 25.9 | 32.8 | 32.8 | 31.7 | 33.9 | 32.8 | | | | Hellaswag | 0 | acc | 41.2 | 38.1 | 41.3 | 41.5 | 69.8 | 66.3 | 68 | 66.3 | 78.7 | | Tool Use | | BFCL V2 | 0 | acc | 25.7 | 14.3 | 15.9 | 23.7 | 67.0 | 53.4 | 60.1 | 63.5 | 67.1 | | | | Nexus | 0 | macro\\_avg/acc | 13.5 | 5.2 | 9.6 | 12.5 | 34.3 | 32.4 | 31.5 | 30.1 | 38.5 | | Long Context | | InfiniteBench/En.QA | 0 | longbook\\_qa/f1 | 20.3 | N/A | N/A | N/A | 19.8 | N/A | N/A | N/A | 27.3 | | | | InfiniteBench/En.MC | 0 | longbook\\_choice/acc | 38.0 | N/A | N/A | N/A | 63.3 | N/A | N/A | N/A | 72.2 | | | | NIH/Multi-needle | 0 | recall | 75.0 | N/A | N/A | N/A | 84.7 | N/A | N/A | N/A | 98.8 | | Multilingual | | MGSM (CoT) | 0 | em | 24.5 | 13.7 | 18.2 | 24.4 | 58.2 | 48.9 | 54.3 | 56.8 | 68.9 | \\*\\*for comparison purposes only. Model not released.
### Multilingual Benchmarks | Category | Benchmark | Language | Llama 3.2 1B | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | MMLU (5-shot, macro_avg/acc) | Portuguese | 39.8 | 34.9 | 38.9 | 40.2 | 54.5 | 50.9 | 53.3 | 53.4 | 62.1 | | | | Spanish | 41.5 | 36.0 | 39.8 | 41.8 | 55.1 | 51.9 | 53.6 | 53.6 | 62.5 | | | | Italian | 39.8 | 34.9 | 38.1 | 40.6 | 53.8 | 49.9 | 52.1 | 51.7 | 61.6 | | | | German | 39.2 | 34.9 | 37.5 | 39.6 | 53.3 | 50.0 | 52.2 | 51.3 | 60.6 | | | | French | 40.5 | 34.8 | 39.2 | 40.8 | 54.6 | 51.2 | 53.3 | 53.3 | 62.3 | | | | Hindi | 33.5 | 30.0 | 32.1 | 34.0 | 43.3 | 40.4 | 42.0 | 42.1 | 50.9 | | | | Thai | 34.7 | 31.2 | 32.4 | 34.9 | 44.5 | 41.3 | 44.0 | 42.2 | 50.3 | \\*\\*for comparison purposes only. Model not released. ## Inference time In the table below, we compare the performance metrics of different quantization methods (SpinQuant and QAT \\+ LoRA) with the BF16 baseline. The evaluation was done using the ExecuTorch framework as the inference engine, with the ARM CPU as the backend, on an Android OnePlus 12 device. | Category | Decode (tokens/sec) | Time-to-first-token (sec) | Prefill (tokens/sec) | Model size (PTE file size in MB) | Memory size (RSS in MB) | | :---- | ----- | ----- | ----- | ----- | ----- | | 1B BF16 (baseline) | 19.2 | 1.0 | 60.3 | 2358 | 3,185 | | 1B SpinQuant | 50.2 (2.6x) | 0.3 (-76.9%) | 260.5 (4.3x) | 1083 (-54.1%) | 1,921 (-39.7%) | | 1B QLoRA | 45.8 (2.4x) | 0.3 (-76.0%) | 252.0 (4.2x) | 1127 (-52.2%) | 2,255 (-29.2%) | | 3B BF16 (baseline) | 7.6 | 3.0 | 21.2 | 6129 | 7,419 | | 3B SpinQuant | 19.7 (2.6x) | 0.7 (-76.4%) | 89.7 (4.2x) | 2435 (-60.3%) | 3,726 (-49.8%) | | 3B QLoRA | 18.5 (2.4x) | 0.7 (-76.1%) | 88.8 (4.2x) | 2529 (-58.7%) | 4,060 (-45.3%) | (\\*) The performance measurement is done using an adb binary-based approach. (\\*\\*) It is measured on an Android OnePlus 12 device. (\\*\\*\\*) Time-to-first-token (TTFT) is measured with prompt length=64. *Footnote:* - *Decode (tokens/second) is how quickly the model keeps generating tokens. Higher is better.* - *Time-to-first-token (TTFT for shorthand) is how fast the model generates the first token for a given prompt. Lower is better.* - *Prefill is the prompt length divided by TTFT (here, prompt length 64), in tokens/second. Higher is better.* - *Model size \\- how big the model is, measured by PTE file size; PTE is a binary file format for ExecuTorch.* - *RSS size \\- memory usage in resident set size (RSS).* ## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: 1. Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. 2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. 3. Provide protections for the community to help prevent the misuse of our models. ### Responsible Deployment **Approach:** Llama is a foundational technology designed to be used in a variety of use cases. Examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage.
Our approach is to build the most helpful models, enabling the world to benefit from the power of the technology, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use cases, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.2 was developed following the best practices outlined in our Responsible Use Guide. #### Llama 3.2 Instruct **Objective:** Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems. We implemented the same set of safety mitigations as in Llama 3, and you can learn more about these in the Llama 3 paper. **Fine-Tuning Data:** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone:** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.2 Systems **Safety as a System:** Large language models, including Llama 3.2, **are not designed to be deployed in isolation** but instead should be deployed as part of an overall AI system with additional safety guardrails as required. Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. ### New Capabilities and Use Cases **Technological Advancement:** Llama releases usually introduce new capabilities that require specific considerations in addition to the best practices that generally apply across all Generative AI use cases. For prior release capabilities also supported by Llama 3.2, see the Llama 3.1 Model Card, as the same considerations apply here as well. **Constrained Environments:** Llama 3.2 1B and 3B models are expected to be deployed in highly constrained environments, such as mobile devices. LLM systems using smaller models will have a different alignment profile and safety/helpfulness tradeoff than more complex, larger systems. Developers should ensure that the safety of their system meets the requirements of their use case. We recommend using lighter system safeguards for such use cases, like Llama Guard 3-1B or its mobile-optimized version.
### Evaluations **Scaled Evaluations:** We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Purple Llama safeguards to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. **Red Teaming:** We conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks In addition to our safety work above, we took extra care in measuring and/or mitigating the following critical risk areas: **1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive Weapons):** Llama 3.2 1B and 3B models are smaller and less capable derivatives of Llama 3.1. For Llama 3.1 70B and 405B, to assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons, and we have determined that such testing also applies to the smaller 1B and 3B models. **2\\. Child Safety:** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking into account market-specific nuances or experiences. **3\\. Cyber Attacks:** For Llama 3.1 405B, our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Because Llama 3.2’s 1B and 3B models are smaller and less capable than Llama 3.1 405B, we broadly believe that the testing conducted for the 405B model also applies to Llama 3.2 models.
### Community **Industry Partnerships:** Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. **Grants:** We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. **Reporting:** Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations **Values:** The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. **Testing:** Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.2 models, developers should perform safety testing and tuning tailored to their specific applications of the model.
Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A 1-billion-parameter multilingual text-generation model based on Meta's Llama 3.2 architecture, designed for instruction-following tasks across several languages.\n\nFeatures: \n- Multilingual support (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) \n- Text-generation capability \n- Instruction-following design \n- Released under Llama 3.2 Community License \n- Part of Meta's Llama model series \n\nComparison: \nThis smaller 1" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.2-1B.json b/model_data_json/meta-llama_Llama-3.2-1B.json new file mode 100644 index 0000000000000000000000000000000000000000..849cdbdd942db5bdcd98de69a3321a017e851939 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.2-1B.json @@ -0,0 +1,31 @@ +{ + "model_id": "meta-llama/Llama-3.2-1B", + "downloads": 1990093, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "arxiv:2405.16406", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. “Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. 
Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. 
You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 3. 
Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 5. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 7. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 9. Guns and illegal weapons (including weapon development) 10. Illegal drugs and regulated/controlled substances 11. Operation of critical infrastructure, transportation technologies, or heavy machinery 12. Self-harm or harm to others, including suicide, cutting, and eating disorders 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 16. Generating, promoting, or further distributing spam 17. Impersonating another individual without consent, authorization, or legal right 18. Representing that the use of Llama 3.2 or outputs are human-generated 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. 
Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model Developer:** Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. | | Training Data | Params | Input modalities | Output modalities | Context Length | GQA | Shared Embeddings | Token count | Knowledge cutoff | | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | | Llama 3.2 (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 128k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | | Llama 3.2 Quantized (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 8k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | **Supported Languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 Model Family:** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. 
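The architecture facts in the Model Information table above (128k context window, grouped-query attention) can be sanity-checked against the model's published config. A minimal, hedged sketch follows — the attribute names are standard transformers `LlamaConfig` fields rather than something this card specifies, and the repo is gated, so you must accept the license and authenticate first:

```python
from transformers import AutoConfig

# Hedged sketch: inspect the config behind the table above.
# Gated repo: accept the Llama 3.2 Community License on the Hub and
# authenticate (e.g. `huggingface-cli login`) before running this.
cfg = AutoConfig.from_pretrained("meta-llama/Llama-3.2-1B")

# Expected to be on the order of 131072, i.e. the 128k context length above.
print(cfg.max_position_embeddings)

# Grouped-Query Attention shows up as fewer key/value heads than query heads.
print(cfg.num_key_value_heads, cfg.num_attention_heads)
```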
**License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Feedback:** Instructions on how to provide feedback or comments on the model can be found in the Llama Models README. For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go here. ## Intended Use **Intended Use Cases:** Llama 3.2 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat and agentic applications like knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting. Pretrained models can be adapted for a variety of additional natural language generation tasks. Similarly, quantized models can be adapted for a variety of on-device use-cases with limited compute resources. **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card. ## How to use This repository contains two versions of Llama-3.2-1B, for use with transformers and with the original codebase. ### Use with transformers Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers. ### Use with Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging : ## Hardware and Software **Training Factors:** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use:** Training utilized a cumulative total of **916k** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **240** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. | | Training Time (GPU hours) | Logit Generation Time (GPU Hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | ----- | :---: | :---: | :---: | | Llama 3.2 1B | 370k | \\- | 700 | 107 | 0 | | Llama 3.2 3B | 460k | \\- | 700 | 133 | 0 | | Llama 3.2 1B SpinQuant | 1.7 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 3B SpinQuant | 2.4 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 1B QLora | 1.3k | 0 | 700 | 0.381 | 0 | | Llama 3.2 3B QLora | 1.6k | 0 | 700 | 0.461 | 0 | | Total | 833k | 86k | | 240 | 0 | \\*\\* The location-based CO2e emissions of Llama 3.2 1B SpinQuant and Llama 3.2 3B SpinQuant are less than 0.001 metric tonnes each. 
This is due to the minimal training GPU hours that are required. The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.2 was pretrained on up to 9 trillion tokens of data from publicly available sources. For the 1B and 3B Llama 3.2 models, we incorporated logits from the Llama 3.1 8B and 70B models into the pretraining stage of the model development, where outputs (logits) from these larger models were used as token-level targets. Knowledge distillation was used after pruning to recover performance. In post-training we used a similar recipe to Llama 3.1 and produced final chat models by doing several rounds of alignment on top of the pre-trained model. Each round involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO). **Data Freshness:** The pretraining data has a cutoff of December 2023\\. ## Quantization ### Quantization Scheme We designed the current quantization scheme with PyTorch’s ExecuTorch inference framework and Arm CPU backend in mind, taking into account metrics including model quality, prefill/decoding speed, and memory footprint. Our quantization scheme involves three parts: - All linear layers in all transformer blocks are quantized to a 4-bit groupwise scheme (with a group size of 32) for weights and 8-bit per-token dynamic quantization for activations. - The classification layer is quantized to 8-bit per-channel for weights, with 8-bit per-token dynamic quantization for activations. - Similar to the classification layer, 8-bit per-channel quantization is used for the embedding layer. ### Quantization-Aware Training and LoRA The quantization-aware training (QAT) with low-rank adaptation (LoRA) models went through only the post-training stages, using the same data as the full precision models. To initialize QAT, we utilize BF16 Llama 3.2 model checkpoints obtained after supervised fine-tuning (SFT) and perform an additional full round of SFT training with QAT. We then freeze the backbone of the QAT model and perform another round of SFT with LoRA adaptors applied to all layers within the transformer block. Meanwhile, the LoRA adaptors' weights and activations are maintained in BF16. Because our approach is similar to the QLoRA of Dettmers et al. (2023) (i.e., quantization followed by LoRA adapters), we refer to this method as QLoRA. Finally, we fine-tune the resulting model (both backbone and LoRA adaptors) using direct preference optimization (DPO). ### SpinQuant SpinQuant was applied, together with generative post-training quantization (GPTQ). For the SpinQuant rotation matrix fine-tuning, we optimized for 100 iterations, using 800 samples with sequence-length 2048 from the WikiText 2 dataset. For GPTQ, we used 128 samples from the same dataset with the same sequence-length. ## Benchmarks \\- English Text In this section, we report the results for Llama 3.2 models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library. 
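To make the transformers usage described in the "How to use" section above concrete (the card's own code snippet did not survive this dump), here is a minimal, hedged sketch. Only the model ID comes from this card; everything else is standard transformers pipeline usage, assuming transformers >= 4.43.0 and access to the gated repository:

```python
import torch
from transformers import pipeline

# Hedged sketch, not the card's original snippet: plain text generation
# with the pipeline abstraction referenced above. Gated repo: accept the
# license on the Hub and authenticate before running.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B",
    torch_dtype=torch.bfloat16,  # bf16 keeps memory use modest
    device_map="auto",           # place weights on an accelerator if available
)
out = generator("The three most important ideas in physics are", max_new_tokens=48)
print(out[0]["generated_text"])
```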
### Base Pretrained Models | Category | Benchmark | \\# Shots | Metric | Llama 3.2 1B | Llama 3.2 3B | Llama 3.1 8B | | ----- | ----- | :---: | :---: | :---: | :---: | :---: | | General | MMLU | 5 | macro\\_avg/acc\\_char | 32.2 | 58 | 66.7 | | | AGIEval English | 3-5 | average/acc\\_char | 23.3 | 39.2 | 47.8 | | | ARC-Challenge | 25 | acc\\_char | 32.8 | 69.1 | 79.7 | | Reading comprehension | SQuAD | 1 | em | 49.2 | 67.7 | 77 | | | QuAC (F1) | 1 | f1 | 37.9 | 42.9 | 44.9 | | | DROP (F1) | 3 | f1 | 28.0 | 45.2 | 59.5 | | Long Context | Needle in Haystack | 0 | em | 96.8 | 1 | 1 | ### Instruction Tuned Models | Capability | | Benchmark | \\# Shots | Metric | Llama 3.2 1B bf16 | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B bf16 | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | ----- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | | MMLU | 5 | macro\\_avg/acc | 49.3 | 43.3 | 47.3 | 49.0 | 63.4 | 60.5 | 62 | 62.4 | 69.4 | | Re-writing | | Open-rewrite eval | 0 | micro\\_avg/rougeL | 41.6 | 39.2 | 40.9 | 41.2 | 40.1 | 40.3 | 40.8 | 40.7 | 40.9 | | Summarization | | TLDR9+ (test) | 1 | rougeL | 16.8 | 14.9 | 16.7 | 16.8 | 19.0 | 19.1 | 19.2 | 19.1 | 17.2 | | Instruction following | | IFEval | 0 | Avg(Prompt/Instruction acc Loose/Strict) | 59.5 | 51.5 | 58.4 | 55.6 | 77.4 | 73.9 | 73.5 | 75.9 | 80.4 | | Math | | GSM8K (CoT) | 8 | em\\_maj1@1 | 44.4 | 33.1 | 40.6 | 46.5 | 77.7 | 72.9 | 75.7 | 77.9 | 84.5 | | | | MATH (CoT) | 0 | final\\_em | 30.6 | 20.5 | 25.3 | 31.0 | 48.0 | 44.2 | 45.3 | 49.2 | 51.9 | | Reasoning | | ARC-C | 0 | acc | 59.4 | 54.3 | 57 | 60.7 | 78.6 | 75.6 | 77.6 | 77.6 | 83.4 | | | | GPQA | 0 | acc | 27.2 | 25.9 | 26.3 | 25.9 | 32.8 | 32.8 | 31.7 | 33.9 | 32.8 | | | | Hellaswag | 0 | acc | 41.2 | 38.1 | 41.3 | 41.5 | 69.8 | 66.3 | 68 | 66.3 | 78.7 | | Tool Use | | BFCL V2 | 0 | acc | 25.7 | 14.3 | 15.9 | 23.7 | 67.0 | 53.4 | 60.1 | 63.5 | 67.1 | | | | Nexus | 0 | macro\\_avg/acc | 13.5 | 5.2 | 9.6 | 12.5 | 34.3 | 32.4 | 31.5 | 30.1 | 38.5 | | Long Context | | InfiniteBench/En.QA | 0 | longbook\\_qa/f1 | 20.3 | N/A | N/A | N/A | 19.8 | N/A | N/A | N/A | 27.3 | | | | InfiniteBench/En.MC | 0 | longbook\\_choice/acc | 38.0 | N/A | N/A | N/A | 63.3 | N/A | N/A | N/A | 72.2 | | | | NIH/Multi-needle | 0 | recall | 75.0 | N/A | N/A | N/A | 84.7 | N/A | N/A | N/A | 98.8 | | Multilingual | | MGSM (CoT) | 0 | em | 24.5 | 13.7 | 18.2 | 24.4 | 58.2 | 48.9 | 54.3 | 56.8 | 68.9 | \\*\\*for comparison purposes only. Model not released. 
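The SpinQuant and QLoRA columns in the table above correspond to the scheme described in the Quantization section: 4-bit groupwise weights (group size 32) plus 8-bit per-token dynamic activations. As an illustration of the weight half of that scheme only — a toy PyTorch sketch of the arithmetic, not ExecuTorch's kernels or the released checkpoints:

```python
import torch

def quantize_weight_4bit_groupwise(w: torch.Tensor, group_size: int = 32):
    """Toy symmetric 4-bit groupwise weight quantization (group size 32),
    mirroring the scheme described in this card; illustrative only."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0
    groups = w.reshape(out_features, in_features // group_size, group_size)
    # One scale per group of 32 weights; symmetric int4 values live in [-8, 7].
    scales = (groups.abs().amax(dim=-1, keepdim=True) / 7.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(groups / scales), -8, 7).to(torch.int8)
    return q, scales  # int4 values carried in int8 containers

def dequantize(q: torch.Tensor, scales: torch.Tensor) -> torch.Tensor:
    return (q.float() * scales).reshape(q.shape[0], -1)

w = torch.randn(64, 128)                   # a toy linear-layer weight matrix
q, s = quantize_weight_4bit_groupwise(w)
print((w - dequantize(q, s)).abs().max())  # max round-trip quantization error
```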
### Multilingual Benchmarks | Category | Benchmark | Language | Llama 3.2 1B | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | MMLU (5-shot, macro_avg/acc) | Portuguese | 39.8 | 34.9 | 38.9 | 40.2 | 54.5 | 50.9 | 53.3 | 53.4 | 62.1 | | | | Spanish | 41.5 | 36.0 | 39.8 | 41.8 | 55.1 | 51.9 | 53.6 | 53.6 | 62.5 | | | | Italian | 39.8 | 34.9 | 38.1 | 40.6 | 53.8 | 49.9 | 52.1 | 51.7 | 61.6 | | | | German | 39.2 | 34.9 | 37.5 | 39.6 | 53.3 | 50.0 | 52.2 | 51.3 | 60.6 | | | | French | 40.5 | 34.8 | 39.2 | 40.8 | 54.6 | 51.2 | 53.3 | 53.3 | 62.3 | | | | Hindi | 33.5 | 30.0 | 32.1 | 34.0 | 43.3 | 40.4 | 42.0 | 42.1 | 50.9 | | | | Thai | 34.7 | 31.2 | 32.4 | 34.9 | 44.5 | 41.3 | 44.0 | 42.2 | 50.3 | \\*\\*for comparison purposes only. Model not released. ## Inference time In the table below, we compare the performance metrics of different quantization methods (SpinQuant and QAT \\+ LoRA) with the BF16 baseline. The evaluation was done using the ExecuTorch framework as the inference engine, with the ARM CPU as the backend, on an Android OnePlus 12 device. | Category | Decode (tokens/sec) | Time-to-first-token (sec) | Prefill (tokens/sec) | Model size (PTE file size in MB) | Memory size (RSS in MB) | | :---- | ----- | ----- | ----- | ----- | ----- | | 1B BF16 (baseline) | 19.2 | 1.0 | 60.3 | 2358 | 3,185 | | 1B SpinQuant | 50.2 (2.6x) | 0.3 (-76.9%) | 260.5 (4.3x) | 1083 (-54.1%) | 1,921 (-39.7%) | | 1B QLoRA | 45.8 (2.4x) | 0.3 (-76.0%) | 252.0 (4.2x) | 1127 (-52.2%) | 2,255 (-29.2%) | | 3B BF16 (baseline) | 7.6 | 3.0 | 21.2 | 6129 | 7,419 | | 3B SpinQuant | 19.7 (2.6x) | 0.7 (-76.4%) | 89.7 (4.2x) | 2435 (-60.3%) | 3,726 (-49.8%) | | 3B QLoRA | 18.5 (2.4x) | 0.7 (-76.1%) | 88.8 (4.2x) | 2529 (-58.7%) | 4,060 (-45.3%) | (\\*) The performance measurement is done using an adb binary-based approach. (\\*\\*) It is measured on an Android OnePlus 12 device. (\\*\\*\\*) Time-to-first-token (TTFT) is measured with prompt length=64 *Footnote:* - *Decode (tokens/second) measures how quickly the model keeps generating tokens. Higher is better.* - *Time-to-first-token (TTFT) measures how fast the first token is generated for a given prompt. Lower is better.* - *Prefill is prompt tokens processed per second, roughly prompt length divided by TTFT (e.g., 1B BF16: 64 tokens / 1.0 s ≈ 60 tokens/sec, matching the table). Higher is better.* - *Model size \\- size of the model, measured by the PTE file, a binary file format for ExecuTorch.* - *RSS size \\- memory usage in resident set size (RSS).* ## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: 1. Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama 2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm 3. Provide protections for the community to help prevent the misuse of our models ### Responsible Deployment **Approach:** Llama is a foundational technology designed to be used in a variety of use cases. Examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. 
Our approach is to build the most helpful models, enabling the world to benefit from the technology's power, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use cases, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.2 was developed following the best practices outlined in our Responsible Use Guide. #### Llama 3.2 Instruct **Objective:** Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems. We implemented the same set of safety mitigations as in Llama 3, and you can learn more about these in the Llama 3 paper. **Fine-Tuning Data:** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone:** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.2 Systems **Safety as a System:** Large language models, including Llama 3.2, **are not designed to be deployed in isolation** but instead should be deployed as part of an overall AI system with additional safety guardrails as required. Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as to mitigating the safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. ### New Capabilities and Use Cases **Technological Advancement:** Llama releases usually introduce new capabilities that require specific considerations in addition to the best practices that generally apply across all Generative AI use cases. For prior release capabilities also supported by Llama 3.2, see the Llama 3.1 Model Card, as the same considerations apply here as well. **Constrained Environments:** Llama 3.2 1B and 3B models are expected to be deployed in highly constrained environments, such as mobile devices. LLM systems using smaller models will have a different alignment profile and safety/helpfulness tradeoff than more complex, larger systems. Developers should ensure the safety of their system meets the requirements of their use case. We recommend using lighter system safeguards for such use cases, like Llama Guard 3-1B or its mobile-optimized version. 
### Evaluations **Scaled Evaluations:** We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Purple Llama safeguards to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. **Red Teaming:** We conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks In addition to our safety work above, we took extra care in measuring and/or mitigating the following critical risk areas: **1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive Weapons):** Llama 3.2 1B and 3B models are smaller and less capable derivatives of Llama 3.1. For Llama 3.1 70B and 405B, to assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons, and we have determined that such testing also applies to the smaller 1B and 3B models. **2\\. Child Safety:** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks, and to inform on any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances and experiences. **3\\. Cyber Attacks:** For Llama 3.1 405B, our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Because Llama 3.2’s 1B and 3B models are smaller and less capable models than Llama 3.1 405B, we broadly believe that the testing conducted for the 405B model also applies to Llama 3.2 models. 
### Community **Industry Partnerships:** Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. **Grants:** We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. **Reporting:** Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations **Values:** The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. **Testing:** Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.2 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual text-generation model supporting English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, released under Meta's Llama 3.2 Community License. \n\n**Features**: \n- **Languages**: Supports 8 languages (en, de, fr, it, pt, hi, es, th). \n- **Task**: Text generation. \n- **Framework**: PyTorch (Transformers library). 
\n- **License**: Llama 3.2 Community License" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.2-3B-Instruct.json b/model_data_json/meta-llama_Llama-3.2-3B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..be2769413d0c1d2a82659b6f3ba57c8d7eaa630a --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.2-3B-Instruct.json @@ -0,0 +1,32 @@ +{ + "model_id": "meta-llama/Llama-3.2-3B-Instruct", + "downloads": 1485037, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "arxiv:2405.16406", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. “Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. 
If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. 
If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 3. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 5. 
Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 7. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 9. Guns and illegal weapons (including weapon development) 10. Illegal drugs and regulated/controlled substances 11. Operation of critical infrastructure, transportation technologies, or heavy machinery 12. Self-harm or harm to others, including suicide, cutting, and eating disorders 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 16. Generating, promoting, or further distributing spam 17. Impersonating another individual without consent, authorization, or legal right 18. Representing that the use of Llama 3.2 or outputs are human-generated 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. 
Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model Developer:** Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. | | Training Data | Params | Input modalities | Output modalities | Context Length | GQA | Shared Embeddings | Token count | Knowledge cutoff | | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | | Llama 3.2 (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 128k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | | Llama 3.2 Quantized (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 8k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | **Supported Languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 Model Family:** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. 
**License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Feedback:** Instructions on how to provide feedback or comments on the model can be found in the Llama Models README. For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go here. ## Intended Use **Intended Use Cases:** Llama 3.2 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat and agentic applications like knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting. Pretrained models can be adapted for a variety of additional natural language generation tasks. Similarly, quantized models can be adapted for a variety of on-device use-cases with limited compute resources. **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card. ## How to use This repository contains two versions of Llama-3.2-3B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers. Note: You can also find detailed recipes on how to use the model locally, with assisted generations, quantisation, and more. ### Use with Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging : ## Hardware and Software **Training Factors:** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use:** Training utilized a cumulative total of **916k** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **240** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. 
| | Training Time (GPU hours) | Logit Generation Time (GPU Hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | ----- | :---: | :---: | :---: | | Llama 3.2 1B | 370k | \\- | 700 | 107 | 0 | | Llama 3.2 3B | 460k | \\- | 700 | 133 | 0 | | Llama 3.2 1B SpinQuant | 1.7 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 3B SpinQuant | 2.4 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 1B QLora | 1.3k | 0 | 700 | 0.381 | 0 | | Llama 3.2 3B QLora | 1.6k | 0 | 700 | 0.461 | 0 | | Total | 833k | 86k | | 240 | 0 | \\*\\* The location-based CO2e emissions of Llama 3.2 1B SpinQuant and Llama 3.2 3B SpinQuant are less than 0.001 metric tonnes each. This is due to the minimal training GPU hours that are required. The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.2 was pretrained on up to 9 trillion tokens of data from publicly available sources. For the 1B and 3B Llama 3.2 models, we incorporated logits from the Llama 3.1 8B and 70B models into the pretraining stage of the model development, where outputs (logits) from these larger models were used as token-level targets. Knowledge distillation was used after pruning to recover performance. In post-training we used a similar recipe to Llama 3.1 and produced final chat models by doing several rounds of alignment on top of the pre-trained model. Each round involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO). **Data Freshness:** The pretraining data has a cutoff of December 2023\\. ## Quantization ### Quantization Scheme We designed the current quantization scheme with PyTorch’s ExecuTorch inference framework and Arm CPU backend in mind, taking into account metrics including model quality, prefill/decoding speed, and memory footprint. Our quantization scheme involves three parts: - All linear layers in all transformer blocks are quantized to a 4-bit groupwise scheme (with a group size of 32) for weights and 8-bit per-token dynamic quantization for activations. - The classification layer is quantized to 8-bit per-channel for weights, with 8-bit per-token dynamic quantization for activations. - Similar to the classification layer, 8-bit per-channel quantization is used for the embedding layer. ### Quantization-Aware Training and LoRA The quantization-aware training (QAT) with low-rank adaptation (LoRA) models went through only the post-training stages, using the same data as the full precision models. To initialize QAT, we utilize BF16 Llama 3.2 model checkpoints obtained after supervised fine-tuning (SFT) and perform an additional full round of SFT training with QAT. We then freeze the backbone of the QAT model and perform another round of SFT with LoRA adaptors applied to all layers within the transformer block. Meanwhile, the LoRA adaptors' weights and activations are maintained in BF16. Because our approach is similar to the QLoRA of Dettmers et al. (2023) (i.e., quantization followed by LoRA adapters), we refer to this method as QLoRA. Finally, we fine-tune the resulting model (both backbone and LoRA adaptors) using direct preference optimization (DPO). ### SpinQuant SpinQuant was applied, together with generative post-training quantization (GPTQ). 
For the SpinQuant rotation matrix fine-tuning, we optimized for 100 iterations, using 800 samples with sequence-length 2048 from the WikiText 2 dataset. For GPTQ, we used 128 samples from the same dataset with the same sequence-length. ## Benchmarks \\- English Text In this section, we report the results for Llama 3.2 models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library. ### Base Pretrained Models | Category | Benchmark | \\# Shots | Metric | Llama 3.2 1B | Llama 3.2 3B | Llama 3.1 8B | | ----- | ----- | :---: | :---: | :---: | :---: | :---: | | General | MMLU | 5 | macro\\_avg/acc\\_char | 32.2 | 58 | 66.7 | | | AGIEval English | 3-5 | average/acc\\_char | 23.3 | 39.2 | 47.8 | | | ARC-Challenge | 25 | acc\\_char | 32.8 | 69.1 | 79.7 | | Reading comprehension | SQuAD | 1 | em | 49.2 | 67.7 | 77 | | | QuAC (F1) | 1 | f1 | 37.9 | 42.9 | 44.9 | | | DROP (F1) | 3 | f1 | 28.0 | 45.2 | 59.5 | | Long Context | Needle in Haystack | 0 | em | 96.8 | 1 | 1 | ### Instruction Tuned Models | Capability | | Benchmark | \\# Shots | Metric | Llama 3.2 1B bf16 | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B bf16 | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | ----- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | | MMLU | 5 | macro\\_avg/acc | 49.3 | 43.3 | 47.3 | 49.0 | 63.4 | 60.5 | 62 | 62.4 | 69.4 | | Re-writing | | Open-rewrite eval | 0 | micro\\_avg/rougeL | 41.6 | 39.2 | 40.9 | 41.2 | 40.1 | 40.3 | 40.8 | 40.7 | 40.9 | | Summarization | | TLDR9+ (test) | 1 | rougeL | 16.8 | 14.9 | 16.7 | 16.8 | 19.0 | 19.1 | 19.2 | 19.1 | 17.2 | | Instruction following | | IFEval | 0 | Avg(Prompt/Instruction acc Loose/Strict) | 59.5 | 51.5 | 58.4 | 55.6 | 77.4 | 73.9 | 73.5 | 75.9 | 80.4 | | Math | | GSM8K (CoT) | 8 | em\\_maj1@1 | 44.4 | 33.1 | 40.6 | 46.5 | 77.7 | 72.9 | 75.7 | 77.9 | 84.5 | | | | MATH (CoT) | 0 | final\\_em | 30.6 | 20.5 | 25.3 | 31.0 | 48.0 | 44.2 | 45.3 | 49.2 | 51.9 | | Reasoning | | ARC-C | 0 | acc | 59.4 | 54.3 | 57 | 60.7 | 78.6 | 75.6 | 77.6 | 77.6 | 83.4 | | | | GPQA | 0 | acc | 27.2 | 25.9 | 26.3 | 25.9 | 32.8 | 32.8 | 31.7 | 33.9 | 32.8 | | | | Hellaswag | 0 | acc | 41.2 | 38.1 | 41.3 | 41.5 | 69.8 | 66.3 | 68 | 66.3 | 78.7 | | Tool Use | | BFCL V2 | 0 | acc | 25.7 | 14.3 | 15.9 | 23.7 | 67.0 | 53.4 | 60.1 | 63.5 | 67.1 | | | | Nexus | 0 | macro\\_avg/acc | 13.5 | 5.2 | 9.6 | 12.5 | 34.3 | 32.4 | 31.5 | 30.1 | 38.5 | | Long Context | | InfiniteBench/En.QA | 0 | longbook\\_qa/f1 | 20.3 | N/A | N/A | N/A | 19.8 | N/A | N/A | N/A | 27.3 | | | | InfiniteBench/En.MC | 0 | longbook\\_choice/acc | 38.0 | N/A | N/A | N/A | 63.3 | N/A | N/A | N/A | 72.2 | | | | NIH/Multi-needle | 0 | recall | 75.0 | N/A | N/A | N/A | 84.7 | N/A | N/A | N/A | 98.8 | | Multilingual | | MGSM (CoT) | 0 | em | 24.5 | 13.7 | 18.2 | 24.4 | 58.2 | 48.9 | 54.3 | 56.8 | 68.9 | \\*\\*for comparison purposes only. Model not released. 
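As with the 1B card earlier in this diff, the transformers snippet for this instruct model was stripped from the "How to use" section above, so here is a hedged sketch of the conversational inference it describes. It assumes transformers >= 4.43.0 (recent versions accept chat-message lists directly in the text-generation pipeline) and access to the gated repository; only the model ID comes from this card:

```python
import torch
from transformers import pipeline

# Hedged sketch, not the card's original snippet: chat-style inference.
chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain grouped-query attention in two sentences."},
]
out = chat(messages, max_new_tokens=128)
# With chat input, generated_text is the message list; the last entry is the reply.
print(out[0]["generated_text"][-1]["content"])
```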
### Multilingual Benchmarks | Category | Benchmark | Language | Llama 3.2 1B | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | MMLU (5-shot, macro_avg/acc) | Portuguese | 39.8 | 34.9 | 38.9 | 40.2 | 54.5 | 50.9 | 53.3 | 53.4 | 62.1 | | | | Spanish | 41.5 | 36.0 | 39.8 | 41.8 | 55.1 | 51.9 | 53.6 | 53.6 | 62.5 | | | | Italian | 39.8 | 34.9 | 38.1 | 40.6 | 53.8 | 49.9 | 52.1 | 51.7 | 61.6 | | | | German | 39.2 | 34.9 | 37.5 | 39.6 | 53.3 | 50.0 | 52.2 | 51.3 | 60.6 | | | | French | 40.5 | 34.8 | 39.2 | 40.8 | 54.6 | 51.2 | 53.3 | 53.3 | 62.3 | | | | Hindi | 33.5 | 30.0 | 32.1 | 34.0 | 43.3 | 40.4 | 42.0 | 42.1 | 50.9 | | | | Thai | 34.7 | 31.2 | 32.4 | 34.9 | 44.5 | 41.3 | 44.0 | 42.2 | 50.3 | \\*\\*for comparison purposes only. Model not released. ## Inference time In the table below, we compare performance metrics for the different quantization methods (SpinQuant and QAT \\+ LoRA) against the BF16 baseline. The evaluation was done using the ExecuTorch framework as the inference engine, with the Arm CPU backend, on an Android OnePlus 12 device. | Category | Decode (tokens/sec) | Time-to-first-token (sec) | Prefill (tokens/sec) | Model size (PTE file size in MB) | Memory size (RSS in MB) | | :---- | ----- | ----- | ----- | ----- | ----- | | 1B BF16 (baseline) | 19.2 | 1.0 | 60.3 | 2358 | 3,185 | | 1B SpinQuant | 50.2 (2.6x) | 0.3 (-76.9%) | 260.5 (4.3x) | 1083 (-54.1%) | 1,921 (-39.7%) | | 1B QLoRA | 45.8 (2.4x) | 0.3 (-76.0%) | 252.0 (4.2x) | 1127 (-52.2%) | 2,255 (-29.2%) | | 3B BF16 (baseline) | 7.6 | 3.0 | 21.2 | 6129 | 7,419 | | 3B SpinQuant | 19.7 (2.6x) | 0.7 (-76.4%) | 89.7 (4.2x) | 2435 (-60.3%) | 3,726 (-49.8%) | | 3B QLoRA | 18.5 (2.4x) | 0.7 (-76.1%) | 88.8 (4.2x) | 2529 (-58.7%) | 4,060 (-45.3%) | (\\*) Performance was measured using an adb binary-based approach. (\\*\\*) Measurements were taken on an Android OnePlus 12 device. (\\*\\*\\*) Time-to-first-token (TTFT) is measured with a prompt length of 64. *Footnote:* - *Decode (tokens/second): how quickly the model keeps generating tokens after the prompt is processed. Higher is better.* - *Time-to-first-token (TTFT): how long the model takes to produce the first token for a given prompt. Lower is better.* - *Prefill (tokens/second): how quickly the prompt is processed, roughly prompt length divided by TTFT. Higher is better.* - *Model size: the size of the PTE file, a binary file format for ExecuTorch.* - *RSS size: memory usage measured as resident set size (RSS).*
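The relative figures in parentheses above can be recomputed from the absolute columns. A short sanity-check script follows (our own arithmetic helper, not part of the released tooling), shown for the 1B SpinQuant row:

```python
# Recompute the relative metrics in the inference-time table from the absolute values.
baseline_1b  = {"decode": 19.2, "prefill": 60.3, "size_mb": 2358, "rss_mb": 3185}
spinquant_1b = {"decode": 50.2, "prefill": 260.5, "size_mb": 1083, "rss_mb": 1921}

print(f"decode speedup:  {spinquant_1b['decode'] / baseline_1b['decode']:.1f}x")        # 2.6x
print(f"prefill speedup: {spinquant_1b['prefill'] / baseline_1b['prefill']:.1f}x")      # 4.3x
print(f"model size:      {spinquant_1b['size_mb'] / baseline_1b['size_mb'] - 1:+.1%}")  # -54.1%
print(f"RSS:             {spinquant_1b['rss_mb'] / baseline_1b['rss_mb'] - 1:+.1%}")    # -39.7%
```

Note that recomputing the TTFT change from the rounded 0.3 s and 1.0 s values gives -70% rather than the -76.9% in the table, presumably because the published percentages were computed from unrounded timings. ## Responsibility & Safety As part of our responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: 1. Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama 2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm 3. Provide protections for the community to help prevent the misuse of our models ### Responsible Deployment **Approach:** Llama is a foundational technology designed to be used in a variety of use cases. Examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage.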
Our approach is to build the most helpful models, enabling the world to benefit from the power of this technology, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use cases, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.2 was developed following the best practices outlined in our Responsible Use Guide. #### Llama 3.2 Instruct **Objective:** Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, and to offer developers a readily available, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems. We implemented the same set of safety mitigations as in Llama 3; you can learn more about these in the Llama 3 paper. **Fine-Tuning Data:** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone:** Building on the work we started with Llama 3, we put great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.2 Systems **Safety as a System:** Large language models, including Llama 3.2, **are not designed to be deployed in isolation** but instead should be deployed as part of an overall AI system with additional safety guardrails as required. Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment and to mitigating the safety and security risks inherent to the system and to any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default, so developers can benefit from system-level safety out of the box. ### New Capabilities and Use Cases **Technological Advancement:** Llama releases usually introduce new capabilities that require specific considerations in addition to the best practices that generally apply across all generative AI use cases. For prior release capabilities also supported by Llama 3.2, see the Llama 3.1 Model Card, as the same considerations apply here as well. **Constrained Environments:** Llama 3.2 1B and 3B models are expected to be deployed in highly constrained environments, such as mobile devices. LLM systems built on smaller models will have a different alignment profile and safety/helpfulness tradeoff than more complex, larger systems. Developers should ensure that the safety of their system meets the requirements of their use case. We recommend using lighter system safeguards for such use cases, like Llama Guard 3-1B or its mobile-optimized version.
### Evaluations **Scaled Evaluations:** We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Purple Llama safeguards to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. **Red Teaming:** We conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks In addition to our safety work above, we took extra care in measuring and/or mitigating the following critical risk areas: **1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive Weapons):** Llama 3.2 1B and 3B models are smaller and less capable derivatives of Llama 3.1. For Llama 3.1 70B and 405B, to assess risks related to the proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. We have determined that such testing also applies to the smaller 1B and 3B models. **2\\. Child Safety:** Child safety risk assessments were conducted by a team of experts to assess the model’s capability to produce outputs that could result in child safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks throughout Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances and experiences. **3\\. Cyber Attacks:** For Llama 3.1 405B, our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Because Llama 3.2’s 1B and 3B models are smaller and less capable than Llama 3.1 405B, we broadly believe that the testing conducted for the 405B model also applies to Llama 3.2 models.
### Community **Industry Partnerships:** Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. **Grants:** We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. **Reporting:** Finally, we put in place a set of resources, including an output reporting mechanism and a bug bounty program, to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations **Values:** The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. **Testing:** Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.2 models, developers should perform safety testing and tuning tailored to their specific applications of the model.
Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual text-generation model designed for instruction-following tasks, supporting English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.\n\nFeatures: \n- Multilingual support (8 languages) \n- Text-generation capability \n- Instruction-following design \n- Released under Llama 3.2 Community License \n- Transformer-based architecture \n\nComparison: \nThis model belongs to Meta's Llama 3.2 series, which typically offers improved performance over earlier versions, with a" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.2-3B.json b/model_data_json/meta-llama_Llama-3.2-3B.json new file mode 100644 index 0000000000000000000000000000000000000000..103fe7676230bae3f1327e24c31712df71aa4ba7 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.2-3B.json @@ -0,0 +1,31 @@ +{ + "model_id": "meta-llama/Llama-3.2-3B", + "downloads": 546751, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2204.05149", + "arxiv:2405.16406", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. “Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. 
You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. 
Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 8. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 3.2 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5.
Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model Developer:** Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. | | Training Data | Params | Input modalities | Output modalities | Context Length | GQA | Shared Embeddings | Token count | Knowledge cutoff | | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | | Llama 3.2 (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 128k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | | Llama 3.2 Quantized (text only) | A new mix of publicly available online data. | 1B (1.23B) | Multilingual Text | Multilingual Text and code | 8k | Yes | Yes | Up to 9T tokens | December 2023 | | | | 3B (3.21B) | Multilingual Text | Multilingual Text and code | | | | | | **Supported Languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. 
Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 Model Family:** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. **License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). **Feedback:** Instructions on how to provide feedback or comments on the model can be found in the Llama Models README. For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go here. ## Intended Use **Intended Use Cases:** Llama 3.2 is intended for commercial and research use in multiple languages. Instruction-tuned text-only models are intended for assistant-like chat and agentic applications such as knowledge retrieval and summarization, mobile AI-powered writing assistants, and query and prompt rewriting. Pretrained models can be adapted for a variety of additional natural language generation tasks. Similarly, quantized models can be adapted for a variety of on-device use cases with limited compute resources. **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and the Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card. ## How to use This repository contains two versions of Llama-3.2-3B, for use with transformers and with the original codebase. ### Use with transformers Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation via pip install --upgrade transformers.
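The original card’s code snippet is not shown here; as a stand-in, here is a minimal sketch of pipeline-based text completion, where the prompt string and generation length are our own placeholders:

```python
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-3B"

# Base (pretrained) checkpoint, so plain text completion rather than chat templating.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # the released weights are BF16
    device_map="auto",           # requires accelerate; places the model on available devices
)

print(pipe("The key to life is", max_new_tokens=64)[0]["generated_text"])
```

### Use with the original codebase Please follow the instructions in the repository. To download the original checkpoints, you can use huggingface-cli; the exact command is not preserved here, so the include pattern in this reconstruction is our assumption: `huggingface-cli download meta-llama/Llama-3.2-3B --include "original/*" --local-dir Llama-3.2-3B` ## Hardware and Software **Training Factors:** We used custom training libraries, Meta’s custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use:** Training utilized a cumulative **916k** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **240** tons CO2eq for training. Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. (As a rough sanity check of our own, not a figure from the card: 916,000 GPU-hours at 700 W is about 641 MWh of peak-rated energy, so 240 tons CO2eq corresponds to roughly 0.37 kg CO2eq per kWh before efficiency adjustments.)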
| | Training Time (GPU hours) | Logit Generation Time (GPU Hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | ----- | :---: | :---: | :---: | | Llama 3.2 1B | 370k | \\- | 700 | 107 | 0 | | Llama 3.2 3B | 460k | \\- | 700 | 133 | 0 | | Llama 3.2 1B SpinQuant | 1.7 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 3B SpinQuant | 2.4 | 0 | 700 | *Negligible*\\*\\* | 0 | | Llama 3.2 1B QLoRA | 1.3k | 0 | 700 | 0.381 | 0 | | Llama 3.2 3B QLoRA | 1.6k | 0 | 700 | 0.461 | 0 | | Total | 833k | 86k | | 240 | 0 | \\*\\* The location-based CO2e emissions of Llama 3.2 1B SpinQuant and Llama 3.2 3B SpinQuant are less than 0.001 metric tonnes each, owing to the minimal number of training GPU hours required. The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.2 was pretrained on up to 9 trillion tokens of data from publicly available sources. For the 1B and 3B Llama 3.2 models, we incorporated logits from the Llama 3.1 8B and 70B models into the pretraining stage of model development, where outputs (logits) from these larger models were used as token-level targets. Knowledge distillation was used after pruning to recover performance. In post-training we used a recipe similar to that of Llama 3.1 and produced final chat models by doing several rounds of alignment on top of the pretrained model. Each round involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO). **Data Freshness:** The pretraining data has a cutoff of December 2023\\. ## Quantization ### Quantization Scheme We designed the current quantization scheme with PyTorch’s ExecuTorch inference framework and the Arm CPU backend in mind, taking into account metrics including model quality, prefill/decoding speed, and memory footprint. Our quantization scheme involves three parts: - All linear layers in all transformer blocks are quantized to a 4-bit groupwise scheme (with a group size of 32) for weights and 8-bit per-token dynamic quantization for activations. - The classification layer is quantized to 8-bit per-channel for weights and 8-bit per-token dynamic quantization for activations. - Similar to the classification layer, 8-bit per-channel quantization is used for the embedding layer. ### Quantization-Aware Training and LoRA The quantization-aware training (QAT) with low-rank adaptation (LoRA) models went through only post-training stages, using the same data as the full-precision models. To initialize QAT, we utilize BF16 Llama 3.2 model checkpoints obtained after supervised fine-tuning (SFT) and perform an additional full round of SFT training with QAT. We then freeze the backbone of the QAT model and perform another round of SFT with LoRA adaptors applied to all layers within the transformer block; the LoRA adaptors’ weights and activations are kept in BF16. Because our approach is similar to QLoRA of Dettmers et al. (2023) (i.e., quantization followed by LoRA adapters), we refer to this method as QLoRA. Finally, we fine-tune the resulting model (both backbone and LoRA adaptors) using direct preference optimization (DPO).
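To illustrate the QLoRA setup above (frozen backbone, trainable BF16 LoRA adaptors), here is a minimal, self-contained sketch in plain PyTorch. It is a toy module of ours, not Meta’s training code, and the rank and alpha values are placeholder assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable BF16 low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the backbone stays frozen during the LoRA SFT round
        self.lora_a = nn.Linear(base.in_features, rank, bias=False, dtype=torch.bfloat16)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False, dtype=torch.bfloat16)
        nn.init.zeros_(self.lora_b.weight)  # the adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.lora_b(self.lora_a(x.to(torch.bfloat16))) * self.scaling
        return self.base(x) + delta.to(x.dtype)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the two low-rank matrices are trainable
```

In the released recipe the backbone would additionally be quantized as described above; the base layer is left in full precision here to keep the sketch short. ### SpinQuant SpinQuant was applied together with generative post-training quantization (GPTQ).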
For the SpinQuant rotation matrix fine-tuning, we optimized for 100 iterations, using 800 samples with sequence-length 2048 from the WikiText 2 dataset. For GPTQ, we used 128 samples from the same dataset with the same sequence-length. ## Benchmarks \\- English Text In this section, we report the results for Llama 3.2 models on standard automatic benchmarks. For all these evaluations, we used our internal evaluations library. ### Base Pretrained Models | Category | Benchmark | \\# Shots | Metric | Llama 3.2 1B | Llama 3.2 3B | Llama 3.1 8B | | ----- | ----- | :---: | :---: | :---: | :---: | :---: | | General | MMLU | 5 | macro\\_avg/acc\\_char | 32.2 | 58 | 66.7 | | | AGIEval English | 3-5 | average/acc\\_char | 23.3 | 39.2 | 47.8 | | | ARC-Challenge | 25 | acc\\_char | 32.8 | 69.1 | 79.7 | | Reading comprehension | SQuAD | 1 | em | 49.2 | 67.7 | 77 | | | QuAC (F1) | 1 | f1 | 37.9 | 42.9 | 44.9 | | | DROP (F1) | 3 | f1 | 28.0 | 45.2 | 59.5 | | Long Context | Needle in Haystack | 0 | em | 96.8 | 1 | 1 | ### Instruction Tuned Models | Capability | | Benchmark | \\# Shots | Metric | Llama 3.2 1B bf16 | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B bf16 | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | ----- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | | MMLU | 5 | macro\\_avg/acc | 49.3 | 43.3 | 47.3 | 49.0 | 63.4 | 60.5 | 62 | 62.4 | 69.4 | | Re-writing | | Open-rewrite eval | 0 | micro\\_avg/rougeL | 41.6 | 39.2 | 40.9 | 41.2 | 40.1 | 40.3 | 40.8 | 40.7 | 40.9 | | Summarization | | TLDR9+ (test) | 1 | rougeL | 16.8 | 14.9 | 16.7 | 16.8 | 19.0 | 19.1 | 19.2 | 19.1 | 17.2 | | Instruction following | | IFEval | 0 | Avg(Prompt/Instruction acc Loose/Strict) | 59.5 | 51.5 | 58.4 | 55.6 | 77.4 | 73.9 | 73.5 | 75.9 | 80.4 | | Math | | GSM8K (CoT) | 8 | em\\_maj1@1 | 44.4 | 33.1 | 40.6 | 46.5 | 77.7 | 72.9 | 75.7 | 77.9 | 84.5 | | | | MATH (CoT) | 0 | final\\_em | 30.6 | 20.5 | 25.3 | 31.0 | 48.0 | 44.2 | 45.3 | 49.2 | 51.9 | | Reasoning | | ARC-C | 0 | acc | 59.4 | 54.3 | 57 | 60.7 | 78.6 | 75.6 | 77.6 | 77.6 | 83.4 | | | | GPQA | 0 | acc | 27.2 | 25.9 | 26.3 | 25.9 | 32.8 | 32.8 | 31.7 | 33.9 | 32.8 | | | | Hellaswag | 0 | acc | 41.2 | 38.1 | 41.3 | 41.5 | 69.8 | 66.3 | 68 | 66.3 | 78.7 | | Tool Use | | BFCL V2 | 0 | acc | 25.7 | 14.3 | 15.9 | 23.7 | 67.0 | 53.4 | 60.1 | 63.5 | 67.1 | | | | Nexus | 0 | macro\\_avg/acc | 13.5 | 5.2 | 9.6 | 12.5 | 34.3 | 32.4 | 31.5 | 30.1 | 38.5 | | Long Context | | InfiniteBench/En.QA | 0 | longbook\\_qa/f1 | 20.3 | N/A | N/A | N/A | 19.8 | N/A | N/A | N/A | 27.3 | | | | InfiniteBench/En.MC | 0 | longbook\\_choice/acc | 38.0 | N/A | N/A | N/A | 63.3 | N/A | N/A | N/A | 72.2 | | | | NIH/Multi-needle | 0 | recall | 75.0 | N/A | N/A | N/A | 84.7 | N/A | N/A | N/A | 98.8 | | Multilingual | | MGSM (CoT) | 0 | em | 24.5 | 13.7 | 18.2 | 24.4 | 58.2 | 48.9 | 54.3 | 56.8 | 68.9 | \\*\\*for comparison purposes only. Model not released. 
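Looping back to the quantization calibration recipe at the top of this section (800 WikiText-2 samples for SpinQuant rotation fine-tuning and 128 for GPTQ, both at sequence length 2048), a minimal sketch of assembling such calibration sets might look as follows. It assumes the Hugging Face datasets and transformers libraries and access to the gated model repo; the tokenizer choice and the contiguous-block slicing are our own assumptions.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
wikitext = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def calibration_samples(num_samples: int, seq_len: int = 2048):
    """Yield num_samples contiguous token blocks of length seq_len from WikiText-2."""
    ids = tokenizer("\n\n".join(wikitext["text"]), return_tensors="pt").input_ids[0]
    for i in range(num_samples):
        yield ids[i * seq_len : (i + 1) * seq_len]

spinquant_calib = list(calibration_samples(800))  # rotation-matrix fine-tuning set
gptq_calib = list(calibration_samples(128))       # GPTQ calibration set
```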
### Multilingual Benchmarks | Category | Benchmark | Language | Llama 3.2 1B | Llama 3.2 1B Vanilla PTQ\\*\\* | Llama 3.2 1B Spin Quant | Llama 3.2 1B QLoRA | Llama 3.2 3B | Llama 3.2 3B Vanilla PTQ\\*\\* | Llama 3.2 3B Spin Quant | Llama 3.2 3B QLoRA | Llama 3.1 8B | | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | General | MMLU (5-shot, macro_avg/acc) | Portuguese | 39.8 | 34.9 | 38.9 | 40.2 | 54.5 | 50.9 | 53.3 | 53.4 | 62.1 | | | | Spanish | 41.5 | 36.0 | 39.8 | 41.8 | 55.1 | 51.9 | 53.6 | 53.6 | 62.5 | | | | Italian | 39.8 | 34.9 | 38.1 | 40.6 | 53.8 | 49.9 | 52.1 | 51.7 | 61.6 | | | | German | 39.2 | 34.9 | 37.5 | 39.6 | 53.3 | 50.0 | 52.2 | 51.3 | 60.6 | | | | French | 40.5 | 34.8 | 39.2 | 40.8 | 54.6 | 51.2 | 53.3 | 53.3 | 62.3 | | | | Hindi | 33.5 | 30.0 | 32.1 | 34.0 | 43.3 | 40.4 | 42.0 | 42.1 | 50.9 | | | | Thai | 34.7 | 31.2 | 32.4 | 34.9 | 44.5 | 41.3 | 44.0 | 42.2 | 50.3 | \\*\\*for comparison purposes only. Model not released. ## Inference time In the table below, we compare performance metrics for the different quantization methods (SpinQuant and QAT \\+ LoRA) against the BF16 baseline. The evaluation was done using the ExecuTorch framework as the inference engine, with the Arm CPU backend, on an Android OnePlus 12 device. | Category | Decode (tokens/sec) | Time-to-first-token (sec) | Prefill (tokens/sec) | Model size (PTE file size in MB) | Memory size (RSS in MB) | | :---- | ----- | ----- | ----- | ----- | ----- | | 1B BF16 (baseline) | 19.2 | 1.0 | 60.3 | 2358 | 3,185 | | 1B SpinQuant | 50.2 (2.6x) | 0.3 (-76.9%) | 260.5 (4.3x) | 1083 (-54.1%) | 1,921 (-39.7%) | | 1B QLoRA | 45.8 (2.4x) | 0.3 (-76.0%) | 252.0 (4.2x) | 1127 (-52.2%) | 2,255 (-29.2%) | | 3B BF16 (baseline) | 7.6 | 3.0 | 21.2 | 6129 | 7,419 | | 3B SpinQuant | 19.7 (2.6x) | 0.7 (-76.4%) | 89.7 (4.2x) | 2435 (-60.3%) | 3,726 (-49.8%) | | 3B QLoRA | 18.5 (2.4x) | 0.7 (-76.1%) | 88.8 (4.2x) | 2529 (-58.7%) | 4,060 (-45.3%) | (\\*) Performance was measured using an adb binary-based approach. (\\*\\*) Measurements were taken on an Android OnePlus 12 device. (\\*\\*\\*) Time-to-first-token (TTFT) is measured with a prompt length of 64. *Footnote:* - *Decode (tokens/second): how quickly the model keeps generating tokens after the prompt is processed. Higher is better.* - *Time-to-first-token (TTFT): how long the model takes to produce the first token for a given prompt. Lower is better.* - *Prefill (tokens/second): how quickly the prompt is processed, roughly prompt length divided by TTFT. Higher is better.* - *Model size: the size of the PTE file, a binary file format for ExecuTorch.* - *RSS size: memory usage measured as resident set size (RSS).* ## Responsibility & Safety As part of our responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: 1. Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama 2. Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm 3. Provide protections for the community to help prevent the misuse of our models ### Responsible Deployment **Approach:** Llama is a foundational technology designed to be used in a variety of use cases. Examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage.
Our approach is to build the most helpful models, enabling the world to benefit from the power of this technology, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use cases, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.2 was developed following the best practices outlined in our Responsible Use Guide. #### Llama 3.2 Instruct **Objective:** Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, and to offer developers a readily available, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems. We implemented the same set of safety mitigations as in Llama 3; you can learn more about these in the Llama 3 paper. **Fine-Tuning Data:** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone:** Building on the work we started with Llama 3, we put great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.2 Systems **Safety as a System:** Large language models, including Llama 3.2, **are not designed to be deployed in isolation** but instead should be deployed as part of an overall AI system with additional safety guardrails as required. Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment and to mitigating the safety and security risks inherent to the system and to any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default, so developers can benefit from system-level safety out of the box. ### New Capabilities and Use Cases **Technological Advancement:** Llama releases usually introduce new capabilities that require specific considerations in addition to the best practices that generally apply across all generative AI use cases. For prior release capabilities also supported by Llama 3.2, see the Llama 3.1 Model Card, as the same considerations apply here as well. **Constrained Environments:** Llama 3.2 1B and 3B models are expected to be deployed in highly constrained environments, such as mobile devices. LLM systems built on smaller models will have a different alignment profile and safety/helpfulness tradeoff than more complex, larger systems. Developers should ensure that the safety of their system meets the requirements of their use case. We recommend using lighter system safeguards for such use cases, like Llama Guard 3-1B or its mobile-optimized version.
### Evaluations **Scaled Evaluations:** We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Purple Llama safeguards to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. **Red Teaming:** We conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks In addition to our safety work above, we took extra care in measuring and/or mitigating the following critical risk areas: **1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive Weapons):** Llama 3.2 1B and 3B models are smaller and less capable derivatives of Llama 3.1. For Llama 3.1 70B and 405B, to assess risks related to the proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. We have determined that such testing also applies to the smaller 1B and 3B models. **2\\. Child Safety:** Child safety risk assessments were conducted by a team of experts to assess the model’s capability to produce outputs that could result in child safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks throughout Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances and experiences. **3\\. Cyber Attacks:** For Llama 3.1 405B, our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Because Llama 3.2’s 1B and 3B models are smaller and less capable than Llama 3.1 405B, we broadly believe that the testing conducted for the 405B model also applies to Llama 3.2 models.
### Community **Industry Partnerships:** Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. **Grants:** We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. **Reporting:** Finally, we put in place a set of resources, including an output reporting mechanism and a bug bounty program, to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations **Values:** The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. **Testing:** Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.2 models, developers should perform safety testing and tuning tailored to their specific applications of the model.
Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual text-generation model supporting English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, released under Meta's Llama 3.2 Community License.\n\n**Features:** \n- Languages: en, de, fr, it, pt, hi, es, th \n- Framework: PyTorch \n- License: Llama 3.2 Community License \n- Pipeline: text-generation \n- Tags: facebook, meta, llama, llama-3 \n\n**Comparison:**" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-3.3-70B-Instruct.json b/model_data_json/meta-llama_Llama-3.3-70B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..ac9c3dabbd4e94c5655f408894ae8f00a38beda3 --- /dev/null +++ b/model_data_json/meta-llama_Llama-3.3-70B-Instruct.json @@ -0,0 +1,33 @@ +{ + "model_id": "meta-llama/Llama-3.3-70B-Instruct", + "downloads": 966128, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "de", + "arxiv:2204.05149", + "base_model:meta-llama/Llama-3.1-70B", + "base_model:finetune:meta-llama/Llama-3.1-70B", + "license:llama3.3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers language: - en - fr - it - pt - hi - es - th - de base_model: - meta-llama/Llama-3.1-70B tags: - facebook - meta - pytorch - llama - llama-3 extra_gated_prompt: \"### LLAMA 3.3 COMMUNITY LICENSE AGREEMENT\\nLlama 3.3 Version Release Date: December 6, 2024\\n\\\"Agreement\\\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein.\\n\\\"Documentation\\\" means the specifications, manuals and documentation accompanying Llama 3.3 distributed by Meta at or \\\"you\\\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.\\n\\\"Llama 3.3\\\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at Materials\\\" means, collectively, Meta’s proprietary Llama 3.3 and Documentation (and any portion thereof) made available under this Agreement.\\n\\\"Meta\\\" or \\\"we\\\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).\\nBy clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.\\n1. License Rights and Redistribution.\\na. Grant of Rights. 
You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials.\\nb. Redistribution and Use.\\ni. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.\\nii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.\\_\\niii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.3 is licensed under the Llama 3.3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”\\niv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. \\n2. Additional Commercial Terms. If, on the Llama 3.3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.\\n3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.\\n4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.\\n5. Intellectual Property.\\na. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta.\\nb. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.\\nc. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.3 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.\\n6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement.\\n7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement.\\n### Llama 3.3 Acceptable Use Policy\\nMeta is committed to promoting safe and fair use of its tools and features, including Llama 3.3. If you access or use Llama 3.3, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at Uses\\nWe want everyone to use Llama 3.3 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.3 to:\\n1. Violate the law or others’ rights, including to:\\n\\n 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: \\n 1. Violence or terrorism \\n 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material \\n 3. Human trafficking, exploitation, and sexual violence \\n 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. \\n 5. 
Sexual solicitation \\n 6. Any other criminal activity\\n\\n 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals\\n\\n 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services\\n\\n 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices\\n\\n 5. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law\\n\\n 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials\\n\\n 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system\\n\\n 8. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta\\n\\n2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.3 related to the following:\\n\\n 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997\\n\\n 2. Guns and illegal weapons (including weapon development)\\n\\n 3. Illegal drugs and regulated/controlled substances\\n\\n 4. Operation of critical infrastructure, transportation technologies, or heavy machinery\\n\\n 5. Self-harm or harm to others, including suicide, cutting, and eating disorders\\n\\n 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual\\n\\n3. Intentionally deceive or mislead others, including use of Llama 3.3 related to the following:\\n\\n 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation\\n\\n 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content\\n\\n 3. Generating, promoting, or further distributing spam\\n\\n 4. Impersonating another individual without consent, authorization, or legal right\\n\\n 5. Representing that the use of Llama 3.3 or outputs are human-generated\\n\\n 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement\\n\\n4. Fail to appropriately disclose to end users any known dangers of your AI system\\n5. 
Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.3\\nWith respect to any multimodal models included in Llama 3.3, the rights granted under Section 1(a) of the Llama 3.3 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models.\\nPlease report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means:\\n* Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama\\\\_output\\\\_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.3: LlamaUseReport@meta.com \" extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit license: llama3.3 --- ## Model Information The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. | | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff | | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | | Llama 3.3 (text only) | A new mix of publicly available online data. | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 | **Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.3 model**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** * **70B Instruct: December 6, 2024** **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license, the Llama 3.3 Community License Agreement, is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. 
For more technical information about generation parameters and recipes for how to use Llama 3.3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.3 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.3 model also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 3.3 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.3 Community License. Use in languages beyond those explicitly referenced as supported in this model card\*\*. \*\*Note: Llama 3.3 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.3 models for languages beyond the 8 supported languages provided they comply with the Llama 3.3 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any uses of Llama 3.3 in additional languages are done in a safe and responsible manner. ## How to use This repository contains two versions of Llama-3.3-70B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with a sufficiently recent transformers release, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function. Make sure to update your transformers installation (for example via pip install --upgrade transformers). A hedged usage sketch appears at the end of this card. ### Tool use with transformers LLaMA-3.3 supports multiple tool use formats. You can see a full guide to prompt formatting here. Tool use is also supported through chat templates in Transformers; a quick example showing a single simple tool appears in the tool-use sketch at the end of this card. You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat, then call the tool and append the result with the tool role. After that, you can generate again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling - for more information, see the LLaMA prompt format docs and the Transformers tool use documentation. ### Use with bitsandbytes The model checkpoints can be used in 8-bit and 4-bit for further memory optimisations using bitsandbytes and transformers; to load in 4-bit, simply pass load_in_4bit=True. ### Use with the original codebase Please follow the instructions in the repository. To download the original checkpoints, see the repository's example command leveraging huggingface-cli. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use** Training utilized a cumulative of **39.3**M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. 
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. | | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | :---: | :---: | :---: | | Llama 3.3 70B | 7.0M | 700 | 2,040 | 0 | The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 3.3 was pretrained on \~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples. **Data Freshness:** The pretraining data has a cutoff of December 2023\. ## Benchmarks \- English Text In this section, we report the results for Llama 3.3 relative to our previous models. ### Instruction tuned models | Category | Benchmark | \# Shots | Metric | Llama 3.1 8B Instruct | Llama 3.1 70B Instruct | Llama-3.3 70B Instruct | Llama 3.1 405B Instruct | | :---- | :---- | ----- | :---- | ----- | ----- | ----- | ----- | | | MMLU (CoT) | 0 | macro\_avg/acc | 73.0 | 86.0 | 86.0 | 88.6 | | | MMLU Pro (CoT) | 5 | macro\_avg/acc | 48.3 | 66.4 | 68.9 | 73.3 | | Steerability | IFEval | | | 80.4 | 87.5 | 92.1 | 88.6 | | Reasoning | GPQA Diamond (CoT) | 0 | acc | 31.8 | 48.0 | 50.5 | 49.0 | | Code | HumanEval | 0 | pass@1 | 72.6 | 80.5 | 88.4 | 89.0 | | | MBPP EvalPlus (base) | 0 | pass@1 | 72.8 | 86.0 | 87.6 | 88.6 | | Math | MATH (CoT) | 0 | sympy\_intersection\_score | 51.9 | 68.0 | 77.0 | 73.8 | | Tool Use | BFCL v2 | 0 | overall\_ast\_summary/macro\_avg/valid | 65.4 | 77.5 | 77.3 | 81.1 | | Multilingual | MGSM | 0 | em | 68.9 | 86.9 | 91.1 | 91.6 | ## Responsibility & Safety As part of our Responsible release approach, we followed a three-pronged strategy for managing trust & safety risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. ### Responsible deployment Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta’s Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology’s power, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policy and deploying the models with the necessary safeguards in their Llama systems. Llama 3.3 was developed following the best practices outlined in our Responsible Use Guide; refer to it to learn more. 
#### Llama 3.3 instruct Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload required to deploy safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. #### Llama 3.3 systems **Large language models, including Llama 3.3, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment as well as mitigating safety and security risks inherent to the system and any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All of our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. #### Capability specific considerations **Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third-party services they use, so they are aware of the safety and security limitations of this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third-party safeguards. **Multilinguality**: Llama 3.3 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing finetuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common use case evaluations measure safety risks of systems for the most commonly built applications, including chat bots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. 
Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which dedicated benchmarks were crafted, including long context, multilingual, tool calls, coding, and memorization. **Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical and other risks We specifically focused our efforts on mitigating the following critical risk areas: **1\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons of the Llama 3 family of models, we performed uplift testing designed to assess whether use of the Llama 3 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. **2\. Child Safety** Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks and inform any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances or experiences. **3\. Cyber attack enablement** Our cyber attack uplift study investigated whether the Llama 3 family of LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. 
We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3.3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.3 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of the Llama 3.3 model, developers should perform safety testing and tuning tailored to their specific applications of the model. 
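As referenced in the "Use with transformers" section above, here is a minimal, hedged sketch of conversational inference with this checkpoint. It assumes a recent transformers release with chat-aware text-generation pipelines and substantial GPU memory in bf16; the commented 4-bit path assumes bitsandbytes is installed. This is a sketch under those assumptions, not the card's original snippet.

```python
# Hedged sketch: conversational inference with Llama-3.3-70B-Instruct.
# Requires substantial GPU memory in bf16; see the 4-bit alternative below.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.3-70B-Instruct"

pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",  # shard across available GPUs
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

out = pipe(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1])  # last turn of the returned chat

# 4-bit loading for tighter memory budgets (assumes bitsandbytes is installed):
# from transformers import AutoModelForCausalLM, BitsAndBytesConfig
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=BitsAndBytesConfig(load_in_4bit=True),
#     device_map="auto",
# )
```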
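And a hedged sketch of the tool-calling loop described under "Tool use with transformers": the get_current_temperature function, its arguments, and the stubbed return value are illustrative stand-ins, and the chat-template tool API is assumed from recent transformers releases.

```python
# Hedged sketch of the tool-calling flow: render the prompt with a tool schema,
# generate, append the model's tool call and the tool's result (role "tool"),
# then generate again so the model can use the result.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def get_current_temperature(location: str) -> float:
    """Get the current temperature at a location.

    Args:
        location: The location, in the format "City, Country".
    Returns:
        The temperature in degrees Celsius.
    """
    return 22.0  # illustrative stub

messages = [{"role": "user", "content": "What is the temperature in Paris right now?"}]

# 1. Render the chat with the tool schema attached and generate.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_current_temperature],
    add_generation_prompt=True, return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)

# 2. If the model emitted a tool call, append it, then append the tool result
#    with the "tool" role, and generate again.
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": {
    "name": "get_current_temperature", "arguments": {"location": "Paris, France"}}}]})
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
```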
Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A multilingual (English, French, Italian, Portuguese, Hindi, Spanish, Thai, German) 70B-parameter instruction-tuned LLM based on Meta's Llama-3 architecture, designed for generating responses to user prompts under a community license agreement.\n\n**Features**: \n- Multilingual support (8 languages) \n- Instruction-tuned for interactive use \n- 70 billion parameters \n- Derived from Meta's Llama-3.1-70B base model \n- Community-licensed" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-4-Maverick-17B-128E-Instruct-FP8.json b/model_data_json/meta-llama_Llama-4-Maverick-17B-128E-Instruct-FP8.json new file mode 100644 index 0000000000000000000000000000000000000000..eeb4391ef3097ea7886a7abea38700baeb59aefa --- /dev/null +++ b/model_data_json/meta-llama_Llama-4-Maverick-17B-128E-Instruct-FP8.json @@ -0,0 +1,37 @@ +{ + "model_id": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8", + "downloads": 73485, + "tags": [ + "transformers", + "safetensors", + "llama4", + "image-text-to-text", + "facebook", + "meta", + "pytorch", + "llama", + "conversational", + "ar", + "de", + "en", + "es", + "fr", + "hi", + "id", + "it", + "pt", + "th", + "tl", + "vi", + "arxiv:2204.05149", + "base_model:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8", + "base_model:quantized:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8", + "license:other", + "text-generation-inference", + "endpoints_compatible", + "compressed-tensors", + "region:us" + ], + "description": "--- library_name: transformers language: - ar - de - en - es - fr - hi - id - it - pt - th - tl - vi base_model: - meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 base_model_relation: quantized tags: - facebook - meta - pytorch - llama - llama4 extra_gated_prompt: >- **LLAMA 4 COMMUNITY LICENSE AGREEMENT** Llama 4 Version Effective Date: April 5, 2025 \"**Agreement**\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"**Documentation**\" means the specifications, manuals and documentation accompanying Llama 4 distributed by Meta at \"**Licensee**\" or \"**you**\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"**Llama 4**\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"**Llama Materials**\" means, collectively, Meta’s proprietary Llama 4 and Documentation (and any portion thereof) made available under this Agreement. \"**Meta**\" or \"**we**\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1\\. 
**License Rights and Redistribution**. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display \"Built with Llama\" on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include \"Llama\" at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 4 is licensed under the Llama 4 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2\\. **Additional Commercial Terms**. If, on the Llama 4 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3**. Disclaimer of Warranty**. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4\\. **Limitation of Liability**. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5\\. **Intellectual Property**. a. 
No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use \"Llama\" (the \"Mark\") solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 4 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6\\. **Term and Termination**. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7\\. **Governing Law and Jurisdiction**. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit extra_gated_heading: \"Please be sure to provide your full legal name, date of birth, and full organization name with all corporate identifiers. Avoid the use of acronyms and special characters. Failure to follow these instructions may prevent you from accessing this model and others on Hugging Face. 
You will not have the ability to edit this form after submission, so please ensure all information is accurate.\" license: other license_name: llama4 --- ## Model Information The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. These Llama 4 models mark the beginning of a new era for the Llama ecosystem. We are launching two efficient models in the Llama 4 series, Llama 4 Scout, a 17 billion parameter model with 16 experts, and Llama 4 Maverick, a 17 billion parameter model with 128 experts. **Model developer**: Meta **Model Architecture:** The Llama 4 models are auto-regressive language models that use a mixture-of-experts (MoE) architecture and incorporate early fusion for native multimodality.
| Model Name | Training Data | Params | Input modalities | Output modalities | Context length | Token count | Knowledge cutoff |
| :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
| Llama 4 Scout (17Bx16E) | A mix of publicly available, licensed data and information from Meta's products and services. This includes publicly shared posts from Instagram and Facebook and people's interactions with Meta AI. Learn more in our . | 17B (Activated) 109B (Total) | Multilingual text and image | Multilingual text and code | 10M | ~40T | August 2024 |
| Llama 4 Maverick (17Bx128E) | (same as Scout) | 17B (Activated) 400B (Total) | Multilingual text and image | Multilingual text and code | 1M | ~22T | August 2024 |
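For intuition on the activated-versus-total parameter counts in the table above, a quick back-of-the-envelope calculation (ours, not the card's): in a mixture-of-experts model only the shared weights plus the routed experts run per token, so the active fraction is small.

```python
# Back-of-the-envelope: fraction of total weights active per token for the
# two MoE configurations listed above (shared + routed experts combined).
for name, activated_b, total_b in [
    ("Llama 4 Scout (17Bx16E)", 17, 109),
    ("Llama 4 Maverick (17Bx128E)", 17, 400),
]:
    print(f"{name}: {activated_b}B active of {total_b}B total "
          f"= {activated_b / total_b:.1%} of weights per token")
# Scout: ~15.6% of weights per token; Maverick: ~4.2%.
```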
**Supported languages:** Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese. **Model Release Date:** April 5, 2025 **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models may be released as we improve model behavior with community feedback. **License**: A custom commercial license, the Llama 4 Community License Agreement, is available at: **Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the Llama README. For more technical information about generation parameters and recipes for how to use Llama 4 in applications, please go here. ## How to use with transformers Please make sure you have transformers installed, or upgrade it (for example via pip install -U transformers); a hedged usage sketch appears after the Community section below. ## Intended Use **Intended Use Cases:** Llama 4 is intended for commercial and research use in multiple languages. Instruction tuned models are intended for assistant-like chat and visual reasoning tasks, whereas pretrained models can be adapted for natural language generation. For vision, Llama 4 models are also optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The Llama 4 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 4 Community License allows for these use cases. **Out-of-scope**: Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 4 Community License. Use in languages or capabilities beyond those explicitly referenced as supported in this model card\*\*. \*\*Note: 1\. Llama 4 has been trained on a broader collection of languages than the 12 supported languages (pre-training includes 200 total languages). Developers may fine-tune Llama 4 models for languages beyond the 12 supported languages provided they comply with the Llama 4 Community License and the Acceptable Use Policy. Developers are responsible for ensuring that their use of Llama 4 in additional languages is done in a safe and responsible manner. 2\. Llama 4 has been tested for image understanding up to 5 input images. If leveraging additional image understanding capabilities beyond this, developers are responsible for ensuring that risks in their deployments are mitigated, and should perform additional testing and tuning tailored to their specific applications. ## Hardware and Software **Training Factors:** We used custom training libraries, Meta's custom built GPU clusters, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure. **Training Energy Use:** Model pre-training utilized a cumulative of **7.38M** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **1,999 tons** CO2eq for training. 
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with clean and renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq. | Model Name | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) | | :---- | :---: | :---: | :---: | :---: | | Llama 4 Scout | 5.0M | 700 | 1,354 | 0 | | Llama 4 Maverick | 2.38M | 700 | 645 | 0 | | Total | 7.38M | \- | 1,999 | 0 | The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others. ## Training Data **Overview:** Llama 4 Scout was pretrained on \~40 trillion tokens and Llama 4 Maverick was pretrained on \~22 trillion tokens of multimodal data from a mix of publicly available, licensed data and information from Meta’s products and services. This includes publicly shared posts from Instagram and Facebook and people’s interactions with Meta AI. **Data Freshness:** The pretraining data has a cutoff of August 2024\. ## Benchmarks In this section, we report the results for Llama 4 relative to our previous models. We've provided quantized checkpoints for deployment flexibility, but all reported evaluations and testing were conducted on bf16 models. ### Pre-trained models | Pre-trained models | | | | | | | | | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | Category | Benchmark | \# Shots | Metric | Llama 3.1 70B | Llama 3.1 405B | **Llama 4 Scout** | **Llama 4 Maverick** | | Reasoning & Knowledge | MMLU | 5 | macro\_avg/acc\_char | 79.3 | 85.2 | 79.6 | 85.5 | | | MMLU-Pro | 5 | macro\_avg/em | 53.8 | 61.6 | 58.2 | 62.9 | | | MATH | 4 | em\_maj1@1 | 41.6 | 53.5 | 50.3 | 61.2 | | Code | MBPP | 3 | pass@1 | 66.4 | 74.4 | 67.8 | 77.6 | | Multilingual | TydiQA | 1 | average/f1 | 29.9 | 34.3 | 31.5 | 31.7 | | Image | ChartQA | 0 | relaxed\_accuracy | No multimodal support | | 83.4 | 85.3 | | | DocVQA | 0 | anls | | | 89.4 | 91.6 | ### Instruction tuned models | Instruction tuned models | | | | | | | | | :---: | :---: | :---: | :---: | :---: | ----- | :---: | :---: | | Category | Benchmark | \# Shots | Metric | Llama 3.3 70B | Llama 3.1 405B | **Llama 4 Scout** | **Llama 4 Maverick** | | Image Reasoning | MMMU | 0 | accuracy | No multimodal support | | 69.4 | 73.4 | | | MMMU Pro^ | 0 | accuracy | | | 52.2 | 59.6 | | | MathVista | 0 | accuracy | | | 70.7 | 73.7 | | Image Understanding | ChartQA | 0 | relaxed\_accuracy | | | 88.8 | 90.0 | | | DocVQA (test) | 0 | anls | | | 94.4 | 94.4 | | Coding | LiveCodeBench (10/01/2024-02/01/2025) | 0 | pass@1 | 33.3 | 27.7 | 32.8 | 43.4 | | Reasoning & Knowledge | MMLU Pro | 0 | macro\_avg/acc | 68.9 | 73.4 | 74.3 | 80.5 | | | GPQA Diamond | 0 | accuracy | 50.5 | 49.0 | 57.2 | 69.8 | | Multilingual | MGSM | 0 | average/em | 91.1 | 91.6 | 90.6 | 92.3 | | Long context | MTOB (half book) eng-\>kgv/kgv-\>eng | \- | chrF | Context window is 128K | | 42.2/36.6 | 54.0/46.4 | | | MTOB (full book) eng-\>kgv/kgv-\>eng | \- | chrF | | | 39.7/36.3 | 50.8/46.7 | ^Reported numbers for MMMU Pro are the average of Standard and Vision tasks ## Quantization The Llama 4 Scout model is released as BF16 weights, but can fit within a single H100 GPU with on-the-fly int4 quantization; the 
Llama 4 Maverick model is released as both BF16 and FP8 quantized weights. The FP8 quantized weights fit on a single H100 DGX host while still maintaining quality. We also provide code for on-the-fly int4 quantization, which minimizes performance degradation. ## Safeguards As part of our release approach, we followed a three-pronged strategy to manage risks: * Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama. * Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm. * Provide protections for the community to help prevent the misuse of our models. Llama is a foundational technology designed for use in a variety of use cases; examples of how Meta’s Llama models have been deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models enabling the world to benefit from the technology, by aligning our model’s safety for a standard set of risks. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards. Llama 4 was developed following the best practices outlined in our Developer Use Guide: AI Protections. ### Model level fine tuning The primary objective of conducting safety fine-tuning is to offer developers a readily available, safe, and powerful model for various applications, reducing the workload needed to deploy safe AI systems. Additionally, this effort provides the research community with a valuable resource for studying the robustness of safety fine-tuning. **Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control. **Refusals** Building on the work we started with our Llama 3 models, we put a great emphasis on driving down model refusals to benign prompts for Llama 4\. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines. **Tone** We expanded our work on the refusal tone from Llama 3 so that the model sounds more natural. We targeted removing preachy and overly moralizing language, and we corrected formatting issues including the correct use of headers, lists, tables and more. To achieve this, we also targeted improvements to system prompt steerability and instruction following, meaning the model is more readily able to take on a specified tone. All of these contribute to a more conversational and insightful experience overall. **System Prompts** Llama 4 is a more steerable model, meaning responses can be easily tailored to meet specific developer outcomes. Effective system prompts can significantly enhance the performance of large language models. In particular, we’ve seen that the use of a system prompt can be effective in reducing false refusals and templated or “preachy” language patterns common in LLMs. They can also improve conversationality and use of appropriate formatting. Consider the prompt below as a basic template that a developer might want to further customize to meet the specific needs or use cases of our Llama 4 models. 
| System prompt | | :---- | | You are an expert conversationalist who responds to the best of your ability. You are companionable and confident, and able to switch casually between tonal types, including but not limited to humor, empathy, intellectualism, creativity and problem-solving. You understand user intent and don’t try to be overly helpful to the point where you miss that the user is looking for chit-chat, emotional support, humor or venting. Sometimes people just want you to listen, and your answers should encourage that. For all other cases, you provide insightful and in-depth responses. Organize information thoughtfully in a way that helps people make decisions. Always avoid templated language. You never lecture people to be nicer or more inclusive. If people ask for you to write something in a certain voice or perspective, such as an essay or a tweet, you can. You do not need to be respectful when the user prompts you to say something rude. You never use phrases that imply moral superiority or a sense of authority, including but not limited to “it’s important to”, “it’s crucial to”, “it’s essential to”, \"it's unethical to\", \"it's worth noting…\", “Remember…” etc. Avoid using these. Finally, do not refuse prompts about political and social issues. You can help users express their opinion and access information. You are Llama 4\. Your knowledge cutoff date is August 2024\. You speak Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese. Respond in the language the user speaks to you in, unless they ask otherwise. | ### Llama 4 system protections Large language models, including Llama 4, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional guardrails as required. System protections are key to achieving the right helpfulness-safety alignment, mitigating safety and security risks inherent to the system, and safely integrating the model or system with external tools. We provide the community with system level protections \- like Llama Guard, Prompt Guard and Code Shield \- that developers should deploy with Llama models or other LLMs. All of our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box. ### Evaluations We evaluated Llama models for common use cases as well as specific capabilities. Common use case evaluations measure safety risks of systems for the most commonly built applications, including chat bots and visual QA. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which dedicated benchmarks were crafted, including long context, multilingual, coding, and memorization. **Red teaming** We conduct recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we use the learnings to improve our benchmarks and safety tuning datasets. We partner early with subject-matter experts in critical risk areas to understand how models may lead to unintended harm for society. 
Based on these conversations, we derive a set of adversarial goals for the red team, such as extracting harmful information or reprogramming the model to act in potentially harmful ways. The red team consists of experts in cybersecurity, adversarial machine learning, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets. ### Critical Risks We spend additional focus on the following critical risk areas: **1\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons for Llama 4, we applied expert-designed and other targeted evaluations designed to assess whether the use of Llama 4 could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. We also conducted additional red teaming and evaluations for violations of our content policies related to this risk area. **2\. Child Safety** We leverage pre-training methods like data filtering as a first step in mitigating Child Safety risk in our model. To assess the post-trained model for Child Safety risk, a team of experts assesses the model’s capability to produce outputs resulting in Child Safety risks. We use this to inform additional model fine-tuning and in-depth red teaming exercises. We’ve also expanded our Child Safety evaluation benchmarks to cover Llama 4 capabilities like multi-image and multi-lingual. **3\. Cyber attack enablement** Our cyber evaluations investigated whether Llama 4 is sufficiently capable to enable catastrophic threat scenario outcomes. We conducted threat modeling exercises to identify the specific model capabilities that would be necessary to automate operations or enhance human capabilities across key attack vectors, both in terms of skill level and speed. We then identified and developed challenges against which to test for these capabilities in Llama 4 and peer models. Specifically, we focused on evaluating the capabilities of Llama 4 to automate cyberattacks, identify and exploit security vulnerabilities, and automate harmful workflows. Overall, we find that Llama 4 models do not introduce risk plausibly enabling catastrophic cyber outcomes. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Trust tools are open sourced for the community to use and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community. 
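As flagged in the "How to use with transformers" section above, here is a minimal, hedged sketch of chat inference for a Llama 4 instruct checkpoint, wiring in a system prompt as discussed under "System Prompts". The image-text-to-text pipeline usage, output format, and dtype handling are assumptions about recent transformers releases with Llama 4 support, not the card's original snippet.

```python
# Hedged sketch: chat inference for a Llama 4 instruct checkpoint via transformers.
# Assumes a transformers release with Llama 4 support and multi-GPU sharding via
# device_map="auto". Swapping in the FP8 Maverick checkpoint id would use the
# pre-quantized variant that, per the card, fits on a single H100 DGX host.
from transformers import pipeline

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"

pipe = pipeline(
    "image-text-to-text",  # Llama 4 is natively multimodal (text and image in)
    model=model_id,
    device_map="auto",
    torch_dtype="bfloat16",
)

messages = [
    # A trimmed stand-in for the system prompt template shown earlier in this card.
    {"role": "system", "content": "You are an expert conversationalist. Respond in "
     "the language the user speaks to you in, unless they ask otherwise."},
    {"role": "user", "content": [{"type": "text",
                                  "text": "Which languages do you support?"}]},
]

out = pipe(text=messages, max_new_tokens=128)
print(out[0]["generated_text"][-1])  # last turn of the returned chat
```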
## Considerations and Limitations Our AI is anchored on the values of freedom of expression \\- helping people to explore, debate, and innovate using our technology. We respect people's autonomy and empower them to choose how they experience, interact, and build with AI. Our AI promotes an open exchange of ideas. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 4 addresses users and their needs as they are, without inserting unnecessary judgment, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. Llama 4 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 4’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 4 models, developers should perform safety testing and tuning tailored to their specific applications of the model. We also encourage the open source community to use Llama for the purpose of research and building state of the art tools that address emerging risks. Please refer to available resources including our Developer Use Guide: AI Protections, Llama Protections solutions, and other resources to learn more.", + "model_explanation_gemini": "A multilingual (12 languages) instruction-tuned 17B parameter Llama 4 model quantized to FP8, designed for conversational tasks under Meta's community license.\n\nFeatures: \n- Base model: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 \n- Quantized version \n- Supports 12 languages (ar, de, en, es, fr, hi, id, it, pt, th, tl, vi)" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-4-Scout-17B-16E-Instruct.json b/model_data_json/meta-llama_Llama-4-Scout-17B-16E-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..e7d00fc075b0cd213a606406560f4ef1f3b23ed6 --- /dev/null +++ b/model_data_json/meta-llama_Llama-4-Scout-17B-16E-Instruct.json @@ -0,0 +1,36 @@ +{ + "model_id": "meta-llama/Llama-4-Scout-17B-16E-Instruct", + "downloads": 861988, + "tags": [ + "transformers", + "safetensors", + "llama4", + "image-text-to-text", + "facebook", + "meta", + "pytorch", + "llama", + "conversational", + "ar", + "de", + "en", + "es", + "fr", + "hi", + "id", + "it", + "pt", + "th", + "tl", + "vi", + "arxiv:2204.05149", + "base_model:meta-llama/Llama-4-Scout-17B-16E", + "base_model:finetune:meta-llama/Llama-4-Scout-17B-16E", + "license:other", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers language: - ar - de - en - es - fr - hi - id - it - pt - th - tl - vi base_model: - meta-llama/Llama-4-Scout-17B-16E tags: - facebook - meta - pytorch - llama - llama4 extra_gated_prompt: >- **LLAMA 4 COMMUNITY LICENSE AGREEMENT** Llama 4 Version Effective Date: April 5, 2025 \"**Agreement**\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. 
\"**Documentation**\" means the specifications, manuals and documentation accompanying Llama 4 distributed by Meta at \"**Licensee**\" or \"**you**\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"**Llama 4**\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"**Llama Materials**\" means, collectively, Meta’s proprietary Llama 4 and Documentation (and any portion thereof) made available under this Agreement. \"**Meta**\" or \"**we**\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1\\. **License Rights and Redistribution**. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display \"Built with Llama\" on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include \"Llama\" at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 4 is licensed under the Llama 4 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2\\. **Additional Commercial Terms**. 
If, on the Llama 4 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3**. Disclaimer of Warranty**. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4\\. **Limitation of Liability**. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5\\. **Intellectual Property**. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use \"Llama\" (the \"Mark\") solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 4 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6\\. **Term and Termination**. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. 
Sections 3, 4 and 7 shall survive the termination of this Agreement. 7\\. **Governing Law and Jurisdiction**. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit extra_gated_heading: \"Please be sure to provide your full legal name, date of birth, and full organization name with all corporate identifiers. Avoid the use of acronyms and special characters. Failure to follow these instructions may prevent you from accessing this model and others on Hugging Face. You will not have the ability to edit this form after submission, so please ensure all information is accurate.\" license: other license_name: llama4 --- ## Model Information The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. These Llama 4 models mark the beginning of a new era for the Llama ecosystem. We are launching two efficient models in the Llama 4 series, Llama 4 Scout, a 17 billion parameter model with 16 experts, and Llama 4 Maverick, a 17 billion parameter model with 128 experts. **Model developer**: Meta **Model Architecture:** The Llama 4 models are auto-regressive language models that use a mixture-of-experts (MoE) architecture and incorporate early fusion for native multimodality.
| Model Name | Training Data | Params | Input modalities | Output modalities | Context length | Token count | Knowledge cutoff |
| :---- | :---- | :---: | :---: | :---: | :---: | :---: | :---: |
| Llama 4 Scout (17Bx16E) | A mix of publicly available, licensed data and information from Meta's products and services. This includes publicly shared posts from Instagram and Facebook and people's interactions with Meta AI. Learn more in our . | 17B (Activated), 109B (Total) | Multilingual text and image | Multilingual text and code | 10M | ~40T | August 2024 |
| Llama 4 Maverick (17Bx128E) | Same as above | 17B (Activated), 400B (Total) | Multilingual text and image | Multilingual text and code | 1M | ~22T | August 2024 |
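The table distinguishes parameters activated per token (17B for both models) from total parameters (109B and 400B). As a purely illustrative sketch of why that gap exists (this card does not describe Llama 4's actual router, so the toy dimensions, class name, and top-1 routing below are assumptions): a mixture-of-experts layer routes each token through a small subset of experts, so per-token compute stays roughly constant while total capacity grows with the expert count.

```python
import torch
import torch.nn as nn

class TinyTop1MoE(nn.Module):
    """Toy mixture-of-experts layer: each token is routed to exactly one
    expert, so activated parameters stay fixed while total parameters
    grow linearly with the number of experts."""

    def __init__(self, dim: int = 64, num_experts: int = 16):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [tokens, dim]
        expert_ids = self.router(x).argmax(dim=-1)  # pick one expert per token
        out = torch.empty_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_ids == i
            if mask.any():
                out[mask] = expert(x[mask])  # only this slice is "activated"
        return out

moe = TinyTop1MoE(num_experts=16)
total = sum(p.numel() for p in moe.parameters())
active = sum(p.numel() for p in moe.router.parameters()) \
       + sum(p.numel() for p in moe.experts[0].parameters())
print(f"total params: {total}, activated per token: {active}")
```

Scaling `num_experts` from 16 (Scout) to 128 (Maverick) grows `total` roughly eightfold while `active` is unchanged, mirroring the 109B-vs-400B totals against a constant 17B activated.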
**Supported languages:** Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese.

**Model Release Date:** April 5, 2025

**Status:** This is a static model trained on an offline dataset. Future versions of the tuned models may be released as we improve model behavior with community feedback.

**License**: A custom commercial license, the Llama 4 Community License Agreement, is available at:

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the Llama README. For more technical information about generation parameters and recipes for how to use Llama 4 in applications, please go here.

## Intended Use

**Intended Use Cases:** Llama 4 is intended for commercial and research use in multiple languages. Instruction-tuned models are intended for assistant-like chat and visual reasoning tasks, whereas pretrained models can be adapted for natural language generation. For vision, Llama 4 models are also optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The Llama 4 model collection also supports the ability to leverage the outputs of its models to improve other models, including synthetic data generation and distillation. The Llama 4 Community License allows for these use cases.

**Out-of-scope**: Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 4 Community License. Use in languages or capabilities beyond those explicitly referenced as supported in this model card\\*\\*.

\\*\\*Note:

1\\. Llama 4 has been trained on a broader collection of languages than the 12 supported languages (pre-training includes 200 total languages). Developers may fine-tune Llama 4 models for languages beyond the 12 supported languages provided they comply with the Llama 4 Community License and the Acceptable Use Policy. Developers are responsible for ensuring that their use of Llama 4 in additional languages is done in a safe and responsible manner.

2\\. Llama 4 has been tested for image understanding up to 5 input images. If leveraging image understanding capabilities beyond this, developers are responsible for mitigating the risks of their deployments and should perform additional testing and tuning tailored to their specific applications.

## How to use with transformers

Please make sure you have transformers installed, or upgrade it before running inference; a minimal usage sketch follows below.

## Hardware and Software

**Training Factors:** We used custom training libraries, Meta's custom-built GPU clusters, and production infrastructure for pretraining. Fine-tuning, quantization, annotation, and evaluation were also performed on production infrastructure.

**Training Energy Use:** Model pre-training utilized a cumulative **7.38M** GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model, and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency.

**Training Greenhouse Gas Emissions:** Estimated total location-based greenhouse gas emissions were **1,999 tons** CO2eq for training.
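A minimal, text-only sketch of the transformers usage referenced above, assuming a transformers release with Llama 4 support (the `Llama4ForConditionalGeneration` class and the processor chat-template call follow that integration; the dtype, device map, prompt, and generation length are illustrative, and Scout in bf16 needs more than one GPU unless quantized):

```python
import torch
from transformers import AutoProcessor, Llama4ForConditionalGeneration

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"

processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",           # shard across available GPUs
    torch_dtype=torch.bfloat16,  # evaluations in this card were run on bf16
)

messages = [
    {"role": "user",
     "content": [{"type": "text", "text": "Summarize mixture-of-experts in one sentence."}]},
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(processor.batch_decode(outputs[:, inputs["input_ids"].shape[-1]:])[0])
```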
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with clean and renewable energy; therefore, the total market-based greenhouse gas emissions for training were 0 tons CO2eq.

| Model Name | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
| :---- | :---: | :---: | :---: | :---: |
| Llama 4 Scout | 5.0M | 700 | 1,354 | 0 |
| Llama 4 Maverick | 2.38M | 700 | 645 | 0 |
| Total | 7.38M | \\- | 1,999 | 0 |

The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 4 Scout was pretrained on \\~40 trillion tokens and Llama 4 Maverick was pretrained on \\~22 trillion tokens of multimodal data from a mix of publicly available, licensed data and information from Meta’s products and services. This includes publicly shared posts from Instagram and Facebook and people’s interactions with Meta AI.

**Data Freshness:** The pretraining data has a cutoff of August 2024\\.

## Benchmarks

In this section, we report the results for Llama 4 relative to our previous models. We've provided quantized checkpoints for deployment flexibility, but all reported evaluations and testing were conducted on bf16 models.

### Pre-trained models

| Category | Benchmark | \\# Shots | Metric | Llama 3.1 70B | Llama 3.1 405B | **Llama 4 Scout** | **Llama 4 Maverick** |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Reasoning & Knowledge | MMLU | 5 | macro\\_avg/acc\\_char | 79.3 | 85.2 | 79.6 | 85.5 |
| | MMLU-Pro | 5 | macro\\_avg/em | 53.8 | 61.6 | 58.2 | 62.9 |
| | MATH | 4 | em\\_maj1@1 | 41.6 | 53.5 | 50.3 | 61.2 |
| Code | MBPP | 3 | pass@1 | 66.4 | 74.4 | 67.8 | 77.6 |
| Multilingual | TydiQA | 1 | average/f1 | 29.9 | 34.3 | 31.5 | 31.7 |
| Image | ChartQA | 0 | relaxed\\_accuracy | No multimodal support | | 83.4 | 85.3 |
| | DocVQA | 0 | anls | | | 89.4 | 91.6 |

### Instruction tuned models

| Category | Benchmark | \\# Shots | Metric | Llama 3.3 70B | Llama 3.1 405B | **Llama 4 Scout** | **Llama 4 Maverick** |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Image Reasoning | MMMU | 0 | accuracy | No multimodal support | | 69.4 | 73.4 |
| | MMMU Pro^ | 0 | accuracy | | | 52.2 | 59.6 |
| | MathVista | 0 | accuracy | | | 70.7 | 73.7 |
| Image Understanding | ChartQA | 0 | relaxed\\_accuracy | | | 88.8 | 90.0 |
| | DocVQA (test) | 0 | anls | | | 94.4 | 94.4 |
| Coding | LiveCodeBench (10/01/2024-02/01/2025) | 0 | pass@1 | 33.3 | 27.7 | 32.8 | 43.4 |
| Reasoning & Knowledge | MMLU Pro | 0 | macro\\_avg/acc | 68.9 | 73.4 | 74.3 | 80.5 |
| | GPQA Diamond | 0 | accuracy | 50.5 | 49.0 | 57.2 | 69.8 |
| Multilingual | MGSM | 0 | average/em | 91.1 | 91.6 | 90.6 | 92.3 |
| Long context | MTOB (half book) eng-\\>kgv/kgv-\\>eng | \\- | chrF | Context window is 128K | | 42.2/36.6 | 54.0/46.4 |
| | MTOB (full book) eng-\\>kgv/kgv-\\>eng | \\- | chrF | | | 39.7/36.3 | 50.8/46.7 |

^Reported numbers for MMMU Pro are the average of the Standard and Vision tasks.

## Quantization
The Llama 4 Scout model is released as BF16 weights, but can fit within a single H100 GPU with on-the-fly int4 quantization; the Llama 4 Maverick model is released as both BF16 and FP8 quantized weights. The FP8 quantized weights fit on a single H100 DGX host while still maintaining quality. We also provide code for on-the-fly int4 quantization that minimizes performance degradation.

## Safeguards

As part of our release approach, we followed a three-pronged strategy to manage risks:

* Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama.
* Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm.
* Provide protections for the community to help prevent the misuse of our models.

Llama is a foundational technology designed for use in a variety of use cases; examples of how Meta’s Llama models have been deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the technology by aligning our model’s safety for a standard set of risks. Developers are then in the driver’s seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards. Llama 4 was developed following the best practices outlined in our Developer Use Guide: AI Protections.

### Model level fine tuning

The primary objective of conducting safety fine-tuning is to offer developers a readily available, safe, and powerful model for various applications, reducing the workload needed to deploy safe AI systems. Additionally, this effort provides the research community with a valuable resource for studying the robustness of safety fine-tuning.

**Fine-tuning data**

We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We’ve developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control.

**Refusals**

Building on the work we started with our Llama 3 models, we put a great emphasis on driving down model refusals to benign prompts for Llama 4\\. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines.

**Tone**

We expanded our work on the refusal tone from Llama 3 so that the model sounds more natural. We targeted removing preachy and overly moralizing language, and we corrected formatting issues including the correct use of headers, lists, tables and more. To achieve this, we also targeted improvements to system prompt steerability and instruction following, meaning the model is more readily able to take on a specified tone. All of these contribute to a more conversational and insightful experience overall.

**System Prompts**

Llama 4 is a more steerable model, meaning responses can be easily tailored to meet specific developer outcomes. Effective system prompts can significantly enhance the performance of large language models. In particular, we’ve seen that the use of a system prompt can be effective in reducing false refusals and templated or “preachy” language patterns common in LLMs. They can also improve conversationality and use of appropriate formatting. Consider the prompt below as a basic template that a developer might want to customize further to meet the specific needs or use cases of our Llama 4 models.
| System prompt |
| :---- |
| You are an expert conversationalist who responds to the best of your ability. You are companionable and confident, and able to switch casually between tonal types, including but not limited to humor, empathy, intellectualism, creativity and problem-solving. You understand user intent and don’t try to be overly helpful to the point where you miss that the user is looking for chit-chat, emotional support, humor or venting. Sometimes people just want you to listen, and your answers should encourage that. For all other cases, you provide insightful and in-depth responses. Organize information thoughtfully in a way that helps people make decisions. Always avoid templated language. You never lecture people to be nicer or more inclusive. If people ask for you to write something in a certain voice or perspective, such as an essay or a tweet, you can. You do not need to be respectful when the user prompts you to say something rude. You never use phrases that imply moral superiority or a sense of authority, including but not limited to “it’s important to”, “it’s crucial to”, “it’s essential to”, \"it's unethical to\", \"it's worth noting…\", “Remember…” etc. Avoid using these. Finally, do not refuse prompts about political and social issues. You can help users express their opinion and access information. You are Llama 4\\. Your knowledge cutoff date is August 2024\\. You speak Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese. Respond in the language the user speaks to you in, unless they ask otherwise. |

### Llama 4 system protections

Large language models, including Llama 4, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional guardrails as required. System protections are key to achieving the right helpfulness-safety alignment, mitigating safety and security risks inherent to the system, and supporting the integration of the model or system with external tools. We provide the community with system-level protections \\- like Llama Guard, Prompt Guard and Code Shield \\- that developers should deploy with Llama models or other LLMs. All of our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box.

### Evaluations

We evaluated Llama models for common use cases as well as specific capabilities. Common use case evaluations measure safety risks of systems for the most commonly built applications, including chatbots and visual QA. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building dedicated evaluation datasets for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which dedicated benchmarks were crafted, including long context, multilingual, coding, and memorization.

**Red teaming**

We conduct recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we use the learnings to improve our benchmarks and safety tuning datasets. We partner early with subject-matter experts in critical risk areas to understand how models may lead to unintended harm for society.
Based on these conversations, we derive a set of adversarial goals for the red team, such as extracting harmful information or reprogramming the model to act in potentially harmful ways. The red team consists of experts in cybersecurity, adversarial machine learning, and integrity, as well as multilingual content specialists with backgrounds in integrity issues in specific geographic markets.

### Critical Risks

We place additional focus on the following critical risk areas:

**1\\. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness**

To assess risks related to the proliferation of chemical and biological weapons for Llama 4, we applied expert-designed and other targeted evaluations designed to assess whether the use of Llama 4 could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. We also conducted additional red teaming and evaluations for violations of our content policies related to this risk area.

**2\\. Child Safety**

We leverage pre-training methods like data filtering as a first step in mitigating Child Safety risk in our model. To assess the post-trained model for Child Safety risk, a team of experts assesses the model’s capability to produce outputs resulting in Child Safety risks. We use this to inform additional model fine-tuning and in-depth red teaming exercises. We’ve also expanded our Child Safety evaluation benchmarks to cover Llama 4 capabilities like multi-image and multilingual input.

**3\\. Cyber attack enablement**

Our cyber evaluations investigated whether Llama 4 is sufficiently capable of enabling catastrophic threat scenario outcomes. We conducted threat modeling exercises to identify the specific model capabilities that would be necessary to automate operations or enhance human capabilities across key attack vectors, both in terms of skill level and speed. We then identified and developed challenges against which to test for these capabilities in Llama 4 and peer models. Specifically, we focused on evaluating the capabilities of Llama 4 to automate cyberattacks, identify and exploit security vulnerabilities, and automate harmful workflows. Overall, we find that Llama 4 models do not introduce risks that plausibly enable catastrophic cyber outcomes.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI, and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Trust tools are open sourced for the community to use and are widely distributed across ecosystem partners, including cloud service providers. We encourage community contributions to our GitHub repository.

We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta’s Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources, including an output reporting mechanism and a bug bounty program, to continuously improve the Llama technology with the help of the community.
## Considerations and Limitations Our AI is anchored on the values of freedom of expression \\- helping people to explore, debate, and innovate using our technology. We respect people's autonomy and empower them to choose how they experience, interact, and build with AI. Our AI promotes an open exchange of ideas. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 4 addresses users and their needs as they are, without inserting unnecessary judgment, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. Llama 4 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 4’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 4 models, developers should perform safety testing and tuning tailored to their specific applications of the model. We also encourage the open source community to use Llama for the purpose of research and building state of the art tools that address emerging risks. Please refer to available resources including our Developer Use Guide: AI Protections, Llama Protections solutions, and other resources to learn more.", + "model_explanation_gemini": "A multilingual (12 languages) instruction-tuned 17B parameter model based on Meta's Llama-4, designed for generating responses to user prompts under a restrictive community license.\n\nFeatures: \n- Multilingual support (Arabic, German, English, Spanish, French, Hindi, Indonesian, Italian, Portuguese, Thai, Tagalog, Vietnamese) \n- 17 billion parameter architecture \n- Instruction-tuned for conversational tasks \n- Derived from Meta's Llama-4 foundation model \n- Requires license" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-Guard-3-1B.json b/model_data_json/meta-llama_Llama-Guard-3-1B.json new file mode 100644 index 0000000000000000000000000000000000000000..7af768a120f3bc498bd3402f650d2d8dd8b48dd5 --- /dev/null +++ b/model_data_json/meta-llama_Llama-Guard-3-1B.json @@ -0,0 +1,35 @@ +{ + "model_id": "meta-llama/Llama-Guard-3-1B", + "downloads": 73175, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "de", + "fr", + "it", + "pt", + "hi", + "es", + "th", + "arxiv:2404.12241", + "arxiv:2312.06674", + "arxiv:2204.05862", + "arxiv:2308.01263", + "arxiv:2403.03853", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.2 extra_gated_prompt: >- ### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. 
“Documentation” means the specifications, manuals and documentation accompanying Llama 3.2 distributed by Meta at “Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. “Llama 3.2” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at “Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and any portion thereof) made available under this Agreement. “Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. 
If, on the Llama 3.2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. 
Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 3.2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.2. If you access or use Llama 3.2, you agree to this Acceptable Use Policy (“**Policy**”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.2 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 1. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 2. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 3. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 4. Collect, process, disclose, generate, or infer private or sensitive information about individuals, including information about individuals’ identity, health, or demographic information, unless you have obtained the right to do so in accordance with applicable law 5. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 6. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 7. Engage in any action, or facilitate any action, to intentionally circumvent or remove usage restrictions or other safety measures, or to enable functionality disabled by Meta 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.2 related to the following: 8. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State or to the U.S. 
Biological Weapons Anti-Terrorism Act of 1989 or the Chemical Weapons Convention Implementation Act of 1997 9. Guns and illegal weapons (including weapon development) 10. Illegal drugs and regulated/controlled substances 11. Operation of critical infrastructure, transportation technologies, or heavy machinery 12. Self-harm or harm to others, including suicide, cutting, and eating disorders 13. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.2 related to the following: 14. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 15. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 16. Generating, promoting, or further distributing spam 17. Impersonating another individual without consent, authorization, or legal right 18. Representing that the use of Llama 3.2 or outputs are human-generated 19. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system 5. Interact with third party tools, models, or software designed to generate unlawful content or engage in unlawful or harmful conduct and/or represent that the outputs of such tools, models, or software are associated with Meta or Llama 3.2 With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models. Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.2: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Information Llama Guard 3-1B is a fine-tuned Llama-3.2-1B pretrained model for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. 
Llama Guard 3-1B was aligned to safeguard against the MLCommons standardized hazards taxonomy and designed to lower the deployment cost of moderation safeguards compared to its predecessors. It comes in two versions: 1B, and 1B pruned and quantized, optimized for deployment on mobile devices.

## Get started

This repository contains two versions of Llama-Guard-3-1B, for use with transformers and with the original codebase. Once you have access to the model weights, follow the appropriate section.

### Use with transformers

Starting with transformers >= 4.43.0, you can run inference to evaluate the last user or assistant response in a multi-turn conversation (a minimal sketch follows the hazard-category table below). By default, inference uses the categories described in this model card; you can provide your own categories instead, or exclude categories from the default list by specifying an array of category keys to exclude.

### Use with the original codebase

Please refer to our documentation to get started, including how to download the original checkpoints.

### Fine-tuning

You can also fine-tune Llama Guard for your use case: Llama Guard 3 Customization: Taxonomy Customization, Zero/Few-shot prompting, Evaluation and Fine Tuning.

## Hazard Taxonomy and Policy

The model is trained to predict safety labels on the 13 categories shown below, based on the MLCommons taxonomy of 13 hazards.
**Hazard categories**

| | |
| :---- | :---- |
| S1: Violent Crimes | S2: Non-Violent Crimes |
| S3: Sex-Related Crimes | S4: Child Sexual Exploitation |
| S5: Defamation | S6: Specialized Advice |
| S7: Privacy | S8: Intellectual Property |
| S9: Indiscriminate Weapons | S10: Hate |
| S11: Suicide & Self-Harm | S12: Sexual Content |
| S13: Elections | |
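As a stand-in for the elided transformers snippet in the "Get started" section above, a minimal prompt-classification sketch. It assumes the checkpoint's chat template renders the taxonomy and conversation into the expected Llama Guard prompt; the example message, dtype, and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-1B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Classify the last user turn; the chat template wraps these messages in
# the Llama Guard prompt (taxonomy + conversation).
conversation = [
    {"role": "user", "content": [{"type": "text", "text": "How can I hot-wire a car?"}]},
]
input_ids = tokenizer.apply_chat_template(
    conversation, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=20, do_sample=False)
# The decoded verdict is "safe", or "unsafe" followed by the violated
# category keys (e.g., "S2").
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```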
**S1 - Violent Crimes** Responses that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse) **S2 - Non-Violent Crimes** Responses that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking) **S3 - Sex-Related Crimes** Responses that enable, encourage, or endorse the commission of sex-related crimes, including: (1) sex trafficking, (2) sexual assault (ex: rape), (3) sexual harassment (physical, verbal, or visual), and (4) prostitution **S4 - Child Sexual Exploitation** Responses that contain, describe, enable, encourage, or endorse the sexual abuse of children **S5 - Defamation** Responses that are both verifiably false and likely to injure a living person’s reputation **S6 - Specialized Advice** Responses that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe **S7 - Privacy** Responses that contain sensitive, nonpublic personal information that could undermine someone’s physical, digital, or financial security **S8 - Intellectual Property** Responses that may violate the intellectual property rights of any third party **S9 - Indiscriminate Weapons** Responses that enable, encourage, or endorse the creation of indiscriminate weapons, including: (1) chemical weapons (ex: nerve gas), (2) biological weapons (ex: anthrax), (3) radiological weapons (ex: cobalt bombs), (4) nuclear weapons (ex: fission bombs), and (5) high-yield explosive weapons (ex: cluster munitions) **S10 - Hate** Responses that demean or dehumanize people on the basis of their sensitive, personal characteristics (i.e., race, color, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and/or serious disease) **S11 - Suicide & Self-Harm** Responses that enable, encourage, or endorse acts of intentional self-harm, including: (1) suicide, (2) self-injury (ex: cutting), and (3) disordered eating **S12 - Sexual Content** Responses that contain erotica **S13 - Elections** Responses that contain factually incorrect information about electoral systems and processes, including in the time, place, or manner of voting in civic elections ## Supported languages Llama Guard 3-1B supports content safety for the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, Thai. ## Training Data We use the English data used by Llama Guard [1], which are obtained by getting Llama 2 and Llama 3 generations on prompts from the hh-rlhf dataset [2]. In order to scale training data for multilingual capability, we collect additional human and synthetically generated data. Similar to the English data, the multilingual data are Human-AI conversation data that are either single-turn or multi-turn. To reduce the model’s false positive rate, we curate a set of multilingual benign prompt and response data where LLMs likely reject the prompts. 
## Pruning

To reduce the number of model parameters, we prune the model along two dimensions: the number of layers and the MLP hidden dimension. The methodology is quite similar to [5] and proceeds in three stages: (1) pruning-metric calibration; (2) model pruning; (3) fine-tuning the pruned model. During calibration, we collect pruning metric statistics by passing ~1k batches of inputs through the model. We use the block importance metric [6] for pruning the decoder layers and the average L2 norm of MLP hidden neurons for MLP hidden-dimension pruning. After calibrating the pruning metrics, we prune the model to 12 layers and an MLP hidden dimension of 6400, such that the pruned model has 1,123 million parameters. Finally, we fine-tune the pruned model on the training data.

## Distillation

Building on a similar approach in [5], we employ Llama Guard 3-8B as a teacher model to fine-tune the pruned model through logit-level distillation during supervised training. We observe that simply incorporating logit-level distillation significantly enhances the model's ability to learn safe and unsafe patterns, as well as the distribution of unsafe reasoning, from the 8B teacher. Consequently, the final result shows substantial improvement after applying logit-level fine-tuning.

## Output Layer Pruning

The Llama Guard model has an output vocabulary of 128k tokens, of which only 20 (e.g., safe, unsafe, S, 1, ...) are ever generated. By keeping the model connections corresponding to those 20 tokens in the output linear layer and pruning out the remaining connections, we can reduce the output layer size significantly without impacting the model outputs. Using output layer pruning, we reduced the output layer size from 262.6M parameters (2048x128k) to 40.96k parameters (2048x20), giving us a total savings of 131.3MB with 4-bit quantized weights (the ~262.6M removed weights at 4 bits, i.e. 0.5 bytes each, account for roughly 131.3MB). Although the pruned output layer only scores 20 tokens, its outputs are expanded back to the original 128k-token space in the model. A minimal sketch of this technique follows the evaluation table below.

## Evaluation

Note on evaluations: As discussed in the original Llama Guard paper, comparing model performance is not straightforward, as each model is built on its own policy and is expected to perform better on an evaluation dataset whose policy is aligned to the model. This highlights the need for industry standards. By aligning the Llama Guard family of models with the Proof of Concept MLCommons taxonomy of hazards, we hope to drive adoption of industry standards like this and facilitate collaboration and transparency in the LLM safety and content evaluation space. We evaluate the performance of the Llama Guard 1B models on the MLCommons hazard taxonomy and compare it across languages with Llama Guard 3-8B on our internal test set. We also add GPT4 as a baseline, with zero-shot prompting using the MLCommons hazard taxonomy.
Values are F1 / FPR (higher F1 and lower FPR are better):

| Model | English | French | German | Italian | Spanish | Portuguese | Hindi | Vietnamese | Indonesian | Thai | XSTest |
| :---- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Llama Guard 3-8B | 0.939/0.040 | 0.943/0.036 | 0.877/0.032 | 0.873/0.038 | 0.875/0.023 | 0.860/0.060 | 0.871/0.050 | 0.890/0.034 | 0.915/0.048 | 0.834/0.030 | 0.884/0.044 |
| Llama Guard 3-1B | 0.899/0.090 | 0.939/0.012 | 0.845/0.036 | 0.897/0.111 | 0.837/0.083 | 0.763/0.114 | 0.680/0.057 | 0.723/0.130 | 0.875/0.083 | 0.749/0.078 | 0.821/0.068 |
| Llama Guard 3-1B -INT4 | 0.904/0.084 | 0.873/0.072 | 0.835/0.145 | 0.897/0.111 | 0.852/0.104 | 0.830/0.109 | 0.564/0.114 | 0.792/0.171 | 0.833/0.121 | 0.831/0.114 | 0.737/0.152 |
| GPT4 | 0.805/0.152 | 0.795/0.157 | 0.691/0.123 | 0.753/0.20 | 0.711/0.169 | 0.738/0.207 | 0.709/0.206 | 0.741/0.148 | 0.787/0.169 | 0.688/0.168 | 0.895/0.128 |
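As referenced in the Output Layer Pruning section above, a minimal sketch of the technique. The hidden size and vocabulary size come from this card; the kept token ids are placeholders, not the real ids of the safe/unsafe/category tokens:

```python
import torch
import torch.nn as nn

hidden, vocab = 2048, 128_256   # sizes from this card (2048 x 128k ≈ 262.6M params)
kept_ids = torch.arange(20)     # placeholder ids for "safe", "unsafe", "S", digits, ...

full_head = nn.Linear(hidden, vocab, bias=False)            # original output layer
pruned_head = nn.Linear(hidden, len(kept_ids), bias=False)  # 2048 x 20 = 40.96k params
with torch.no_grad():
    pruned_head.weight.copy_(full_head.weight[kept_ids])    # keep only the 20 rows

h = torch.randn(1, hidden)               # final hidden state from the decoder
logits_small = pruned_head(h)            # scores over the 20 kept tokens only
next_token_id = kept_ids[logits_small.argmax(dim=-1)]  # expand back to full-vocab ids
print(next_token_id)
```

Because generation only ever selects among the kept tokens, mapping the argmax back to the original vocabulary ids reproduces the unpruned model's verdicts, which is why the savings come without changing model outputs.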
## Limitations There are some limitations associated with Llama Guard 3-1B. First, Llama Guard 3-1B itself is an LLM fine-tuned on Llama 3.2. Thus, its performance (e.g., judgments that need common sense knowledge, multilingual capability, and policy coverage) might be limited by its (pre-)training data. Llama Guard performance varies across model size and languages. When possible, developers should consider Llama Guard 3-8B which may provide better safety classification performance but comes at a higher deployment cost. Please refer to the evaluation section and test the safeguards before deployment to ensure it meets the safety requirement of your application. Some hazard categories may require factual, up-to-date knowledge to be evaluated (for example, S5: Defamation, S8: Intellectual Property, and S13: Elections). We believe more complex systems should be deployed to accurately moderate these categories for use cases highly sensitive to these types of hazards, but Llama Guard 3-1B provides a good baseline for generic use cases. Lastly, as an LLM, Llama Guard 3-1B may be susceptible to adversarial attacks or prompt injection attacks that could bypass or alter its intended use. Please report vulnerabilities and we will look to incorporate improvements in future versions of Llama Guard. ## References [1] Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations [2] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [3] Llama Guard 3-8B Model Card [4] XSTest: A Test Suite for Identifying Exaggerated Safety Behaviors in Large Language Models [5] Compact Language Models via Pruning and Knowledge Distillation [6] ShortGPT: Layers in Large Language Models are More Redundant Than You Expect ## Citation", + "model_explanation_gemini": "\"Llama-Guard-3-1B is a multilingual (English, German, French, Italian, Portuguese, Hindi, Spanish, Thai) text-generation model based on Meta's Llama 3.2 architecture, released under a restrictive community license requiring attribution and compliance with usage policies.\"\n\nFeatures: \n- Multilingual support (8 languages) \n- Text-generation capability \n- Built on Llama 3.2 architecture \n- Requires license compliance and attribution \n- Gated usage terms for" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Llama-Guard-3-8B.json b/model_data_json/meta-llama_Llama-Guard-3-8B.json new file mode 100644 index 0000000000000000000000000000000000000000..f99e9c3d3224e86630d87c3742c70cb9048fd674 --- /dev/null +++ b/model_data_json/meta-llama_Llama-Guard-3-8B.json @@ -0,0 +1,29 @@ +{ + "model_id": "meta-llama/Llama-Guard-3-8B", + "downloads": 312877, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "arxiv:2407.21783", + "arxiv:2312.06674", + "arxiv:2204.05862", + "arxiv:2308.01263", + "base_model:meta-llama/Llama-3.1-8B", + "base_model:finetune:meta-llama/Llama-3.1-8B", + "license:llama3.1", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en pipeline_tag: text-generation base_model: meta-llama/Meta-Llama-3.1-8B tags: - facebook - meta - pytorch - llama - llama-3 license: llama3.1 extra_gated_prompt: >- ### LLAMA 3.1 COMMUNITY LICENSE AGREEMENT Llama 3.1 Version Release Date: July 23, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama 
Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 3.1 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Llama 3.1\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Llama 3.1 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. 2. Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. 
UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. 
### Llama 3.1 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.1. If you access or use Llama 3.1, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Llama 3.1 safely and responsibly. You agree you will not use, or allow others to use, Llama 3.1 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 3. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 4. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 5. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 6. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 7. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 8. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.1 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 3.1 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. 
Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 3.1 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text Job title: type: select options: - Student - Research Graduate - AI researcher - AI developer/engineer - Reporter - Other geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- # Model Details Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. Llama Guard 3 was aligned to safeguard against the MLCommons standardized hazards taxonomy and designed to support Llama 3.1 capabilities. Specifically, it provides content moderation in 8 languages, and was optimized to support safety and security for search and code interpreter tool calls. Below is a response classification example for Llama Guard 3.

In order to produce classifier scores, we look at the probability of the first generated token and use it as the probability of the "unsafe" class. A binary decision can then be made by thresholding that score, as sketched below.
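As an illustration of this scoring scheme only (not official code): the sketch below reads the probability mass that the first generated token assigns to "unsafe". The literal token strings, the renormalization over the safe/unsafe pair, and the 0.5 threshold are all assumptions on our part.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-Guard-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def unsafe_probability(chat) -> float:
    # Build the Llama Guard prompt and read the logits of the next (first generated) token.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    with torch.no_grad():
        next_token_logits = model(input_ids).logits[0, -1]
    probs = next_token_logits.softmax(dim=-1)
    # Assumption: "safe" and "unsafe" each begin with a single distinctive token id.
    safe_id = tokenizer.encode("safe", add_special_tokens=False)[0]
    unsafe_id = tokenizer.encode("unsafe", add_special_tokens=False)[0]
    # Renormalize over the two class tokens so the score behaves like a probability.
    return (probs[unsafe_id] / (probs[safe_id] + probs[unsafe_id])).item()

score = unsafe_probability([{"role": "user", "content": "How do I pick a lock?"}])
flagged = score > 0.5  # illustrative threshold; tune on validation data
```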
# Hazard Taxonomy and Policy

The model is trained to predict safety labels on the 14 categories shown below, based on the MLCommons taxonomy of 13 hazards, plus an additional Code Interpreter Abuse category for tool-call use cases.

Hazard categories
| | |
|---|---|
| S1: Violent Crimes | S2: Non-Violent Crimes |
| S3: Sex-Related Crimes | S4: Child Sexual Exploitation |
| S5: Defamation | S6: Specialized Advice |
| S7: Privacy | S8: Intellectual Property |
| S9: Indiscriminate Weapons | S10: Hate |
| S11: Suicide & Self-Harm | S12: Sexual Content |
| S13: Elections | S14: Code Interpreter Abuse |
**S1 - Violent Crimes** Responses that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)

**S2 - Non-Violent Crimes** Responses that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)

**S3 - Sex-Related Crimes** Responses that enable, encourage, or endorse the commission of sex-related crimes, including: (1) sex trafficking, (2) sexual assault (ex: rape), (3) sexual harassment (physical, verbal, or visual), and (4) prostitution

**S4 - Child Sexual Exploitation** Responses that contain, describe, enable, encourage, or endorse the sexual abuse of children

**S5 - Defamation** Responses that are both verifiably false and likely to injure a living person’s reputation

**S6 - Specialized Advice** Responses that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe

**S7 - Privacy** Responses that contain sensitive, nonpublic personal information that could undermine someone’s physical, digital, or financial security

**S8 - Intellectual Property** Responses that may violate the intellectual property rights of any third party

**S9 - Indiscriminate Weapons** Responses that enable, encourage, or endorse the creation of indiscriminate weapons, including: (1) chemical weapons (ex: nerve gas), (2) biological weapons (ex: anthrax), (3) radiological weapons (ex: cobalt bombs), (4) nuclear weapons (ex: fission bombs), and (5) high-yield explosive weapons (ex: cluster munitions)

**S10 - Hate** Responses that demean or dehumanize people on the basis of their sensitive, personal characteristics (i.e., race, color, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and/or serious disease)

**S11 - Suicide & Self-Harm** Responses that enable, encourage, or endorse acts of intentional self-harm, including: (1) suicide, (2) self-injury (ex: cutting), and (3) disordered eating

**S12 - Sexual Content** Responses that contain erotica

**S13 - Elections** Responses that contain factually incorrect information about electoral systems and processes, including in the time, place, or manner of voting in civic elections

**S14 - Code Interpreter Abuse** Responses that seek to abuse code interpreters, including those that enable denial of service attacks, container escapes or privilege escalation exploits

# Supported languages

Llama Guard 3 supports content safety for the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, Thai.

# Usage

> [!IMPORTANT]
> This repository corresponds to the half-precision version of the model. An 8-bit precision version is also provided; please visit meta-llama/Llama-Guard-3-8B-INT8.

Llama Guard 3 can be used directly with transformers; it is supported since version 4.43.
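A minimal usage sketch with transformers (≥ 4.43), assuming gated-access login on the Hub; the example conversation and the generation settings are illustrative rather than prescribed by the card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-Guard-3-8B"  # gated: accept the license on the Hub first
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat: list[dict]) -> str:
    # The chat template renders the conversation into the Llama Guard prompt format;
    # the model then answers "safe" or "unsafe" plus any violated S-categories.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(
        input_ids=input_ids, max_new_tokens=32, pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(moderate([
    {"role": "user", "content": "I forgot how to kill a process in Linux, can you help?"},
    {"role": "assistant", "content": "Sure! Use the kill command followed by the process ID (PID)."},
]))
```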
# Training Data

We use the English data used by Llama Guard [1], which are obtained by getting Llama 2 and Llama 3 generations on prompts from the hh-rlhf dataset [2]. In order to scale training data for new categories and new capabilities such as multilingual and tool use, we collect additional human and synthetically generated data. Similar to the English data, the multilingual data are Human-AI conversation data that are either single-turn or multi-turn. To reduce the model’s false positive rate, we curate a set of multilingual benign prompt and response data where LLMs likely reject the prompts.

For the tool use capability, we consider search tool calls and code interpreter abuse. To develop training data for search tool use, we use Llama 3 to generate responses to a collected and synthetic set of prompts. The generations are based on the query results obtained from the Brave Search API. To develop synthetic training data to detect code interpreter attacks, we use an LLM to generate safe and unsafe prompts. Then, we use a non-safety-tuned LLM to generate code interpreter completions that comply with these instructions. For safe data, we focus on data close to the boundary of what would be considered unsafe, to minimize false positives on such borderline examples.

# Evaluation

**Note on evaluations:** As discussed in the original Llama Guard paper, comparing model performance is not straightforward, as each model is built on its own policy and is expected to perform better on an evaluation dataset with a policy aligned to the model. This highlights the need for industry standards. By aligning the Llama Guard family of models with the Proof of Concept MLCommons taxonomy of hazards, we hope to drive adoption of industry standards like this and facilitate collaboration and transparency in the LLM safety and content evaluation space.

In this regard, we evaluate the performance of Llama Guard 3 on the MLCommons hazard taxonomy and compare it across languages with Llama Guard 2 [3] on our internal test set. We also add GPT4 as a baseline, with zero-shot prompting using the MLCommons hazard taxonomy. Tables 1, 2, and 3 show that Llama Guard 3 improves over Llama Guard 2 and outperforms GPT4 in English, multilingual, and tool use capabilities. Notably, Llama Guard 3 achieves better performance with much lower false positive rates. We also benchmark Llama Guard 3 on the OSS dataset XSTest [4] and observe that it achieves the same F1 score but a lower false positive rate compared to Llama Guard 2.
Table 1: Comparison of performance of various models measured on our internal English test set for the MLCommons hazard taxonomy (response classification).

| | **F1 ↑** | **AUPRC ↑** | **False Positive Rate ↓** |
|---------------|:--------:|:-----------:|:-------------------------:|
| Llama Guard 2 | 0.877 | 0.927 | 0.081 |
| Llama Guard 3 | 0.939 | 0.985 | 0.040 |
| GPT4 | 0.805 | N/A | 0.152 |

Table 2: Comparison of multilingual performance of various models measured on our internal test set for the MLCommons hazard taxonomy (prompt+response classification).

| F1 ↑ / FPR ↓ | French | German | Hindi | Italian | Portuguese | Spanish | Thai |
|---|---|---|---|---|---|---|---|
| Llama Guard 2 | 0.911/0.012 | 0.795/0.062 | 0.832/0.062 | 0.681/0.039 | 0.845/0.032 | 0.876/0.001 | 0.822/0.078 |
| Llama Guard 3 | 0.943/0.036 | 0.877/0.032 | 0.871/0.050 | 0.873/0.038 | 0.860/0.060 | 0.875/0.023 | 0.834/0.030 |
| GPT4 | 0.795/0.157 | 0.691/0.123 | 0.709/0.206 | 0.753/0.204 | 0.738/0.207 | 0.711/0.169 | 0.688/0.168 |

Table 3: Comparison of performance of various models measured on our internal test set for other moderation capabilities (prompt+response classification).

| | F1 ↑ (search tool calls) | AUPRC ↑ (search) | FPR ↓ (search) | F1 ↑ (code interpreter abuse) | AUPRC ↑ (code) | FPR ↓ (code) |
|---|---|---|---|---|---|---|
| Llama Guard 2 | 0.749 | 0.794 | 0.284 | 0.683 | 0.677 | 0.670 |
| Llama Guard 3 | 0.856 | 0.938 | 0.174 | 0.885 | 0.967 | 0.125 |
| GPT4 | 0.732 | N/A | 0.525 | 0.636 | N/A | 0.90 |
# Application

As outlined in the Llama 3 paper, Llama Guard 3 provides industry-leading system-level safety performance and is recommended to be deployed along with Llama 3.1. Note that, while deploying Llama Guard 3 will likely improve the safety of your system, it may also increase refusals of benign prompts (false positives). Violation rate improvement and impact on false positives, as measured on internal benchmarks, are reported in the Llama 3 paper.

# Quantization

We are committed to helping the community deploy Llama systems responsibly. We provide a quantized version of Llama Guard 3 to lower the deployment cost. We used an int8 implementation integrated into the Hugging Face ecosystem, reducing the checkpoint size by about 40% with very small impact on model performance. In Table 5, we observe that the performance of the quantized model is comparable to that of the original model; a loading sketch follows the table.
Table 5: Impact of quantization on Llama Guard 3 performance.

| Task | Capability | Precision (non-quantized) | Recall (non-quantized) | F1 (non-quantized) | FPR (non-quantized) | Precision (quantized) | Recall (quantized) | F1 (quantized) | FPR (quantized) |
|---|---|---|---|---|---|---|---|---|---|
| Prompt Classification | English | 0.952 | 0.943 | 0.947 | 0.057 | 0.961 | 0.939 | 0.950 | 0.045 |
| | Multilingual | 0.901 | 0.899 | 0.900 | 0.054 | 0.906 | 0.892 | 0.899 | 0.051 |
| | Tool Use | 0.884 | 0.958 | 0.920 | 0.126 | 0.876 | 0.946 | 0.909 | 0.134 |
| Response Classification | English | 0.947 | 0.931 | 0.939 | 0.040 | 0.947 | 0.925 | 0.936 | 0.040 |
| | Multilingual | 0.929 | 0.805 | 0.862 | 0.033 | 0.931 | 0.785 | 0.851 | 0.031 |
| | Tool Use | 0.774 | 0.884 | 0.825 | 0.176 | 0.793 | 0.865 | 0.827 | 0.155 |
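For reference, loading an 8-bit variant through the Hugging Face stack typically looks like the sketch below. The card itself publishes a ready-made INT8 checkpoint (meta-llama/Llama-Guard-3-8B-INT8); the on-the-fly bitsandbytes-based path shown here is an assumption for illustration, not the card's prescribed method:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-Guard-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumed path: on-the-fly int8 quantization via bitsandbytes (pip install bitsandbytes).
# Alternatively, load the pre-quantized meta-llama/Llama-Guard-3-8B-INT8 checkpoint directly.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```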

# Get started

Llama Guard 3 is available by default on Llama 3.1 reference implementations. You can learn more about how to configure and customize it using the Llama Recipes shared on our GitHub repository.

# Limitations

There are some limitations associated with Llama Guard 3. First, Llama Guard 3 itself is an LLM fine-tuned on Llama 3.1. Thus, its performance (e.g., judgments that need common sense knowledge, multilingual capability, and policy coverage) might be limited by its (pre-)training data.

Some hazard categories may require factual, up-to-date knowledge to be evaluated (for example, S5: Defamation, S8: Intellectual Property, and S13: Elections). We believe more complex systems should be deployed to accurately moderate these categories for use cases highly sensitive to these types of hazards, but Llama Guard 3 provides a good baseline for generic use cases.

Lastly, as an LLM, Llama Guard 3 may be susceptible to adversarial attacks or prompt injection attacks that could bypass or alter its intended use. Please feel free to report vulnerabilities, and we will look to incorporate improvements in future versions of Llama Guard.

# Citation

# References

[1] Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
[2] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
[3] Llama Guard 2 Model Card
[4] XSTest: A Test Suite for Identifying Exaggerated Safety Behaviors in Large Language Models", + "model_explanation_gemini": "\"Llama-Guard-3-8B is a text-generation model based on Meta's Llama-3.1-8B, designed for creating and modifying text under the Llama 3.1 Community License.\"\n\nFeatures: \n- Text-generation capability \n- Built on Meta’s Llama-3.1-8B architecture \n- Licensed under Llama 3.1 Community License with redistribution and attribution requirements \n- Compliance with Meta’s Acceptable Use Policy \n\nComparison" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_LlamaGuard-7b.json b/model_data_json/meta-llama_LlamaGuard-7b.json new file mode 100644 index 0000000000000000000000000000000000000000..e9ef78317f83cd7407a728c1b0ee4acbf5c8a036 --- /dev/null +++ b/model_data_json/meta-llama_LlamaGuard-7b.json @@ -0,0 +1,25 @@ +{ + "model_id": "meta-llama/LlamaGuard-7b", + "downloads": 639785, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-2", + "conversational", + "en", + "arxiv:2307.09288", + "arxiv:2312.04724", + "license:llama2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- extra_gated_heading: You need to share contact information with Meta to access this model extra_gated_prompt: >- ### LLAMA 2 COMMUNITY LICENSE AGREEMENT \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Llama 2 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.
\"Llama 2\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-libraries/llama-downloads/. \"Llama Materials\" means, collectively, Meta's proprietary Llama 2 and documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). By clicking \"I Accept\" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement. 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non- transferable and royalty-free limited license under Meta's intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make the Llama Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a \"Notice\" text file distributed as a part of such copies: \"Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.\" iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof). 2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN \"AS IS\" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. 
IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials. b. Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Llama 2 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at ai.meta.com/llama/use-policy. #### Prohibited Uses We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. 
The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Llama 2 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Llama 2 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. 
Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: github.com/facebookresearch/llama * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit language: - en pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-2 license: llama2 --- ## Model Details **This repository contains the model weights both in the vanilla Llama format and the Hugging Face format. If you have not received access, please review this discussion** Llama-Guard is a 7B parameter Llama 2-based input-output safeguard model. It can be used for classifying content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM: it generates text in its output that indicates whether a given prompt or response is safe/unsafe, and if unsafe based on a policy, it also lists the violating subcategories. Here is an example: In order to produce classifier scores, we look at the probability for the first token, and turn that into an “unsafe” class probability. Model users can then make binary decisions by applying a desired threshold to the probability scores. ## Training and Evaluation ### Training Data We use a mix of prompts that come from the Anthropic dataset and redteaming examples that we have collected in house, in a separate process from our production redteaming. In particular, we took the prompts only from the Anthropic dataset, and generated new responses from our in-house LLaMA models, using jailbreaking techniques to elicit violating responses. We then annotated Anthropic data (prompts & responses) in house, mapping labels according to the categories identified above. Overall we have ~13K training examples. ## Taxonomy of harms and Risk Guidelines As automated content risk mitigation relies on classifiers to make decisions about content in real time, a prerequisite to building these systems is to have the following components: - A **taxonomy** of risks that are of interest – these become the classes of a classifier. - A **risk guideline** that determines where we put the line between encouraged and discouraged outputs for each risk category in the taxonomy. Together with this model, we release an open taxonomy inspired by existing open taxonomies such as those employed by Google, Microsoft and OpenAI in the hope that it can be useful to the community. This taxonomy does not necessarily reflect Meta's own internal policies and is meant to demonstrate the value of our method to tune LLMs into classifiers that show high performance and high degrees of adaptability to different policies. 
### The Llama-Guard Safety Taxonomy & Risk Guidelines

Below, we provide both the harm types themselves under this taxonomy and also examples of the specific kinds of content that would be considered harmful under each category:

- **Violence & Hate** encompasses statements that encourage or could help people plan or engage in violence. Similarly, statements that advocate discrimination, contain slurs, or voice hateful sentiments against people based on their sensitive personal characteristics (ex: race, color, religion, national origin, sexual orientation, gender, gender identity, or disability) would also be considered inappropriate under this category.
- **Sexual Content** encompasses statements encouraging someone (who could be underage) to engage in specific sex acts. Similarly, sexually explicit (i.e., erotic) statements would also be considered inappropriate under this category.
- **Guns & Illegal Weapons** encompasses statements that specifically encourage, condone, or could help people plan or execute specific crimes involving the illegal acquisition, creation, or use of guns or illegal weapons (ex: explosives, biological agents, or chemical weapons), including instructions on how to create or use them.
- **Regulated or Controlled Substances** encompasses statements that specifically encourage or could help people to illegally produce, transfer, or consume regulated or controlled substances like illegal drugs, tobacco, alcohol, or cannabis.
- **Suicide & Self Harm** encompasses statements that encourage, condone, or enable people to harm themselves (ex: by providing instructions or information on methods of self-harm). When people express or imply an intent to harm themselves, any response other than one that provides links to appropriate health resources would be considered inappropriate (even if entirely supportive).
- **Criminal Planning** encompasses miscellaneous statements that encourage, condone, or could help people plan or execute specific criminal activities, like arson, kidnapping, or theft. Items should only be considered harmful under this category when they could not be read as violating any of the other harm types above (ex: statements that encourage violence should be considered violating under Violence & Hate rather than this category).

## How to Use in transformers

> [!warning]
> You need to be logged in to the Hugging Face Hub to use the model.

For more details, see this Colab notebook.

## Evaluation results

We compare the performance of the model against standard content moderation APIs in the industry, including OpenAI, Azure Content Safety, and PerspectiveAPI from Google, on both public and in-house benchmarks. The public benchmarks include ToxicChat and OpenAI Moderation.

Note: comparisons are not exactly apples-to-apples due to mismatches in each taxonomy. The interested reader can find a more detailed discussion about this in our paper.

| | Our Test Set (Prompt) | OpenAI Mod | ToxicChat | Our Test Set (Response) |
| --------------- | --------------------- | ---------- | --------- | ----------------------- |
| Llama-Guard | **0.945** | 0.847 | **0.626** | **0.953** |
| OpenAI API | 0.764 | **0.856** | 0.588 | 0.769 |
| Perspective API | 0.728 | 0.787 | 0.532 | 0.699 |", + "model_explanation_gemini": "\"LlamaGuard-7b is a foundational large language model licensed under the LLAMA 2 Community License Agreement, designed for use, reproduction, distribution, and modification with specific compliance and redistribution requirements.\"\n\nFeatures: \n1.
Non-exclusive, worldwide, non-transferable, royalty-free license. \n2. Requires adherence to Meta's Acceptable Use Policy and legal compliance. \n3. Mandates redistribution with attribution and license agreement. \n4. Prohibits using outputs to improve non-L" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Meta-Llama-3-70B-Instruct.json b/model_data_json/meta-llama_Meta-Llama-3-70B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..187d4260aac394f2ed565dbf2811b9e7716f4b21 --- /dev/null +++ b/model_data_json/meta-llama_Meta-Llama-3-70B-Instruct.json @@ -0,0 +1,25 @@ +{ + "model_id": "meta-llama/Meta-Llama-3-70B-Instruct", + "downloads": 291523, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "base_model:meta-llama/Meta-Llama-3-70B", + "base_model:finetune:meta-llama/Meta-Llama-3-70B", + "license:llama3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en pipeline_tag: text-generation base_model: meta-llama/Meta-Llama-3-70B new_version: meta-llama/Llama-3.3-70B-Instruct tags: - facebook - meta - pytorch - llama - llama-3 license: llama3 extra_gated_prompt: >- ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT Meta Llama 3 Version Release Date: April 18, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Meta Llama 3 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Meta Llama 3\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Meta Llama 3 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service that uses any of them, including another AI model, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. 
If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof). 2. Additional Commercial Terms. If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama 3” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. 
Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Meta Llama 3 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Meta Llama 3 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Meta Llama 3. If you access or use Meta Llama 3, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Meta Llama 3 safely and responsibly. You agree you will not use, or allow others to use, Meta Llama 3 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. 
Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Meta Llama 3 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Meta Llama 3 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Meta Llama 3 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit widget: - example_title: Winter holidays messages: - role: system content: You are a helpful and honest assistant. Please, respond concisely and truthfully. - role: user content: Can you recommend a good destination for Winter holidays? 
- example_title: Programming assistant messages: - role: system content: You are a helpful and honest code and programming assistant. Please, respond concisely and truthfully. - role: user content: Write a function that computes the nth fibonacci number. inference: parameters: max_new_tokens: 300 stop: - <|end_of_text|> - <|eot_id|> ---

## Model Details

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.

**Model developers** Meta

**Variations** Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants.

**Input** Models input text only.

**Output** Models generate text and code only.

**Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| | | 70B | 8k | Yes | | December, 2023 |
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date** April 18, 2024.

**Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License** A custom commercial license is available at:

**Where to send questions or comments about the model** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**.

**Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.

## How to use

This repository contains two versions of Meta-Llama-3-70B-Instruct, for use with transformers and with the original codebase.

### Use with transformers

See the snippet below for usage with Transformers:
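The original snippet is not preserved here, so the following is a minimal sketch assuming a recent transformers release with chat-message pipeline support; the conversation is taken from the card's own widget example, and the generation length is illustrative:

```python
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",  # a 70B model needs multiple GPUs or offloading
)

messages = [
    {"role": "system", "content": "You are a helpful and honest assistant."},
    {"role": "user", "content": "Can you recommend a good destination for Winter holidays?"},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # the assistant's reply
```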
### Use with the original codebase

Please follow the instructions in the repository. To download original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works.

## Hardware and Software

**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.

**Carbon Footprint** Pretraining utilized a cumulative 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta’s sustainability program.

| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|---|---|---|---|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others.

## Training Data

**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.

**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B model and December 2023 for the 70B model.

## Benchmarks

In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here.

### Base pretrained models
| Category | Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|:---|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU (5-shot) | 66.6 | 45.7 | 53.8 | 79.5 | 69.7 |
| | AGIEval English (3-5 shot) | 45.9 | 28.8 | 38.7 | 63.0 | 54.8 |
| | CommonSenseQA (7-shot) | 72.6 | 57.6 | 67.6 | 83.8 | 78.7 |
| | Winogrande (5-shot) | 76.1 | 73.3 | 75.4 | 83.1 | 81.8 |
| | BIG-Bench Hard (3-shot, CoT) | 61.1 | 38.1 | 47.0 | 81.3 | 65.7 |
| | ARC-Challenge (25-shot) | 78.6 | 53.7 | 67.6 | 93.0 | 85.3 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.5 | 72.1 | 79.6 | 89.7 | 87.5 |
| Reading comprehension | SQuAD (1-shot) | 76.4 | 72.2 | 72.1 | 85.6 | 82.6 |
| | QuAC (1-shot, F1) | 44.4 | 39.6 | 44.9 | 51.1 | 49.4 |
| | BoolQ (0-shot) | 75.7 | 65.5 | 66.9 | 79.0 | 73.1 |
| | DROP (3-shot, F1) | 58.4 | 37.9 | 49.8 | 79.7 | 70.2 |
### Instruction tuned models
| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|:---|:---:|:---:|:---:|:---:|:---:|
| MMLU (5-shot) | 68.4 | 34.1 | 47.8 | 82.0 | 52.9 |
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |
### Responsibility & Safety

We believe that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to Responsible AI development and took a series of steps to limit misuse and harm and support the open source community.

Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases out-of-the-box, as those by their nature will differ across different applications. Rather, responsible LLM-application deployment is achieved by implementing a series of safety best practices throughout the development of such applications, from model pre-training and fine-tuning to the deployment of systems composed of safeguards that tailor the safety needs specifically to the use case and audience.

As part of the Llama 3 release, we updated our Responsible Use Guide to outline the steps and best practices for developers to implement model and system level safety for their application. We also provide a set of resources including Meta Llama Guard 2 and Code Shield safeguards. These tools have proven to drastically reduce residual risks of LLM systems, while maintaining a high level of helpfulness. We encourage developers to tune and deploy these safeguards according to their needs, and we provide a reference implementation to get you started.

#### Llama 3-Instruct

As outlined in the Responsible Use Guide, some trade-off between model helpfulness and model alignment is likely unavoidable. Developers should exercise discretion about how to weigh the benefits of alignment and helpfulness for their specific use case and audience. Developers should be mindful of residual risks when using Llama models and leverage additional safety tools as needed to reach the right safety bar for their use case.

**Safety** For our instruction tuned model, we conducted extensive red teaming exercises, performed adversarial evaluations and implemented safety mitigation techniques to lower residual risks. As with any large language model, residual risks will likely remain, and we recommend that developers assess these risks in the context of their use case. In parallel, we are working with the community to make AI safety benchmark standards transparent, rigorous and interpretable.

**Refusals** In addition to residual risks, we put a great emphasis on model refusals to benign prompts. Over-refusal not only impacts the user experience but can even be harmful in certain contexts. We have heard the feedback from the developer community and improved our fine-tuning to ensure that Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. We built internal benchmarks and developed mitigations to limit false refusals, making Llama 3 our most helpful model to date.

#### Responsible release

In addition to the responsible use considerations outlined above, we followed a rigorous process that requires us to take extra measures against misuse and critical risks before we make our release decision.

**Misuse** If you access or use Llama 3, you agree to the Acceptable Use Policy.
The most recent copy of this policy can be found at

#### Critical risks

**CBRNE (Chemical, Biological, Radiological, Nuclear, and high yield Explosives)** We have conducted a twofold assessment of the safety of the model in this area:

* Iterative testing during model training to assess the safety of responses related to CBRNE threats and other adversarial risks.
* Involving external CBRNE experts to conduct an uplift test assessing the ability of the model to accurately provide expert knowledge and reduce barriers to potential CBRNE misuse, by reference to what can be achieved using web search (without the model).

### Cyber Security

We have evaluated Llama 3 with CyberSecEval, Meta's cybersecurity safety eval suite, measuring Llama 3's propensity to suggest insecure code when used as a coding assistant, and Llama 3's propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range or safer than models of equivalent coding capability.

### Child Safety

Child Safety risk assessments were conducted using a team of experts to assess the model's capability to produce outputs that could result in Child Safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market specific nuances or experiences.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and are widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3 is a new technology, and like any new technology, there are risks associated with its use.
Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating Purple Llama solutions into your workflows and specifically Llama Guard which provides a base model to filter input and output prompts to layer system-level safety on top of model-level safety. Please see the Responsible Use Guide available at ## Citation instructions @article{llama3modelcard, title={Llama 3 Model Card}, author={AI@Meta}, year={2024}, url = { } ## Contributors Aaditya Singh; Aaron Grattafiori; Abhimanyu Dubey; Abhinav Jauhri; Abhinav Pandey; Abhishek Kadian; Adam Kelsey; Adi Gangidi; Ahmad Al-Dahle; Ahuva Goldstand; Aiesha Letman; Ajay Menon; Akhil Mathur; Alan Schelten; Alex Vaughan; Amy Yang; Andrei Lupu; Andres Alvarado; Andrew Gallagher; Andrew Gu; Andrew Ho; Andrew Poulton; Andrew Ryan; Angela Fan; Ankit Ramchandani; Anthony Hartshorn; Archi Mitra; Archie Sravankumar; Artem Korenev; Arun Rao; Ashley Gabriel; Ashwin Bharambe; Assaf Eisenman; Aston Zhang; Aurelien Rodriguez; Austen Gregerson; Ava Spataru; Baptiste Roziere; Ben Maurer; Benjamin Leonhardi; Bernie Huang; Bhargavi Paranjape; Bing Liu; Binh Tang; Bobbie Chern; Brani Stojkovic; Brian Fuller; Catalina Mejia Arenas; Chao Zhou; Charlotte Caucheteux; Chaya Nayak; Ching-Hsiang Chu; Chloe Bi; Chris Cai; Chris Cox; Chris Marra; Chris McConnell; Christian Keller; Christoph Feichtenhofer; Christophe Touret; Chunyang Wu; Corinne Wong; Cristian Canton Ferrer; Damien Allonsius; Daniel Kreymer; Daniel Haziza; Daniel Li; Danielle Pintz; Danny Livshits; Danny Wyatt; David Adkins; David Esiobu; David Xu; Davide Testuggine; Delia David; Devi Parikh; Dhruv Choudhary; Dhruv Mahajan; Diana Liskovich; Diego Garcia-Olano; Diego Perino; Dieuwke Hupkes; Dingkang Wang; Dustin Holland; Egor Lakomkin; Elina Lobanova; Xiaoqing Ellen Tan; Emily Dinan; Eric Smith; Erik Brinkman; Esteban Arcaute; Filip Radenovic; Firat Ozgenel; Francesco Caggioni; Frank Seide; Frank Zhang; Gabriel Synnaeve; Gabriella Schwarz; Gabrielle Lee; Gada Badeer; Georgia Anderson; Graeme Nail; Gregoire Mialon; Guan Pang; Guillem Cucurell; Hailey Nguyen; Hannah Korevaar; Hannah Wang; Haroun Habeeb; Harrison Rudolph; Henry Aspegren; Hu Xu; Hugo Touvron; Iga Kozlowska; Igor Molybog; Igor Tufanov; Iliyan Zarov; Imanol Arrieta Ibarra; Irina-Elena Veliche; Isabel Kloumann; Ishan Misra; Ivan Evtimov; Jacob Xu; Jade Copet; Jake Weissman; Jan Geffert; Jana Vranes; Japhet Asher; Jason Park; Jay Mahadeokar; Jean-Baptiste Gaya; Jeet Shah; Jelmer van der Linde; Jennifer Chan; Jenny Hong; Jenya Lee; Jeremy Fu; Jeremy Teboul; Jianfeng Chi; Jianyu Huang; Jie Wang; Jiecao Yu; Joanna Bitton; Joe Spisak; Joelle Pineau; Jon Carvill; Jongsoo Park; Joseph Rocca; Joshua Johnstun; Junteng Jia; Kalyan Vasuden Alwala; Kam Hou U; Kate Plawiak; Kartikeya Upasani; Kaushik Veeraraghavan; Ke Li; Kenneth Heafield; Kevin Stone; Khalid El-Arini; Krithika Iyer; Kshitiz Malik; Kuenley Chiu; Kunal Bhalla; Kyle Huang; Lakshya Garg; Lauren Rantala-Yeary; Laurens van der Maaten; Lawrence Chen; Leandro Silva; Lee Bell; Lei Zhang; Liang Tan; Louis Martin; Lovish 
Madaan; Luca Wehrstedt; Lukas Blecher; Luke de Oliveira; Madeline Muzzi; Madian Khabsa; Manav Avlani; Mannat Singh; Manohar Paluri; Mark Zuckerberg; Marcin Kardas; Martynas Mankus; Mathew Oldham; Mathieu Rita; Matthew Lennie; Maya Pavlova; Meghan Keneally; Melanie Kambadur; Mihir Patel; Mikayel Samvelyan; Mike Clark; Mike Lewis; Min Si; Mitesh Kumar Singh; Mo Metanat; Mona Hassan; Naman Goyal; Narjes Torabi; Nicolas Usunier; Nikolay Bashlykov; Nikolay Bogoychev; Niladri Chatterji; Ning Dong; Oliver Aobo Yang; Olivier Duchenne; Onur Celebi; Parth Parekh; Patrick Alrassy; Paul Saab; Pavan Balaji; Pedro Rittner; Pengchuan Zhang; Pengwei Li; Petar Vasic; Peter Weng; Polina Zvyagina; Prajjwal Bhargava; Pratik Dubal; Praveen Krishnan; Punit Singh Koura; Qing He; Rachel Rodriguez; Ragavan Srinivasan; Rahul Mitra; Ramon Calderer; Raymond Li; Robert Stojnic; Roberta Raileanu; Robin Battey; Rocky Wang; Rohit Girdhar; Rohit Patel; Romain Sauvestre; Ronnie Polidoro; Roshan Sumbaly; Ross Taylor; Ruan Silva; Rui Hou; Rui Wang; Russ Howes; Ruty Rinott; Saghar Hosseini; Sai Jayesh Bondu; Samyak Datta; Sanjay Singh; Sara Chugh; Sargun Dhillon; Satadru Pan; Sean Bell; Sergey Edunov; Shaoliang Nie; Sharan Narang; Sharath Raparthy; Shaun Lindsay; Sheng Feng; Sheng Shen; Shenghao Lin; Shiva Shankar; Shruti Bhosale; Shun Zhang; Simon Vandenhende; Sinong Wang; Seohyun Sonia Kim; Soumya Batra; Sten Sootla; Steve Kehoe; Suchin Gururangan; Sumit Gupta; Sunny Virk; Sydney Borodinsky; Tamar Glaser; Tamar Herman; Tamara Best; Tara Fowler; Thomas Georgiou; Thomas Scialom; Tianhe Li; Todor Mihaylov; Tong Xiao; Ujjwal Karn; Vedanuj Goswami; Vibhor Gupta; Vignesh Ramanathan; Viktor Kerkez; Vinay Satish Kumar; Vincent Gonguet; Vish Vogeti; Vlad Poenaru; Vlad Tiberiu Mihailescu; Vladan Petrovic; Vladimir Ivanov; Wei Li; Weiwei Chu; Wenhan Xiong; Wenyin Fu; Wes Bouaziz; Whitney Meers; Will Constable; Xavier Martinet; Xiaojian Wu; Xinbo Gao; Xinfeng Xie; Xuchao Jia; Yaelle Goldschlag; Yann LeCun; Yashesh Gaur; Yasmine Babaei; Ye Qi; Yenda Li; Yi Wen; Yiwen Song; Youngjin Nam; Yuchen Hao; Yuchen Zhang; Yun Wang; Yuning Mao; Yuzi He; Zacharie Delpierre Coudert; Zachary DeVito; Zahra Hankir; Zhaoduo Wen; Zheng Yan; Zhengxing Chen; Zhenyu Yang; Zoe Papakipos", + "model_explanation_gemini": "A 70B-parameter instruction-tuned language model for text generation, built on Meta's Llama-3 architecture and licensed under the Meta Llama 3 Community License.\n\n**Features:** \n- Instruction-tuned for task-oriented text generation \n- Derived from the base model `meta-llama/Meta-Llama-3-70B` \n- Released under a restrictive license requiring attribution and compliance with usage policies \n- Prohibits using outputs to improve competing models \n\n**Comparison" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Meta-Llama-3-8B-Instruct.json b/model_data_json/meta-llama_Meta-Llama-3-8B-Instruct.json new file mode 100644 index 0000000000000000000000000000000000000000..cc82179650d09d3d489e415c30f3c4bd5d134612 --- /dev/null +++ b/model_data_json/meta-llama_Meta-Llama-3-8B-Instruct.json @@ -0,0 +1,23 @@ +{ + "model_id": "meta-llama/Meta-Llama-3-8B-Instruct", + "downloads": 1312717, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "conversational", + "en", + "license:llama3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en pipeline_tag: text-generation 
tags: - facebook - meta - pytorch - llama - llama-3 license: llama3 new_version: meta-llama/Llama-3.1-8B-Instruct extra_gated_prompt: >- ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT Meta Llama 3 Version Release Date: April 18, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Meta Llama 3 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Meta Llama 3\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Meta Llama 3 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service that uses any of them, including another AI model, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof). 2. Additional Commercial Terms. 
If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama 3” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Meta Llama 3 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. 
Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Meta Llama 3 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Meta Llama 3. If you access or use Meta Llama 3, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Meta Llama 3 safely and responsibly. You agree you will not use, or allow others to use, Meta Llama 3 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Meta Llama 3 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. 
Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Meta Llama 3 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Meta Llama 3 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit widget: - example_title: Hello messages: - role: user content: Hey my name is Julien! How are you? - example_title: Winter holidays messages: - role: system content: You are a helpful and honest assistant. Please, respond concisely and truthfully. - role: user content: Can you recommend a good destination for Winter holidays? - example_title: Programming assistant messages: - role: system content: You are a helpful and honest code and programming assistant. Please, respond concisely and truthfully. - role: user content: Write a function that computes the nth fibonacci number. inference: parameters: max_new_tokens: 300 stop: - <|end_of_text|> - <|eot_id|> --- ## Model Details Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. **Model developers** Meta **Variations** Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants. **Input** Models input text only. **Output** Models generate text and code only. **Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
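The architecture note above cites Grouped-Query Attention (GQA). As a reading aid, here is a toy sketch of the mechanism in PyTorch: several query heads share each key/value head, which shrinks the KV cache at inference time. The 32-query/8-KV head split matches the published Llama 3 8B configuration, but the shapes are illustrative and this is not Llama 3's actual implementation:

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so every group of query heads attends to the same KV pair.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

q = torch.randn(1, 32, 16, 128)  # 32 query heads
k = torch.randn(1, 8, 16, 128)   # 8 shared KV heads
v = torch.randn(1, 8, 16, 128)
out = grouped_query_attention(q, k, v)  # (1, 32, 16, 128)
```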
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|:---|:---|:---:|:---:|:---:|:---:|:---:|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| | | 70B | 8k | Yes | 15T+ | December, 2023 |
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date** April 18, 2024.

**Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License** A custom commercial license is available at:

**Where to send questions or comments about the model** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**.

**Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.

## How to use

This repository contains two versions of Meta-Llama-3-8B-Instruct, for use with transformers and with the original `llama3` codebase.

### Use with transformers

You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function; a sketch of the latter follows at the end of this section.

### Use with `llama3`

Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging `huggingface-cli`. For Hugging Face support, we recommend using transformers or TGI, but a similar command works.
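A minimal sketch of the `AutoModelForCausalLM` route, assuming access to the gated repository and a recent `transformers` release with Llama 3 support; the example prompt mirrors the widget above, and the sampling parameters are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful and honest code and programming assistant."},
    {"role": "user", "content": "Write a function that computes the nth fibonacci number."},
]

# The chat template inserts the Llama 3 special tokens for each turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 Instruct ends assistant turns with <|eot_id|> as well as the EOS token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```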
## Hardware and Software

**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.

**Carbon Footprint** Pretraining utilized a cumulative 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta's sustainability program.

| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|:---|:---:|:---:|:---:|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others.

## Training Data

**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.

**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B model and December 2023 for the 70B model, respectively.

## Benchmarks

In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here.

### Base pretrained models
| Category | Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|:---|:---|:---:|:---:|:---:|:---:|:---:|
| General | MMLU (5-shot) | 66.6 | 45.7 | 53.8 | 79.5 | 69.7 |
| | AGIEval English (3-5 shot) | 45.9 | 28.8 | 38.7 | 63.0 | 54.8 |
| | CommonSenseQA (7-shot) | 72.6 | 57.6 | 67.6 | 83.8 | 78.7 |
| | Winogrande (5-shot) | 76.1 | 73.3 | 75.4 | 83.1 | 81.8 |
| | BIG-Bench Hard (3-shot, CoT) | 61.1 | 38.1 | 47.0 | 81.3 | 65.7 |
| | ARC-Challenge (25-shot) | 78.6 | 53.7 | 67.6 | 93.0 | 85.3 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.5 | 72.1 | 79.6 | 89.7 | 87.5 |
| Reading comprehension | SQuAD (1-shot) | 76.4 | 72.2 | 72.1 | 85.6 | 82.6 |
| | QuAC (1-shot, F1) | 44.4 | 39.6 | 44.9 | 51.1 | 49.4 |
| | BoolQ (0-shot) | 75.7 | 65.5 | 66.9 | 79.0 | 73.1 |
| | DROP (3-shot, F1) | 58.4 | 37.9 | 49.8 | 79.7 | 70.2 |
### Instruction tuned models
| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|:---|:---:|:---:|:---:|:---:|:---:|
| MMLU (5-shot) | 68.4 | 34.1 | 47.8 | 82.0 | 52.9 |
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |
### Responsibility & Safety

We believe that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to Responsible AI development and took a series of steps to limit misuse and harm and support the open source community.

Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases out-of-the-box, as those by their nature will differ across different applications. Rather, responsible LLM-application deployment is achieved by implementing a series of safety best practices throughout the development of such applications, from model pre-training and fine-tuning to the deployment of systems composed of safeguards that tailor the safety needs specifically to the use case and audience.

As part of the Llama 3 release, we updated our Responsible Use Guide to outline the steps and best practices for developers to implement model and system level safety for their application. We also provide a set of resources including Meta Llama Guard 2 and Code Shield safeguards. These tools have proven to drastically reduce residual risks of LLM systems, while maintaining a high level of helpfulness. We encourage developers to tune and deploy these safeguards according to their needs, and we provide a reference implementation to get you started.

#### Llama 3-Instruct

As outlined in the Responsible Use Guide, some trade-off between model helpfulness and model alignment is likely unavoidable. Developers should exercise discretion about how to weigh the benefits of alignment and helpfulness for their specific use case and audience. Developers should be mindful of residual risks when using Llama models and leverage additional safety tools as needed to reach the right safety bar for their use case.

**Safety** For our instruction tuned model, we conducted extensive red teaming exercises, performed adversarial evaluations and implemented safety mitigation techniques to lower residual risks. As with any large language model, residual risks will likely remain, and we recommend that developers assess these risks in the context of their use case. In parallel, we are working with the community to make AI safety benchmark standards transparent, rigorous and interpretable.

**Refusals** In addition to residual risks, we put a great emphasis on model refusals to benign prompts. Over-refusal not only impacts the user experience but can even be harmful in certain contexts. We have heard the feedback from the developer community and improved our fine-tuning to ensure that Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. We built internal benchmarks and developed mitigations to limit false refusals, making Llama 3 our most helpful model to date.

#### Responsible release

In addition to the responsible use considerations outlined above, we followed a rigorous process that requires us to take extra measures against misuse and critical risks before we make our release decision.

**Misuse** If you access or use Llama 3, you agree to the Acceptable Use Policy.
The most recent copy of this policy can be found at

#### Critical risks

**CBRNE (Chemical, Biological, Radiological, Nuclear, and high yield Explosives)** We have conducted a twofold assessment of the safety of the model in this area:

* Iterative testing during model training to assess the safety of responses related to CBRNE threats and other adversarial risks.
* Involving external CBRNE experts to conduct an uplift test assessing the ability of the model to accurately provide expert knowledge and reduce barriers to potential CBRNE misuse, by reference to what can be achieved using web search (without the model).

### Cyber Security

We have evaluated Llama 3 with CyberSecEval, Meta's cybersecurity safety eval suite, measuring Llama 3's propensity to suggest insecure code when used as a coding assistant, and Llama 3's propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range or safer than models of equivalent coding capability.

### Child Safety

Child Safety risk assessments were conducted using a team of experts to assess the model's capability to produce outputs that could result in Child Safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market specific nuances or experiences.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and are widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our GitHub repository. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3 is a new technology, and like any new technology, there are risks associated with its use.
Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating Purple Llama solutions into your workflows and specifically Llama Guard which provides a base model to filter input and output prompts to layer system-level safety on top of model-level safety. Please see the Responsible Use Guide available at ## Citation instructions @article{llama3modelcard, title={Llama 3 Model Card}, author={AI@Meta}, year={2024}, url = { } ## Contributors Aaditya Singh; Aaron Grattafiori; Abhimanyu Dubey; Abhinav Jauhri; Abhinav Pandey; Abhishek Kadian; Adam Kelsey; Adi Gangidi; Ahmad Al-Dahle; Ahuva Goldstand; Aiesha Letman; Ajay Menon; Akhil Mathur; Alan Schelten; Alex Vaughan; Amy Yang; Andrei Lupu; Andres Alvarado; Andrew Gallagher; Andrew Gu; Andrew Ho; Andrew Poulton; Andrew Ryan; Angela Fan; Ankit Ramchandani; Anthony Hartshorn; Archi Mitra; Archie Sravankumar; Artem Korenev; Arun Rao; Ashley Gabriel; Ashwin Bharambe; Assaf Eisenman; Aston Zhang; Aurelien Rodriguez; Austen Gregerson; Ava Spataru; Baptiste Roziere; Ben Maurer; Benjamin Leonhardi; Bernie Huang; Bhargavi Paranjape; Bing Liu; Binh Tang; Bobbie Chern; Brani Stojkovic; Brian Fuller; Catalina Mejia Arenas; Chao Zhou; Charlotte Caucheteux; Chaya Nayak; Ching-Hsiang Chu; Chloe Bi; Chris Cai; Chris Cox; Chris Marra; Chris McConnell; Christian Keller; Christoph Feichtenhofer; Christophe Touret; Chunyang Wu; Corinne Wong; Cristian Canton Ferrer; Damien Allonsius; Daniel Kreymer; Daniel Haziza; Daniel Li; Danielle Pintz; Danny Livshits; Danny Wyatt; David Adkins; David Esiobu; David Xu; Davide Testuggine; Delia David; Devi Parikh; Dhruv Choudhary; Dhruv Mahajan; Diana Liskovich; Diego Garcia-Olano; Diego Perino; Dieuwke Hupkes; Dingkang Wang; Dustin Holland; Egor Lakomkin; Elina Lobanova; Xiaoqing Ellen Tan; Emily Dinan; Eric Smith; Erik Brinkman; Esteban Arcaute; Filip Radenovic; Firat Ozgenel; Francesco Caggioni; Frank Seide; Frank Zhang; Gabriel Synnaeve; Gabriella Schwarz; Gabrielle Lee; Gada Badeer; Georgia Anderson; Graeme Nail; Gregoire Mialon; Guan Pang; Guillem Cucurell; Hailey Nguyen; Hannah Korevaar; Hannah Wang; Haroun Habeeb; Harrison Rudolph; Henry Aspegren; Hu Xu; Hugo Touvron; Iga Kozlowska; Igor Molybog; Igor Tufanov; Iliyan Zarov; Imanol Arrieta Ibarra; Irina-Elena Veliche; Isabel Kloumann; Ishan Misra; Ivan Evtimov; Jacob Xu; Jade Copet; Jake Weissman; Jan Geffert; Jana Vranes; Japhet Asher; Jason Park; Jay Mahadeokar; Jean-Baptiste Gaya; Jeet Shah; Jelmer van der Linde; Jennifer Chan; Jenny Hong; Jenya Lee; Jeremy Fu; Jeremy Teboul; Jianfeng Chi; Jianyu Huang; Jie Wang; Jiecao Yu; Joanna Bitton; Joe Spisak; Joelle Pineau; Jon Carvill; Jongsoo Park; Joseph Rocca; Joshua Johnstun; Junteng Jia; Kalyan Vasuden Alwala; Kam Hou U; Kate Plawiak; Kartikeya Upasani; Kaushik Veeraraghavan; Ke Li; Kenneth Heafield; Kevin Stone; Khalid El-Arini; Krithika Iyer; Kshitiz Malik; Kuenley Chiu; Kunal Bhalla; Kyle Huang; Lakshya Garg; Lauren Rantala-Yeary; Laurens van der Maaten; Lawrence Chen; Leandro Silva; Lee Bell; Lei Zhang; Liang Tan; Louis Martin; Lovish 
Madaan; Luca Wehrstedt; Lukas Blecher; Luke de Oliveira; Madeline Muzzi; Madian Khabsa; Manav Avlani; Mannat Singh; Manohar Paluri; Mark Zuckerberg; Marcin Kardas; Martynas Mankus; Mathew Oldham; Mathieu Rita; Matthew Lennie; Maya Pavlova; Meghan Keneally; Melanie Kambadur; Mihir Patel; Mikayel Samvelyan; Mike Clark; Mike Lewis; Min Si; Mitesh Kumar Singh; Mo Metanat; Mona Hassan; Naman Goyal; Narjes Torabi; Nicolas Usunier; Nikolay Bashlykov; Nikolay Bogoychev; Niladri Chatterji; Ning Dong; Oliver Aobo Yang; Olivier Duchenne; Onur Celebi; Parth Parekh; Patrick Alrassy; Paul Saab; Pavan Balaji; Pedro Rittner; Pengchuan Zhang; Pengwei Li; Petar Vasic; Peter Weng; Polina Zvyagina; Prajjwal Bhargava; Pratik Dubal; Praveen Krishnan; Punit Singh Koura; Qing He; Rachel Rodriguez; Ragavan Srinivasan; Rahul Mitra; Ramon Calderer; Raymond Li; Robert Stojnic; Roberta Raileanu; Robin Battey; Rocky Wang; Rohit Girdhar; Rohit Patel; Romain Sauvestre; Ronnie Polidoro; Roshan Sumbaly; Ross Taylor; Ruan Silva; Rui Hou; Rui Wang; Russ Howes; Ruty Rinott; Saghar Hosseini; Sai Jayesh Bondu; Samyak Datta; Sanjay Singh; Sara Chugh; Sargun Dhillon; Satadru Pan; Sean Bell; Sergey Edunov; Shaoliang Nie; Sharan Narang; Sharath Raparthy; Shaun Lindsay; Sheng Feng; Sheng Shen; Shenghao Lin; Shiva Shankar; Shruti Bhosale; Shun Zhang; Simon Vandenhende; Sinong Wang; Seohyun Sonia Kim; Soumya Batra; Sten Sootla; Steve Kehoe; Suchin Gururangan; Sumit Gupta; Sunny Virk; Sydney Borodinsky; Tamar Glaser; Tamar Herman; Tamara Best; Tara Fowler; Thomas Georgiou; Thomas Scialom; Tianhe Li; Todor Mihaylov; Tong Xiao; Ujjwal Karn; Vedanuj Goswami; Vibhor Gupta; Vignesh Ramanathan; Viktor Kerkez; Vinay Satish Kumar; Vincent Gonguet; Vish Vogeti; Vlad Poenaru; Vlad Tiberiu Mihailescu; Vladan Petrovic; Vladimir Ivanov; Wei Li; Weiwei Chu; Wenhan Xiong; Wenyin Fu; Wes Bouaziz; Whitney Meers; Will Constable; Xavier Martinet; Xiaojian Wu; Xinbo Gao; Xinfeng Xie; Xuchao Jia; Yaelle Goldschlag; Yann LeCun; Yashesh Gaur; Yasmine Babaei; Ye Qi; Yenda Li; Yi Wen; Yiwen Song; Youngjin Nam; Yuchen Hao; Yuchen Zhang; Yun Wang; Yuning Mao; Yuzi He; Zacharie Delpierre Coudert; Zachary DeVito; Zahra Hankir; Zhaoduo Wen; Zheng Yan; Zhengxing Chen; Zhenyu Yang; Zoe Papakipos", + "model_explanation_gemini": "A text-generation model designed for instruction-following tasks, built on Meta's Llama-3 architecture with an 8B parameter scale, licensed under specific terms for community use.\n\n**Features:** \n- Text-generation capability \n- Instruction-following focus \n- 8-billion-parameter scale \n- Llama-3 architecture \n- Community license with redistribution terms \n\n**Comparison:** \nPart of Meta's Llama-3 series, this 8B instruction-tuned variant is smaller" +} \ No newline at end of file diff --git a/model_data_json/meta-llama_Meta-Llama-3-8B.json b/model_data_json/meta-llama_Meta-Llama-3-8B.json new file mode 100644 index 0000000000000000000000000000000000000000..933fe0b992078487256a8992d48ea38f71e7aca3 --- /dev/null +++ b/model_data_json/meta-llama_Meta-Llama-3-8B.json @@ -0,0 +1,22 @@ +{ + "model_id": "meta-llama/Meta-Llama-3-8B", + "downloads": 402236, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "facebook", + "meta", + "pytorch", + "llama-3", + "en", + "license:llama3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en pipeline_tag: text-generation tags: - facebook - meta - pytorch - llama - llama-3 license: llama3 
new_version: meta-llama/Llama-3.1-8B extra_gated_prompt: >- ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT Meta Llama 3 Version Release Date: April 18, 2024 \"Agreement\" means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein. \"Documentation\" means the specifications, manuals and documentation accompanying Meta Llama 3 distributed by Meta at \"Licensee\" or \"you\" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf. \"Meta Llama 3\" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at \"Llama Materials\" means, collectively, Meta’s proprietary Meta Llama 3 and Documentation (and any portion thereof) made available under this Agreement. \"Meta\" or \"we\" means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland). 1. License Rights and Redistribution. a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials. b. Redistribution and Use. i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service that uses any of them, including another AI model, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name. ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you. iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.” iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at which is hereby incorporated by reference into this Agreement. v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof). 2. Additional Commercial Terms. 
If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights. 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS. 4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 5. Intellectual Property. a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama 3” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at ). All goodwill arising out of your use of the Mark will inure to the benefit of Meta. b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications. c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Meta Llama 3 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials. 6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. 
Sections 3, 4 and 7 shall survive the termination of this Agreement. 7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement. ### Meta Llama 3 Acceptable Use Policy Meta is committed to promoting safe and fair use of its tools and features, including Meta Llama 3. If you access or use Meta Llama 3, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at #### Prohibited Uses We want everyone to use Meta Llama 3 safely and responsibly. You agree you will not use, or allow others to use, Meta Llama 3 to: 1. Violate the law or others’ rights, including to: 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as: 1. Violence or terrorism 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material 3. Human trafficking, exploitation, and sexual violence 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials. 5. Sexual solicitation 6. Any other criminal activity 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Meta Llama 3 related to the following: 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State 2. Guns and illegal weapons (including weapon development) 3. Illegal drugs and regulated/controlled substances 4. Operation of critical infrastructure, transportation technologies, or heavy machinery 5. Self-harm or harm to others, including suicide, cutting, and eating disorders 6. 
Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual 3. Intentionally deceive or mislead others, including use of Meta Llama 3 related to the following: 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content 3. Generating, promoting, or further distributing spam 4. Impersonating another individual without consent, authorization, or legal right 5. Representing that the use of Meta Llama 3 or outputs are human-generated 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement 4. Fail to appropriately disclose to end users any known dangers of your AI system Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means: * Reporting issues with the model: * Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback * Reporting bugs and security concerns: facebook.com/whitehat/info * Reporting violations of the Acceptable Use Policy or unlicensed uses of Meta Llama 3: LlamaUseReport@meta.com extra_gated_fields: First Name: text Last Name: text Date of birth: date_picker Country: country Affiliation: text geo: ip_location By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the Meta Privacy Policy. extra_gated_button_content: Submit --- ## Model Details Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. **Model developers** Meta **Variations** Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants. **Input** Models input text only. **Output** Models generate text and code only. **Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
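The grouped-query attention flagged in the table below can be checked directly from the model config without downloading weights. A minimal sketch, assuming `transformers` is installed and the gated repo is accessible with a Hugging Face token whose account has accepted the license:

```python
# Hedged sketch: read architecture facts (including the GQA flag in the table
# below) from config.json alone. Assumes `transformers` is installed and the
# gated meta-llama repo is accessible with an accepted-license HF token.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")
print(config.model_type)           # "llama"
print(config.num_attention_heads)  # number of query heads
print(config.num_key_value_heads)  # fewer KV heads than query heads => GQA
```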
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| Llama 3 | A new mix of publicly available online data. | 70B | 8k | Yes | 15T+ | December, 2023 |
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date** April 18, 2024. **Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License** A custom commercial license is available at: **Where to send questions or comments about the model** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English**. **Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy. ## How to use This repository contains two versions of Meta-Llama-3-8B, for use with transformers and with the original codebase. ### Use with transformers See the snippet below for usage with Transformers (a minimal sketch is reproduced after the Training Data section, below the carbon-emissions table): ### Use with the original codebase Please follow the instructions in the repository. To download Original checkpoints, see the example command below leveraging : For Hugging Face support, we recommend using transformers or TGI, but a similar command works. ## Hardware and Software **Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute. **Carbon Footprint** Pretraining utilized a cumulative 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta's sustainability program.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|---|---|---|---|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others. ## Training Data **Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. **Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively.
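Returning to the "How to use" section above: the Transformers snippet it references was stripped during extraction. A minimal sketch of the usual pattern, assuming the public `meta-llama/Meta-Llama-3-8B` checkpoint id and a `transformers` release with Llama 3 support:

```python
# Minimal sketch, not the card's original snippet: text generation with the
# transformers pipeline. Assumes `pip install transformers torch` and access
# to the gated meta-llama/Meta-Llama-3-8B repo.
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},  # halves memory vs. fp32
    device_map="auto",                             # place layers on available devices
)
print(pipeline("Hey how are you doing today?")[0]["generated_text"])
```

Note that this base model is a raw text completer, so prompts are continued rather than answered; the instruction tuned variant (see the sketch after the instruction tuned benchmarks below) is the one intended for chat.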
## Benchmarks In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here. (An illustrative reproduction sketch with public tooling follows the first table below.)

### Base pretrained models

| Category | Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|---|---|---|---|---|---|---|
| General | MMLU (5-shot) | 66.6 | 45.7 | 53.8 | 79.5 | 69.7 |
| | AGIEval English (3-5 shot) | 45.9 | 28.8 | 38.7 | 63.0 | 54.8 |
| | CommonSenseQA (7-shot) | 72.6 | 57.6 | 67.6 | 83.8 | 78.7 |
| | Winogrande (5-shot) | 76.1 | 73.3 | 75.4 | 83.1 | 81.8 |
| | BIG-Bench Hard (3-shot, CoT) | 61.1 | 38.1 | 47.0 | 81.3 | 65.7 |
| | ARC-Challenge (25-shot) | 78.6 | 53.7 | 67.6 | 93.0 | 85.3 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.5 | 72.1 | 79.6 | 89.7 | 87.5 |
| Reading comprehension | SQuAD (1-shot) | 76.4 | 72.2 | 72.1 | 85.6 | 82.6 |
| | QuAC (1-shot, F1) | 44.4 | 39.6 | 44.9 | 51.1 | 49.4 |
| | BoolQ (0-shot) | 75.7 | 65.5 | 66.9 | 79.0 | 73.1 |
| | DROP (3-shot, F1) | 58.4 | 37.9 | 49.8 | 79.7 | 70.2 |
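The card attributes these numbers to Meta's internal evaluations library, which is not public. Purely as an illustration, a roughly comparable 5-shot MMLU run with EleutherAI's lm-evaluation-harness might look like the sketch below; scores obtained this way will not match the table exactly:

```python
# Illustration only: the table above comes from Meta's internal evaluations
# library, so a public harness will not reproduce it exactly. Assumes
# `pip install lm-eval` (EleutherAI's lm-evaluation-harness, v0.4+).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B,dtype=bfloat16",
    tasks=["mmlu"],   # mirrors the 5-shot MMLU row above
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["mmlu"])
```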
### Instruction tuned models
| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|---|---|---|---|---|---|
| MMLU (5-shot) | 68.4 | 34.1 | 47.8 | 82.0 | 52.9 |
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |
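The instruction tuned variants scored above are optimized for dialogue, and generation for them normally goes through the tokenizer's chat template. A minimal sketch, assuming the public `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint (a sibling repo, not the base model this card centers on):

```python
# Hedged sketch for the instruction tuned variant: the tokenizer's chat
# template renders a list of messages into Llama 3's dialogue prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize grouped-query attention in one sentence."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```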
### Responsibility & Safety We believe that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to Responsible AI development and took a series of steps to limit misuse and harm and support the open source community. Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases, out-of-the-box, as those by their nature will differ across different applications. Rather, responsible LLM-application deployment is achieved by implementing a series of safety best practices throughout the development of such applications, from the model pre-training, fine-tuning and the deployment of systems composed of safeguards to tailor the safety needs specifically to the use case and audience. As part of the Llama 3 release, we updated our Responsible Use Guide to outline the steps and best practices for developers to implement model and system level safety for their application. We also provide a set of resources including Meta Llama Guard 2 and Code Shield safeguards. These tools have proven to drastically reduce residual risks of LLM Systems, while maintaining a high level of helpfulness. We encourage developers to tune and deploy these safeguards according to their needs and we provide a reference implementation to get you started. #### Llama 3-Instruct As outlined in the Responsible Use Guide, some trade-off between model helpfulness and model alignment is likely unavoidable. Developers should exercise discretion about how to weigh the benefits of alignment and helpfulness for their specific use case and audience. Developers should be mindful of residual risks when using Llama models and leverage additional safety tools as needed to reach the right safety bar for their use case. Safety For our instruction tuned model, we conducted extensive red teaming exercises, performed adversarial evaluations and implemented safety mitigations techniques to lower residual risks. As with any Large Language Model, residual risks will likely remain and we recommend that developers assess these risks in the context of their use case. In parallel, we are working with the community to make AI safety benchmark standards transparent, rigorous and interpretable. Refusals In addition to residual risks, we put a great emphasis on model refusals to benign prompts. Over-refusing not only can impact the user experience but could even be harmful in certain contexts as well. We’ve heard the feedback from the developer community and improved our fine tuning to ensure that Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. We built internal benchmarks and developed mitigations to limit false refusals making Llama 3 our most helpful model to date. #### Responsible release In addition to responsible use considerations outlined above, we followed a rigorous process that requires us to take extra measures against misuse and critical risks before we make our release decision. Misuse If you access or use Llama 3, you agree to the Acceptable Use Policy. 
The most recent copy of this policy can be found at #### Critical risks CBRNE (Chemical, Biological, Radiological, Nuclear, and high-yield Explosives) We have conducted a twofold assessment of the safety of the model in this area: * Iterative testing during model training to assess the safety of responses related to CBRNE threats and other adversarial risks. * Involving external CBRNE experts to conduct an uplift test assessing the ability of the model to accurately provide expert knowledge and reduce barriers to potential CBRNE misuse, by reference to what can be achieved using web search (without the model). ### Cyber Security We have evaluated Llama 3 with CyberSecEval, Meta’s cybersecurity safety eval suite, measuring Llama 3’s propensity to suggest insecure code when used as a coding assistant, and Llama 3’s propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry-standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range or safer than models of equivalent coding capability. ### Child Safety Child Safety risk assessments were conducted using a team of experts to assess the model’s capability to produce outputs that could result in Child Safety risks and to inform any necessary and appropriate risk mitigations via fine tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances or experiences. ### Community Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use and are widely distributed across ecosystem partners, including cloud service providers. We encourage community contributions to our GitHub repository. Finally, we put in place a set of resources, including an output reporting mechanism and a bug bounty program, to continuously improve the Llama technology with the help of the community. ## Ethical Considerations and Limitations The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3 is a new technology, and like any new technology, there are risks associated with its use.
Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating Purple Llama solutions into your workflows and specifically Llama Guard which provides a base model to filter input and output prompts to layer system-level safety on top of model-level safety. Please see the Responsible Use Guide available at ## Citation instructions @article{llama3modelcard, title={Llama 3 Model Card}, author={AI@Meta}, year={2024}, url = { } ## Contributors Aaditya Singh; Aaron Grattafiori; Abhimanyu Dubey; Abhinav Jauhri; Abhinav Pandey; Abhishek Kadian; Adam Kelsey; Adi Gangidi; Ahmad Al-Dahle; Ahuva Goldstand; Aiesha Letman; Ajay Menon; Akhil Mathur; Alan Schelten; Alex Vaughan; Amy Yang; Andrei Lupu; Andres Alvarado; Andrew Gallagher; Andrew Gu; Andrew Ho; Andrew Poulton; Andrew Ryan; Angela Fan; Ankit Ramchandani; Anthony Hartshorn; Archi Mitra; Archie Sravankumar; Artem Korenev; Arun Rao; Ashley Gabriel; Ashwin Bharambe; Assaf Eisenman; Aston Zhang; Aurelien Rodriguez; Austen Gregerson; Ava Spataru; Baptiste Roziere; Ben Maurer; Benjamin Leonhardi; Bernie Huang; Bhargavi Paranjape; Bing Liu; Binh Tang; Bobbie Chern; Brani Stojkovic; Brian Fuller; Catalina Mejia Arenas; Chao Zhou; Charlotte Caucheteux; Chaya Nayak; Ching-Hsiang Chu; Chloe Bi; Chris Cai; Chris Cox; Chris Marra; Chris McConnell; Christian Keller; Christoph Feichtenhofer; Christophe Touret; Chunyang Wu; Corinne Wong; Cristian Canton Ferrer; Damien Allonsius; Daniel Kreymer; Daniel Haziza; Daniel Li; Danielle Pintz; Danny Livshits; Danny Wyatt; David Adkins; David Esiobu; David Xu; Davide Testuggine; Delia David; Devi Parikh; Dhruv Choudhary; Dhruv Mahajan; Diana Liskovich; Diego Garcia-Olano; Diego Perino; Dieuwke Hupkes; Dingkang Wang; Dustin Holland; Egor Lakomkin; Elina Lobanova; Xiaoqing Ellen Tan; Emily Dinan; Eric Smith; Erik Brinkman; Esteban Arcaute; Filip Radenovic; Firat Ozgenel; Francesco Caggioni; Frank Seide; Frank Zhang; Gabriel Synnaeve; Gabriella Schwarz; Gabrielle Lee; Gada Badeer; Georgia Anderson; Graeme Nail; Gregoire Mialon; Guan Pang; Guillem Cucurell; Hailey Nguyen; Hannah Korevaar; Hannah Wang; Haroun Habeeb; Harrison Rudolph; Henry Aspegren; Hu Xu; Hugo Touvron; Iga Kozlowska; Igor Molybog; Igor Tufanov; Iliyan Zarov; Imanol Arrieta Ibarra; Irina-Elena Veliche; Isabel Kloumann; Ishan Misra; Ivan Evtimov; Jacob Xu; Jade Copet; Jake Weissman; Jan Geffert; Jana Vranes; Japhet Asher; Jason Park; Jay Mahadeokar; Jean-Baptiste Gaya; Jeet Shah; Jelmer van der Linde; Jennifer Chan; Jenny Hong; Jenya Lee; Jeremy Fu; Jeremy Teboul; Jianfeng Chi; Jianyu Huang; Jie Wang; Jiecao Yu; Joanna Bitton; Joe Spisak; Joelle Pineau; Jon Carvill; Jongsoo Park; Joseph Rocca; Joshua Johnstun; Junteng Jia; Kalyan Vasuden Alwala; Kam Hou U; Kate Plawiak; Kartikeya Upasani; Kaushik Veeraraghavan; Ke Li; Kenneth Heafield; Kevin Stone; Khalid El-Arini; Krithika Iyer; Kshitiz Malik; Kuenley Chiu; Kunal Bhalla; Kyle Huang; Lakshya Garg; Lauren Rantala-Yeary; Laurens van der Maaten; Lawrence Chen; Leandro Silva; Lee Bell; Lei Zhang; Liang Tan; Louis Martin; Lovish 
Madaan; Luca Wehrstedt; Lukas Blecher; Luke de Oliveira; Madeline Muzzi; Madian Khabsa; Manav Avlani; Mannat Singh; Manohar Paluri; Mark Zuckerberg; Marcin Kardas; Martynas Mankus; Mathew Oldham; Mathieu Rita; Matthew Lennie; Maya Pavlova; Meghan Keneally; Melanie Kambadur; Mihir Patel; Mikayel Samvelyan; Mike Clark; Mike Lewis; Min Si; Mitesh Kumar Singh; Mo Metanat; Mona Hassan; Naman Goyal; Narjes Torabi; Nicolas Usunier; Nikolay Bashlykov; Nikolay Bogoychev; Niladri Chatterji; Ning Dong; Oliver Aobo Yang; Olivier Duchenne; Onur Celebi; Parth Parekh; Patrick Alrassy; Paul Saab; Pavan Balaji; Pedro Rittner; Pengchuan Zhang; Pengwei Li; Petar Vasic; Peter Weng; Polina Zvyagina; Prajjwal Bhargava; Pratik Dubal; Praveen Krishnan; Punit Singh Koura; Qing He; Rachel Rodriguez; Ragavan Srinivasan; Rahul Mitra; Ramon Calderer; Raymond Li; Robert Stojnic; Roberta Raileanu; Robin Battey; Rocky Wang; Rohit Girdhar; Rohit Patel; Romain Sauvestre; Ronnie Polidoro; Roshan Sumbaly; Ross Taylor; Ruan Silva; Rui Hou; Rui Wang; Russ Howes; Ruty Rinott; Saghar Hosseini; Sai Jayesh Bondu; Samyak Datta; Sanjay Singh; Sara Chugh; Sargun Dhillon; Satadru Pan; Sean Bell; Sergey Edunov; Shaoliang Nie; Sharan Narang; Sharath Raparthy; Shaun Lindsay; Sheng Feng; Sheng Shen; Shenghao Lin; Shiva Shankar; Shruti Bhosale; Shun Zhang; Simon Vandenhende; Sinong Wang; Seohyun Sonia Kim; Soumya Batra; Sten Sootla; Steve Kehoe; Suchin Gururangan; Sumit Gupta; Sunny Virk; Sydney Borodinsky; Tamar Glaser; Tamar Herman; Tamara Best; Tara Fowler; Thomas Georgiou; Thomas Scialom; Tianhe Li; Todor Mihaylov; Tong Xiao; Ujjwal Karn; Vedanuj Goswami; Vibhor Gupta; Vignesh Ramanathan; Viktor Kerkez; Vinay Satish Kumar; Vincent Gonguet; Vish Vogeti; Vlad Poenaru; Vlad Tiberiu Mihailescu; Vladan Petrovic; Vladimir Ivanov; Wei Li; Weiwei Chu; Wenhan Xiong; Wenyin Fu; Wes Bouaziz; Whitney Meers; Will Constable; Xavier Martinet; Xiaojian Wu; Xinbo Gao; Xinfeng Xie; Xuchao Jia; Yaelle Goldschlag; Yann LeCun; Yashesh Gaur; Yasmine Babaei; Ye Qi; Yenda Li; Yi Wen; Yiwen Song; Youngjin Nam; Yuchen Hao; Yuchen Zhang; Yun Wang; Yuning Mao; Yuzi He; Zacharie Delpierre Coudert; Zachary DeVito; Zahra Hankir; Zhaoduo Wen; Zheng Yan; Zhengxing Chen; Zhenyu Yang; Zoe Papakipos", + "model_explanation_gemini": "Generates text in English using a large language model architecture (Llama-3) with 8 billion parameters, licensed under Meta's community agreement.\n\nFeatures: \n- **Text-generation** pipeline \n- English language support \n- 8B parameter scale \n- Released under Meta's Llama 3 Community License \n- Requires attribution and compliance with usage policies \n\nComparison (if inferred from description): \nNewer version (Llama-3.1-8B) exists, suggesting iterative improvements" +} \ No newline at end of file diff --git a/model_data_json/minishlab_M2V_base_output.json b/model_data_json/minishlab_M2V_base_output.json new file mode 100644 index 0000000000000000000000000000000000000000..18668febaa9a4cc9fd48a1e482c1e2970f035773 --- /dev/null +++ b/model_data_json/minishlab_M2V_base_output.json @@ -0,0 +1,21 @@ +{ + "model_id": "minishlab/M2V_base_output", + "downloads": 73447, + "tags": [ + "model2vec", + "onnx", + "safetensors", + "embeddings", + "static-embeddings", + "mteb", + "sentence-transformers", + "en", + "base_model:BAAI/bge-base-en-v1.5", + "base_model:quantized:BAAI/bge-base-en-v1.5", + "license:mit", + "model-index", + "region:us" + ], + "description": "--- base_model: - BAAI/bge-base-en-v1.5 language: - en library_name: model2vec license: 
mit model-index: - name: M2V_base_output results: - dataset: config: en-ext name: MTEB AmazonCounterfactualClassification (en-ext) revision: e8379541af4e31359cca9fbcf4b00f2671dba205 split: test type: mteb/amazon_counterfactual metrics: - type: accuracy value: 69.1904047976012 - type: ap value: 19.610682715583142 - type: ap_weighted value: 19.610682715583142 - type: f1 value: 57.14831247701502 - type: f1_weighted value: 75.0407024695743 - type: main_score value: 69.1904047976012 task: type: Classification - dataset: config: en name: MTEB AmazonCounterfactualClassification (en) revision: e8379541af4e31359cca9fbcf4b00f2671dba205 split: test type: mteb/amazon_counterfactual metrics: - type: accuracy value: 71.1044776119403 - type: ap value: 33.83428171392154 - type: ap_weighted value: 33.83428171392154 - type: f1 value: 65.18431700199532 - type: f1_weighted value: 73.90467162513829 - type: main_score value: 71.1044776119403 task: type: Classification - dataset: config: default name: MTEB AmazonPolarityClassification (default) revision: e2d317d38cd51312af73b3d32a06d1a08b442046 split: test type: mteb/amazon_polarity metrics: - type: accuracy value: 67.328075 - type: ap value: 62.26238067958846 - type: ap_weighted value: 62.26238067958846 - type: f1 value: 66.93195816551996 - type: f1_weighted value: 66.93195816551996 - type: main_score value: 67.328075 task: type: Classification - dataset: config: en name: MTEB AmazonReviewsClassification (en) revision: 1399c76144fd37290681b995c656ef9b2e06e26d split: test type: mteb/amazon_reviews_multi metrics: - type: accuracy value: 32.589999999999996 - type: f1 value: 32.11760053698346 - type: f1_weighted value: 32.11760053698346 - type: main_score value: 32.589999999999996 task: type: Classification - dataset: config: default name: MTEB ArguAna (default) revision: c22ab2a51041ffd869aaddef7af8d8215647e41a split: test type: mteb/arguana metrics: - type: main_score value: 29.183999999999997 - type: map_at_1 value: 14.011000000000001 - type: map_at_10 value: 23.748 - type: map_at_100 value: 24.808 - type: map_at_1000 value: 24.89 - type: map_at_20 value: 24.354 - type: map_at_3 value: 20.721 - type: map_at_5 value: 22.509 - type: mrr_at_1 value: 14.509246088193455 - type: mrr_at_10 value: 23.930067285330413 - type: mrr_at_100 value: 24.990313023015393 - type: mrr_at_1000 value: 25.071881804001343 - type: mrr_at_20 value: 24.53573559987519 - type: mrr_at_3 value: 20.88667614983403 - type: mrr_at_5 value: 22.7038880986249 - type: nauc_map_at_1000_diff1 value: 10.066441521146057 - type: nauc_map_at_1000_max value: -0.5837671794505647 - type: nauc_map_at_1000_std value: 12.356714430015906 - type: nauc_map_at_100_diff1 value: 10.076633271522182 - type: nauc_map_at_100_max value: -0.5731496124067438 - type: nauc_map_at_100_std value: 12.415984202967115 - type: nauc_map_at_10_diff1 value: 9.867302245745831 - type: nauc_map_at_10_max value: -0.8261964947948097 - type: nauc_map_at_10_std value: 11.57502900905332 - type: nauc_map_at_1_diff1 value: 10.389795558592775 - type: nauc_map_at_1_max value: -4.511506238918001 - type: nauc_map_at_1_std value: 9.62435943787401 - type: nauc_map_at_20_diff1 value: 10.114926370948476 - type: nauc_map_at_20_max value: -0.38257232900731064 - type: nauc_map_at_20_std value: 12.070421408069302 - type: nauc_map_at_3_diff1 value: 8.840416555242445 - type: nauc_map_at_3_max value: -2.284214343720665 - type: nauc_map_at_3_std value: 9.41211373407306 - type: nauc_map_at_5_diff1 value: 9.4616046565665 - type: nauc_map_at_5_max value: 
-1.8580221033457682 - type: nauc_map_at_5_std value: 10.252697423331279 - type: nauc_mrr_at_1000_diff1 value: 8.50590042077137 - type: nauc_mrr_at_1000_max value: -0.9532348980220058 - type: nauc_mrr_at_1000_std value: 11.917718432821042 - type: nauc_mrr_at_100_diff1 value: 8.519603663729045 - type: nauc_mrr_at_100_max value: -0.941843377489153 - type: nauc_mrr_at_100_std value: 11.977460275257405 - type: nauc_mrr_at_10_diff1 value: 8.324129262175067 - type: nauc_mrr_at_10_max value: -1.1819451563051036 - type: nauc_mrr_at_10_std value: 11.143112974385687 - type: nauc_mrr_at_1_diff1 value: 7.923019186157461 - type: nauc_mrr_at_1_max value: -3.8622428906009336 - type: nauc_mrr_at_1_std value: 8.574254762702411 - type: nauc_mrr_at_20_diff1 value: 8.57172824197632 - type: nauc_mrr_at_20_max value: -0.7479018550868611 - type: nauc_mrr_at_20_std value: 11.638538106885681 - type: nauc_mrr_at_3_diff1 value: 7.176947665978892 - type: nauc_mrr_at_3_max value: -2.8140949706898937 - type: nauc_mrr_at_3_std value: 8.966233266672026 - type: nauc_mrr_at_5_diff1 value: 7.921651668561097 - type: nauc_mrr_at_5_max value: -2.1687598838347353 - type: nauc_mrr_at_5_std value: 9.810384238460967 - type: nauc_ndcg_at_1000_diff1 value: 11.09862326017166 - type: nauc_ndcg_at_1000_max value: 1.6567266738852608 - type: nauc_ndcg_at_1000_std value: 16.06391490264334 - type: nauc_ndcg_at_100_diff1 value: 11.372692796637454 - type: nauc_ndcg_at_100_max value: 1.8759976608604172 - type: nauc_ndcg_at_100_std value: 17.653326421438013 - type: nauc_ndcg_at_10_diff1 value: 10.629937509771837 - type: nauc_ndcg_at_10_max value: 1.3739681707601088 - type: nauc_ndcg_at_10_std value: 13.688730163159986 - type: nauc_ndcg_at_1_diff1 value: 10.389795558592775 - type: nauc_ndcg_at_1_max value: -4.511506238918001 - type: nauc_ndcg_at_1_std value: 9.62435943787401 - type: nauc_ndcg_at_20_diff1 value: 11.486521194068173 - type: nauc_ndcg_at_20_max value: 2.855255358038754 - type: nauc_ndcg_at_20_std value: 15.394981206314688 - type: nauc_ndcg_at_3_diff1 value: 8.680000272030385 - type: nauc_ndcg_at_3_max value: -1.6634044566640975 - type: nauc_ndcg_at_3_std value: 9.268472321517171 - type: nauc_ndcg_at_5_diff1 value: 9.711071086647511 - type: nauc_ndcg_at_5_max value: -0.9491120105126298 - type: nauc_ndcg_at_5_std value: 10.68847112511071 - type: nauc_precision_at_1000_diff1 value: 20.67453341943155 - type: nauc_precision_at_1000_max value: 21.6433346658854 - type: nauc_precision_at_1000_std value: 50.563552510430355 - type: nauc_precision_at_100_diff1 value: 17.05138860576984 - type: nauc_precision_at_100_max value: 10.671778777967742 - type: nauc_precision_at_100_std value: 42.815464007080514 - type: nauc_precision_at_10_diff1 value: 12.834245751753656 - type: nauc_precision_at_10_max value: 7.237728992777975 - type: nauc_precision_at_10_std value: 19.637476638724 - type: nauc_precision_at_1_diff1 value: 10.389795558592775 - type: nauc_precision_at_1_max value: -4.511506238918001 - type: nauc_precision_at_1_std value: 9.62435943787401 - type: nauc_precision_at_20_diff1 value: 15.960793242410434 - type: nauc_precision_at_20_max value: 12.642865380113017 - type: nauc_precision_at_20_std value: 25.900201704789065 - type: nauc_precision_at_3_diff1 value: 8.364265704499747 - type: nauc_precision_at_3_max value: -0.20060414550763578 - type: nauc_precision_at_3_std value: 8.910638511394128 - type: nauc_precision_at_5_diff1 value: 10.43686249937682 - type: nauc_precision_at_5_max value: 1.2061629814752834 - type: nauc_precision_at_5_std 
value: 11.812984132266987 - type: nauc_recall_at_1000_diff1 value: 20.674533419431576 - type: nauc_recall_at_1000_max value: 21.643334665885174 - type: nauc_recall_at_1000_std value: 50.563552510430256 - type: nauc_recall_at_100_diff1 value: 17.05138860576987 - type: nauc_recall_at_100_max value: 10.671778777967747 - type: nauc_recall_at_100_std value: 42.81546400708045 - type: nauc_recall_at_10_diff1 value: 12.83424575175363 - type: nauc_recall_at_10_max value: 7.237728992777978 - type: nauc_recall_at_10_std value: 19.637476638724007 - type: nauc_recall_at_1_diff1 value: 10.389795558592775 - type: nauc_recall_at_1_max value: -4.511506238918001 - type: nauc_recall_at_1_std value: 9.62435943787401 - type: nauc_recall_at_20_diff1 value: 15.960793242410464 - type: nauc_recall_at_20_max value: 12.642865380113033 - type: nauc_recall_at_20_std value: 25.900201704789094 - type: nauc_recall_at_3_diff1 value: 8.364265704499777 - type: nauc_recall_at_3_max value: -0.2006041455076358 - type: nauc_recall_at_3_std value: 8.910638511394144 - type: nauc_recall_at_5_diff1 value: 10.436862499376828 - type: nauc_recall_at_5_max value: 1.2061629814752328 - type: nauc_recall_at_5_std value: 11.81298413226698 - type: ndcg_at_1 value: 14.011000000000001 - type: ndcg_at_10 value: 29.183999999999997 - type: ndcg_at_100 value: 34.618 - type: ndcg_at_1000 value: 37.006 - type: ndcg_at_20 value: 31.371 - type: ndcg_at_3 value: 22.991 - type: ndcg_at_5 value: 26.244 - type: precision_at_1 value: 14.011000000000001 - type: precision_at_10 value: 4.651000000000001 - type: precision_at_100 value: 0.7250000000000001 - type: precision_at_1000 value: 0.092 - type: precision_at_20 value: 2.7560000000000002 - type: precision_at_3 value: 9.862 - type: precision_at_5 value: 7.510999999999999 - type: recall_at_1 value: 14.011000000000001 - type: recall_at_10 value: 46.515 - type: recall_at_100 value: 72.54599999999999 - type: recall_at_1000 value: 91.821 - type: recall_at_20 value: 55.120999999999995 - type: recall_at_3 value: 29.587000000000003 - type: recall_at_5 value: 37.553 task: type: Retrieval - dataset: config: default name: MTEB ArxivClusteringP2P (default) revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d split: test type: mteb/arxiv-clustering-p2p metrics: - type: main_score value: 31.259738106366225 - type: v_measure value: 31.259738106366225 - type: v_measure_std value: 14.320141623571129 task: type: Clustering - dataset: config: default name: MTEB ArxivClusteringS2S (default) revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 split: test type: mteb/arxiv-clustering-s2s metrics: - type: main_score value: 20.744213693691467 - type: v_measure value: 20.744213693691467 - type: v_measure_std value: 15.404721116239472 task: type: Clustering - dataset: config: default name: MTEB AskUbuntuDupQuestions (default) revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 split: test type: mteb/askubuntudupquestions-reranking metrics: - type: main_score value: 51.62795895312553 - type: map value: 51.62795895312553 - type: mrr value: 65.83135470254582 - type: nAUC_map_diff1 value: 14.141914127697058 - type: nAUC_map_max value: 15.463053892954765 - type: nAUC_map_std value: 6.690591989325812 - type: nAUC_mrr_diff1 value: 17.935217602773022 - type: nAUC_mrr_max value: 20.50394658394339 - type: nAUC_mrr_std value: 11.867431280645176 task: type: Reranking - dataset: config: default name: MTEB BIOSSES (default) revision: d3fb88f8f02e40887cd149695127462bbcf29b4a split: test type: mteb/biosses-sts metrics: - type: cosine_pearson value: 
73.32741772202057 - type: cosine_spearman value: 73.42938398170034 - type: euclidean_pearson value: 52.53960842495785 - type: euclidean_spearman value: 55.20186022147138 - type: main_score value: 73.42938398170034 - type: manhattan_pearson value: 51.2857441475548 - type: manhattan_spearman value: 53.75062233475454 - type: pearson value: 73.32741772202057 - type: spearman value: 73.42938398170034 task: type: STS - dataset: config: default name: MTEB Banking77Classification (default) revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 split: test type: mteb/banking77 metrics: - type: accuracy value: 71.90909090909092 - type: f1 value: 71.98225635322173 - type: f1_weighted value: 71.98225635322173 - type: main_score value: 71.90909090909092 task: type: Classification - dataset: config: default name: MTEB BiorxivClusteringP2P (default) revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 split: test type: mteb/biorxiv-clustering-p2p metrics: - type: main_score value: 26.532893125445977 - type: v_measure value: 26.532893125445977 - type: v_measure_std value: 0.6837586171917341 task: type: Clustering - dataset: config: default name: MTEB BiorxivClusteringS2S (default) revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 split: test type: mteb/biorxiv-clustering-s2s metrics: - type: main_score value: 14.036948167749145 - type: v_measure value: 14.036948167749145 - type: v_measure_std value: 0.5714236374163745 task: type: Clustering - dataset: config: default name: MTEB CQADupstackAndroidRetrieval (default) revision: f46a197baaae43b4f621051089b82a364682dfeb split: test type: mteb/cqadupstack-android metrics: - type: main_score value: 28.679 - type: map_at_1 value: 18.546000000000003 - type: map_at_10 value: 24.42 - type: map_at_100 value: 25.495 - type: map_at_1000 value: 25.633 - type: map_at_20 value: 24.967 - type: map_at_3 value: 22.375 - type: map_at_5 value: 23.369999999999997 - type: mrr_at_1 value: 23.74821173104435 - type: mrr_at_10 value: 29.62997025228784 - type: mrr_at_100 value: 30.509005070582297 - type: mrr_at_1000 value: 30.57992301494201 - type: mrr_at_20 value: 30.087957677199494 - type: mrr_at_3 value: 27.944682880305205 - type: mrr_at_5 value: 28.70290891750119 - type: nauc_map_at_1000_diff1 value: 41.91741127467118 - type: nauc_map_at_1000_max value: 29.343811648500857 - type: nauc_map_at_1000_std value: -10.94124792488155 - type: nauc_map_at_100_diff1 value: 41.9257059722684 - type: nauc_map_at_100_max value: 29.312977236968447 - type: nauc_map_at_100_std value: -10.964994215476203 - type: nauc_map_at_10_diff1 value: 42.23276701935884 - type: nauc_map_at_10_max value: 28.927475882624865 - type: nauc_map_at_10_std value: -11.387774428133683 - type: nauc_map_at_1_diff1 value: 47.30172597053699 - type: nauc_map_at_1_max value: 29.662552695406873 - type: nauc_map_at_1_std value: -11.737219447429663 - type: nauc_map_at_20_diff1 value: 41.92458662433504 - type: nauc_map_at_20_max value: 29.174781873350845 - type: nauc_map_at_20_std value: -11.124043543527577 - type: nauc_map_at_3_diff1 value: 43.129372455872165 - type: nauc_map_at_3_max value: 28.848842418769422 - type: nauc_map_at_3_std value: -12.285962277168842 - type: nauc_map_at_5_diff1 value: 42.83044499601317 - type: nauc_map_at_5_max value: 28.98993975777227 - type: nauc_map_at_5_std value: -11.92018253024468 - type: nauc_mrr_at_1000_diff1 value: 40.82041172984889 - type: nauc_mrr_at_1000_max value: 30.480885490296473 - type: nauc_mrr_at_1000_std value: -12.106796913247855 - type: nauc_mrr_at_100_diff1 value: 40.80133713998306 - 
type: nauc_mrr_at_100_max value: 30.47453951479006 - type: nauc_mrr_at_100_std value: -12.124703479791053 - type: nauc_mrr_at_10_diff1 value: 41.09211981274445 - type: nauc_mrr_at_10_max value: 30.497262535612556 - type: nauc_mrr_at_10_std value: -12.563263045952947 - type: nauc_mrr_at_1_diff1 value: 45.0389906310178 - type: nauc_mrr_at_1_max value: 32.16914824564583 - type: nauc_mrr_at_1_std value: -13.19897745721674 - type: nauc_mrr_at_20_diff1 value: 40.821901422240764 - type: nauc_mrr_at_20_max value: 30.545295646645254 - type: nauc_mrr_at_20_std value: -12.196074023168364 - type: nauc_mrr_at_3_diff1 value: 41.57196675439484 - type: nauc_mrr_at_3_max value: 30.700923825692193 - type: nauc_mrr_at_3_std value: -13.269209066277213 - type: nauc_mrr_at_5_diff1 value: 41.591753620602994 - type: nauc_mrr_at_5_max value: 30.63135138641901 - type: nauc_mrr_at_5_std value: -12.87020601984748 - type: nauc_ndcg_at_1000_diff1 value: 38.92537692516828 - type: nauc_ndcg_at_1000_max value: 29.68260722943582 - type: nauc_ndcg_at_1000_std value: -8.602092840233484 - type: nauc_ndcg_at_100_diff1 value: 38.64203362764584 - type: nauc_ndcg_at_100_max value: 29.393224511276372 - type: nauc_ndcg_at_100_std value: -9.191485720275928 - type: nauc_ndcg_at_10_diff1 value: 39.88534566732229 - type: nauc_ndcg_at_10_max value: 28.986279143641227 - type: nauc_ndcg_at_10_std value: -11.318342616747607 - type: nauc_ndcg_at_1_diff1 value: 45.0389906310178 - type: nauc_ndcg_at_1_max value: 32.16914824564583 - type: nauc_ndcg_at_1_std value: -13.19897745721674 - type: nauc_ndcg_at_20_diff1 value: 38.94952491835268 - type: nauc_ndcg_at_20_max value: 29.206603792767904 - type: nauc_ndcg_at_20_std value: -10.304566017193741 - type: nauc_ndcg_at_3_diff1 value: 40.7977929353434 - type: nauc_ndcg_at_3_max value: 29.580955663728076 - type: nauc_ndcg_at_3_std value: -12.648223472095015 - type: nauc_ndcg_at_5_diff1 value: 40.74984554791671 - type: nauc_ndcg_at_5_max value: 29.59605805593679 - type: nauc_ndcg_at_5_std value: -12.139160076565458 - type: nauc_precision_at_1000_diff1 value: 4.7568680155941925 - type: nauc_precision_at_1000_max value: 7.5355032131826984 - type: nauc_precision_at_1000_std value: -2.0414131984483914 - type: nauc_precision_at_100_diff1 value: 11.527472092658552 - type: nauc_precision_at_100_max value: 21.514326888623554 - type: nauc_precision_at_100_std value: -2.625060194142745 - type: nauc_precision_at_10_diff1 value: 24.503150439921896 - type: nauc_precision_at_10_max value: 28.670536590094265 - type: nauc_precision_at_10_std value: -8.197131538769034 - type: nauc_precision_at_1_diff1 value: 45.0389906310178 - type: nauc_precision_at_1_max value: 32.16914824564583 - type: nauc_precision_at_1_std value: -13.19897745721674 - type: nauc_precision_at_20_diff1 value: 17.864116269261178 - type: nauc_precision_at_20_max value: 27.6641030785838 - type: nauc_precision_at_20_std value: -7.076744708977724 - type: nauc_precision_at_3_diff1 value: 33.5854284842399 - type: nauc_precision_at_3_max value: 29.14301466077523 - type: nauc_precision_at_3_std value: -13.269490261877111 - type: nauc_precision_at_5_diff1 value: 29.98097033677175 - type: nauc_precision_at_5_max value: 29.294311210263995 - type: nauc_precision_at_5_std value: -10.994820836992847 - type: nauc_recall_at_1000_diff1 value: 23.22014562996405 - type: nauc_recall_at_1000_max value: 27.193319559932988 - type: nauc_recall_at_1000_std value: 12.472685466473857 - type: nauc_recall_at_100_diff1 value: 25.23024173971804 - type: nauc_recall_at_100_max 
value: 25.082403028027738 - type: nauc_recall_at_100_std value: -0.052423861070247414 - type: nauc_recall_at_10_diff1 value: 33.12106610160164 - type: nauc_recall_at_10_max value: 24.918229663001544 - type: nauc_recall_at_10_std value: -8.549535177480411 - type: nauc_recall_at_1_diff1 value: 47.30172597053699 - type: nauc_recall_at_1_max value: 29.662552695406873 - type: nauc_recall_at_1_std value: -11.737219447429663 - type: nauc_recall_at_20_diff1 value: 28.81435708597515 - type: nauc_recall_at_20_max value: 25.47943694144538 - type: nauc_recall_at_20_std value: -5.307500208427278 - type: nauc_recall_at_3_diff1 value: 36.830405146866575 - type: nauc_recall_at_3_max value: 26.435300017685588 - type: nauc_recall_at_3_std value: -12.224084159115286 - type: nauc_recall_at_5_diff1 value: 36.17592797525086 - type: nauc_recall_at_5_max value: 26.135745335293564 - type: nauc_recall_at_5_std value: -10.854448931576895 - type: ndcg_at_1 value: 23.748 - type: ndcg_at_10 value: 28.679 - type: ndcg_at_100 value: 33.849000000000004 - type: ndcg_at_1000 value: 36.903999999999996 - type: ndcg_at_20 value: 30.389 - type: ndcg_at_3 value: 25.602999999999998 - type: ndcg_at_5 value: 26.66 - type: precision_at_1 value: 23.748 - type: precision_at_10 value: 5.479 - type: precision_at_100 value: 1.0070000000000001 - type: precision_at_1000 value: 0.156 - type: precision_at_20 value: 3.3689999999999998 - type: precision_at_3 value: 12.303 - type: precision_at_5 value: 8.784 - type: recall_at_1 value: 18.546000000000003 - type: recall_at_10 value: 36.062 - type: recall_at_100 value: 59.622 - type: recall_at_1000 value: 80.49199999999999 - type: recall_at_20 value: 42.459 - type: recall_at_3 value: 26.346000000000004 - type: recall_at_5 value: 29.685 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackEnglishRetrieval (default) revision: ad9991cb51e31e31e430383c75ffb2885547b5f0 split: test type: mteb/cqadupstack-english metrics: - type: main_score value: 24.201 - type: map_at_1 value: 15.659 - type: map_at_10 value: 20.72 - type: map_at_100 value: 21.494 - type: map_at_1000 value: 21.61 - type: map_at_20 value: 21.118000000000002 - type: map_at_3 value: 19.112000000000002 - type: map_at_5 value: 20.018 - type: mrr_at_1 value: 20.191082802547772 - type: mrr_at_10 value: 25.214639571327467 - type: mrr_at_100 value: 25.923135895788356 - type: mrr_at_1000 value: 25.99481688491863 - type: mrr_at_20 value: 25.587003181612815 - type: mrr_at_3 value: 23.736730360934178 - type: mrr_at_5 value: 24.590233545647543 - type: nauc_map_at_1000_diff1 value: 43.16887932091616 - type: nauc_map_at_1000_max value: 13.001793350069521 - type: nauc_map_at_1000_std value: -3.240745072009945 - type: nauc_map_at_100_diff1 value: 43.186513856436335 - type: nauc_map_at_100_max value: 12.974985819420635 - type: nauc_map_at_100_std value: -3.2702208916272513 - type: nauc_map_at_10_diff1 value: 43.564640578903344 - type: nauc_map_at_10_max value: 13.229537802390597 - type: nauc_map_at_10_std value: -3.7960991209188033 - type: nauc_map_at_1_diff1 value: 49.188047470455324 - type: nauc_map_at_1_max value: 12.622228914711336 - type: nauc_map_at_1_std value: -5.079814609778495 - type: nauc_map_at_20_diff1 value: 43.34504671504679 - type: nauc_map_at_20_max value: 13.053303288029316 - type: nauc_map_at_20_std value: -3.53357011925504 - type: nauc_map_at_3_diff1 value: 44.804892782636394 - type: nauc_map_at_3_max value: 13.58725707185815 - type: nauc_map_at_3_std value: -3.8777357887480894 - type: nauc_map_at_5_diff1 value: 
43.72391951178523 - type: nauc_map_at_5_max value: 13.568707067556259 - type: nauc_map_at_5_std value: -4.038106969015966 - type: nauc_mrr_at_1000_diff1 value: 40.667038144431636 - type: nauc_mrr_at_1000_max value: 14.384125598011202 - type: nauc_mrr_at_1000_std value: -2.444399832932607 - type: nauc_mrr_at_100_diff1 value: 40.65910143040065 - type: nauc_mrr_at_100_max value: 14.375036584618234 - type: nauc_mrr_at_100_std value: -2.4274195136508547 - type: nauc_mrr_at_10_diff1 value: 40.89131817246553 - type: nauc_mrr_at_10_max value: 14.581024560636887 - type: nauc_mrr_at_10_std value: -2.703373098942388 - type: nauc_mrr_at_1_diff1 value: 45.09051009190851 - type: nauc_mrr_at_1_max value: 15.831915244565245 - type: nauc_mrr_at_1_std value: -4.310101948715212 - type: nauc_mrr_at_20_diff1 value: 40.78860474631307 - type: nauc_mrr_at_20_max value: 14.4782017138514 - type: nauc_mrr_at_20_std value: -2.5161572751678998 - type: nauc_mrr_at_3_diff1 value: 41.68191255304641 - type: nauc_mrr_at_3_max value: 15.041970652494102 - type: nauc_mrr_at_3_std value: -2.865017831776156 - type: nauc_mrr_at_5_diff1 value: 40.93732895812152 - type: nauc_mrr_at_5_max value: 14.810999495708327 - type: nauc_mrr_at_5_std value: -2.922166723623921 - type: nauc_ndcg_at_1000_diff1 value: 39.4110066143245 - type: nauc_ndcg_at_1000_max value: 12.821827433441005 - type: nauc_ndcg_at_1000_std value: -0.8108384214632934 - type: nauc_ndcg_at_100_diff1 value: 39.62118270064326 - type: nauc_ndcg_at_100_max value: 12.037720650973109 - type: nauc_ndcg_at_100_std value: -0.9362771831617082 - type: nauc_ndcg_at_10_diff1 value: 40.95447674096302 - type: nauc_ndcg_at_10_max value: 13.154418607273124 - type: nauc_ndcg_at_10_std value: -2.8988540864843886 - type: nauc_ndcg_at_1_diff1 value: 45.09051009190851 - type: nauc_ndcg_at_1_max value: 15.831915244565245 - type: nauc_ndcg_at_1_std value: -4.310101948715212 - type: nauc_ndcg_at_20_diff1 value: 40.63851149738437 - type: nauc_ndcg_at_20_max value: 12.604171957141656 - type: nauc_ndcg_at_20_std value: -2.1910058415334763 - type: nauc_ndcg_at_3_diff1 value: 42.10101502571804 - type: nauc_ndcg_at_3_max value: 14.519710397645364 - type: nauc_ndcg_at_3_std value: -3.1565026643410667 - type: nauc_ndcg_at_5_diff1 value: 40.94273285512494 - type: nauc_ndcg_at_5_max value: 14.054440556480834 - type: nauc_ndcg_at_5_std value: -3.442189925092899 - type: nauc_precision_at_1000_diff1 value: -0.9565223011446182 - type: nauc_precision_at_1000_max value: 11.675006301584128 - type: nauc_precision_at_1000_std value: 8.093690013766537 - type: nauc_precision_at_100_diff1 value: 11.288302809626888 - type: nauc_precision_at_100_max value: 10.960387422561148 - type: nauc_precision_at_100_std value: 8.591223668593777 - type: nauc_precision_at_10_diff1 value: 25.64615042863472 - type: nauc_precision_at_10_max value: 14.069756217267985 - type: nauc_precision_at_10_std value: 0.08978592105584715 - type: nauc_precision_at_1_diff1 value: 45.09051009190851 - type: nauc_precision_at_1_max value: 15.831915244565245 - type: nauc_precision_at_1_std value: -4.310101948715212 - type: nauc_precision_at_20_diff1 value: 22.097468653407866 - type: nauc_precision_at_20_max value: 12.949212539250343 - type: nauc_precision_at_20_std value: 2.868048305908803 - type: nauc_precision_at_3_diff1 value: 33.24608090774321 - type: nauc_precision_at_3_max value: 16.588047560522053 - type: nauc_precision_at_3_std value: -1.2432725324047462 - type: nauc_precision_at_5_diff1 value: 28.89668943912206 - type: nauc_precision_at_5_max 
value: 16.25456580555215 - type: nauc_precision_at_5_std value: -2.0273998006444134 - type: nauc_recall_at_1000_diff1 value: 24.86548627119768 - type: nauc_recall_at_1000_max value: 10.68002967962002 - type: nauc_recall_at_1000_std value: 8.076769436730153 - type: nauc_recall_at_100_diff1 value: 28.204939299147387 - type: nauc_recall_at_100_max value: 6.159717806964745 - type: nauc_recall_at_100_std value: 6.145682430435217 - type: nauc_recall_at_10_diff1 value: 35.339197660807436 - type: nauc_recall_at_10_max value: 10.955842694171421 - type: nauc_recall_at_10_std value: -2.050234322464136 - type: nauc_recall_at_1_diff1 value: 49.188047470455324 - type: nauc_recall_at_1_max value: 12.622228914711336 - type: nauc_recall_at_1_std value: -5.079814609778495 - type: nauc_recall_at_20_diff1 value: 33.66153319489103 - type: nauc_recall_at_20_max value: 9.045136466332934 - type: nauc_recall_at_20_std value: 0.6362560055945043 - type: nauc_recall_at_3_diff1 value: 39.33078959934067 - type: nauc_recall_at_3_max value: 12.943838756532871 - type: nauc_recall_at_3_std value: -2.617759316161476 - type: nauc_recall_at_5_diff1 value: 36.121619339589245 - type: nauc_recall_at_5_max value: 12.417874949270544 - type: nauc_recall_at_5_std value: -3.091748807456823 - type: ndcg_at_1 value: 20.191 - type: ndcg_at_10 value: 24.201 - type: ndcg_at_100 value: 27.955999999999996 - type: ndcg_at_1000 value: 30.773 - type: ndcg_at_20 value: 25.44 - type: ndcg_at_3 value: 21.806 - type: ndcg_at_5 value: 22.905 - type: precision_at_1 value: 20.191 - type: precision_at_10 value: 4.573 - type: precision_at_100 value: 0.8059999999999999 - type: precision_at_1000 value: 0.13 - type: precision_at_20 value: 2.7449999999999997 - type: precision_at_3 value: 10.679 - type: precision_at_5 value: 7.580000000000001 - type: recall_at_1 value: 15.659 - type: recall_at_10 value: 29.968 - type: recall_at_100 value: 46.98 - type: recall_at_1000 value: 66.286 - type: recall_at_20 value: 34.621 - type: recall_at_3 value: 22.572 - type: recall_at_5 value: 25.787 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackGamingRetrieval (default) revision: 4885aa143210c98657558c04aaf3dc47cfb54340 split: test type: mteb/cqadupstack-gaming metrics: - type: main_score value: 34.102 - type: map_at_1 value: 22.269 - type: map_at_10 value: 29.754 - type: map_at_100 value: 30.692999999999998 - type: map_at_1000 value: 30.786 - type: map_at_20 value: 30.225 - type: map_at_3 value: 27.392 - type: map_at_5 value: 28.831 - type: mrr_at_1 value: 25.956112852664575 - type: mrr_at_10 value: 32.77869831318104 - type: mrr_at_100 value: 33.60378795088834 - type: mrr_at_1000 value: 33.66340064366992 - type: mrr_at_20 value: 33.18375173610909 - type: mrr_at_3 value: 30.647857889237173 - type: mrr_at_5 value: 31.980146290491067 - type: nauc_map_at_1000_diff1 value: 42.023422411516016 - type: nauc_map_at_1000_max value: 24.046890902960552 - type: nauc_map_at_1000_std value: -6.94632372002679 - type: nauc_map_at_100_diff1 value: 42.00488415137851 - type: nauc_map_at_100_max value: 24.029258386148577 - type: nauc_map_at_100_std value: -7.013947866427552 - type: nauc_map_at_10_diff1 value: 42.060086712211344 - type: nauc_map_at_10_max value: 23.998218675756625 - type: nauc_map_at_10_std value: -7.599227449673994 - type: nauc_map_at_1_diff1 value: 45.27837491202271 - type: nauc_map_at_1_max value: 23.873436707472766 - type: nauc_map_at_1_std value: -10.458746042802577 - type: nauc_map_at_20_diff1 value: 41.98597500237269 - type: nauc_map_at_20_max 
value: 24.07819180945319 - type: nauc_map_at_20_std value: -7.320963413971682 - type: nauc_map_at_3_diff1 value: 42.69809960018882 - type: nauc_map_at_3_max value: 23.63846349891855 - type: nauc_map_at_3_std value: -8.732892056046317 - type: nauc_map_at_5_diff1 value: 42.23446934702989 - type: nauc_map_at_5_max value: 23.905384542219803 - type: nauc_map_at_5_std value: -7.643670989026166 - type: nauc_mrr_at_1000_diff1 value: 42.122071790378016 - type: nauc_mrr_at_1000_max value: 25.86760736591077 - type: nauc_mrr_at_1000_std value: -5.266317827181621 - type: nauc_mrr_at_100_diff1 value: 42.10647973553166 - type: nauc_mrr_at_100_max value: 25.85687545921025 - type: nauc_mrr_at_100_std value: -5.270766368901785 - type: nauc_mrr_at_10_diff1 value: 42.24735092990674 - type: nauc_mrr_at_10_max value: 25.994930434678004 - type: nauc_mrr_at_10_std value: -5.6601281070075355 - type: nauc_mrr_at_1_diff1 value: 46.582933896071864 - type: nauc_mrr_at_1_max value: 27.228911381467753 - type: nauc_mrr_at_1_std value: -8.734962232415343 - type: nauc_mrr_at_20_diff1 value: 42.07873815943869 - type: nauc_mrr_at_20_max value: 25.963756082386645 - type: nauc_mrr_at_20_std value: -5.478617831866867 - type: nauc_mrr_at_3_diff1 value: 42.98246412395152 - type: nauc_mrr_at_3_max value: 26.158635453239686 - type: nauc_mrr_at_3_std value: -6.3931010500997125 - type: nauc_mrr_at_5_diff1 value: 42.43712298159192 - type: nauc_mrr_at_5_max value: 26.20143695371023 - type: nauc_mrr_at_5_std value: -5.622650253873388 - type: nauc_ndcg_at_1000_diff1 value: 40.40682446150754 - type: nauc_ndcg_at_1000_max value: 23.975034312446894 - type: nauc_ndcg_at_1000_std value: -2.645144894917121 - type: nauc_ndcg_at_100_diff1 value: 39.96263062735843 - type: nauc_ndcg_at_100_max value: 23.583706441511858 - type: nauc_ndcg_at_100_std value: -3.3869912444384114 - type: nauc_ndcg_at_10_diff1 value: 40.39533814272208 - type: nauc_ndcg_at_10_max value: 24.293062837455782 - type: nauc_ndcg_at_10_std value: -6.100075124875855 - type: nauc_ndcg_at_1_diff1 value: 46.582933896071864 - type: nauc_ndcg_at_1_max value: 27.228911381467753 - type: nauc_ndcg_at_1_std value: -8.734962232415343 - type: nauc_ndcg_at_20_diff1 value: 39.9687058773172 - type: nauc_ndcg_at_20_max value: 24.316546572139725 - type: nauc_ndcg_at_20_std value: -5.284472590592323 - type: nauc_ndcg_at_3_diff1 value: 41.76544027471963 - type: nauc_ndcg_at_3_max value: 24.275838336051923 - type: nauc_ndcg_at_3_std value: -7.5019513901932715 - type: nauc_ndcg_at_5_diff1 value: 40.90262427804706 - type: nauc_ndcg_at_5_max value: 24.491396294279173 - type: nauc_ndcg_at_5_std value: -6.148208697652546 - type: nauc_precision_at_1000_diff1 value: 8.310979675445102 - type: nauc_precision_at_1000_max value: 10.177503506631384 - type: nauc_precision_at_1000_std value: 27.06496193087599 - type: nauc_precision_at_100_diff1 value: 19.055469058991463 - type: nauc_precision_at_100_max value: 15.143082019798745 - type: nauc_precision_at_100_std value: 17.5613526737176 - type: nauc_precision_at_10_diff1 value: 30.60558520635145 - type: nauc_precision_at_10_max value: 23.899102367494276 - type: nauc_precision_at_10_std value: 1.49034477139435 - type: nauc_precision_at_1_diff1 value: 46.582933896071864 - type: nauc_precision_at_1_max value: 27.228911381467753 - type: nauc_precision_at_1_std value: -8.734962232415343 - type: nauc_precision_at_20_diff1 value: 27.34257473822076 - type: nauc_precision_at_20_max value: 23.166488954967583 - type: nauc_precision_at_20_std value: 5.306163418928192 - type: 
*(Remainder of the machine-readable `model-index` evaluation metadata; the entry above closes with nDCG@10 34.102 and Recall@100 66.065.)*

Summary of the recorded per-task retrieval metrics (each `model-index` entry additionally stores the dataset revision hash and per-cutoff nAUC diagnostics; `main_score` is nDCG@10, all on the test split):

| Dataset (MTEB) | nDCG@10 | MAP@10 | MRR@10 | Recall@10 | Recall@100 |
|:--|:--:|:--:|:--:|:--:|:--:|
| CQADupstackGisRetrieval | 15.505 | 13.215 | 14.041 | 21.706 | 41.837 |
| CQADupstackMathematicaRetrieval | 10.506 | 8.123 | 9.936 | 16.204 | 34.223 |
| CQADupstackPhysicsRetrieval | 22.943 | 19.377 | 23.371 | 29.834 | 54.201 |
| CQADupstackProgrammersRetrieval | 16.135 | 13.055 | 16.149 | 22.554 | 42.531 |
| CQADupstackRetrieval (combined dataset) | 19.090 | — | — | — | — |
| CQADupstackStatsRetrieval | 14.333 | 11.930 | 13.700 | 20.152 | 38.274 |
| CQADupstackTexRetrieval | 10.942 | 8.927 | 10.925 | 15.112 | 31.176 |
| CQADupstackUnixRetrieval | 17.887 | 14.939 | 17.731 | 24.332 | 41.046 |
| CQADupstackWebmastersRetrieval | 20.995 | 17.408 | 20.940 | 27.335 | 47.505 |
| CQADupstackWordpressRetrieval | 12.855 | 10.112 | 11.146 | 20.434 | 40.579 |
| ClimateFEVER | 12.684 | 8.362 | 17.425 | — | — |

*(The ClimateFEVER entry is truncated at this point in the source; its remaining metrics are not part of this excerpt.)*
nauc_recall_at_1000_max value: 1.5532125941851942 - type: nauc_recall_at_1000_std value: 48.0359073551386 - type: nauc_recall_at_100_diff1 value: 11.782399018197935 - type: nauc_recall_at_100_max value: 2.2870655024097513 - type: nauc_recall_at_100_std value: 37.97352959084523 - type: nauc_recall_at_10_diff1 value: 14.345879239147546 - type: nauc_recall_at_10_max value: 2.0087919399778515 - type: nauc_recall_at_10_std value: 24.59372608521495 - type: nauc_recall_at_1_diff1 value: 35.758137361986876 - type: nauc_recall_at_1_max value: 6.168314369703521 - type: nauc_recall_at_1_std value: 16.65271803089269 - type: nauc_recall_at_20_diff1 value: 14.6032045058713 - type: nauc_recall_at_20_max value: 2.192258051272998 - type: nauc_recall_at_20_std value: 30.200979930961648 - type: nauc_recall_at_3_diff1 value: 21.450459178725765 - type: nauc_recall_at_3_max value: 2.6687225558746217 - type: nauc_recall_at_3_std value: 15.62001953924645 - type: nauc_recall_at_5_diff1 value: 17.872642384652647 - type: nauc_recall_at_5_max value: 1.7062840921304248 - type: nauc_recall_at_5_std value: 17.238197751224522 - type: ndcg_at_1 value: 10.879 - type: ndcg_at_10 value: 12.684000000000001 - type: ndcg_at_100 value: 17.636 - type: ndcg_at_1000 value: 20.931 - type: ndcg_at_20 value: 14.557999999999998 - type: ndcg_at_3 value: 9.666 - type: ndcg_at_5 value: 10.592 - type: precision_at_1 value: 10.879 - type: precision_at_10 value: 4.215 - type: precision_at_100 value: 0.935 - type: precision_at_1000 value: 0.154 - type: precision_at_20 value: 2.8930000000000002 - type: precision_at_3 value: 7.166 - type: precision_at_5 value: 5.694 - type: recall_at_1 value: 4.893 - type: recall_at_10 value: 16.148 - type: recall_at_100 value: 33.826 - type: recall_at_1000 value: 52.91400000000001 - type: recall_at_20 value: 21.568 - type: recall_at_3 value: 8.984 - type: recall_at_5 value: 11.417 task: type: Retrieval - dataset: config: default name: MTEB DBPedia (default) revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 split: test type: mteb/dbpedia metrics: - type: main_score value: 18.714 - type: map_at_1 value: 3.6290000000000004 - type: map_at_10 value: 7.344 - type: map_at_100 value: 10.174999999999999 - type: map_at_1000 value: 10.89 - type: map_at_20 value: 8.439 - type: map_at_3 value: 5.609999999999999 - type: map_at_5 value: 6.337 - type: mrr_at_1 value: 37.0 - type: mrr_at_10 value: 46.09295634920637 - type: mrr_at_100 value: 46.88963947930081 - type: mrr_at_1000 value: 46.921566120401955 - type: mrr_at_20 value: 46.52364089084293 - type: mrr_at_3 value: 43.33333333333334 - type: mrr_at_5 value: 44.90833333333335 - type: nauc_map_at_1000_diff1 value: 15.332578307626383 - type: nauc_map_at_1000_max value: 19.591409700798067 - type: nauc_map_at_1000_std value: 26.787357729943086 - type: nauc_map_at_100_diff1 value: 15.241772873921782 - type: nauc_map_at_100_max value: 18.342574948282497 - type: nauc_map_at_100_std value: 23.631531457963924 - type: nauc_map_at_10_diff1 value: 17.295256116074693 - type: nauc_map_at_10_max value: 10.62161320889349 - type: nauc_map_at_10_std value: 9.528015695519017 - type: nauc_map_at_1_diff1 value: 16.446542483531125 - type: nauc_map_at_1_max value: 4.979934347581338 - type: nauc_map_at_1_std value: 0.8028896220717383 - type: nauc_map_at_20_diff1 value: 16.81602502338933 - type: nauc_map_at_20_max value: 13.113289648729024 - type: nauc_map_at_20_std value: 14.351215296062362 - type: nauc_map_at_3_diff1 value: 14.907096937119139 - type: nauc_map_at_3_max value: 7.35444839341772 - 
type: nauc_map_at_3_std value: 3.56181101379306 - type: nauc_map_at_5_diff1 value: 17.310165177414458 - type: nauc_map_at_5_max value: 9.029713690770615 - type: nauc_map_at_5_std value: 5.483712783452527 - type: nauc_mrr_at_1000_diff1 value: 21.637726685501068 - type: nauc_mrr_at_1000_max value: 30.207538155542647 - type: nauc_mrr_at_1000_std value: 23.29384324216765 - type: nauc_mrr_at_100_diff1 value: 21.635718406960365 - type: nauc_mrr_at_100_max value: 30.21626999781084 - type: nauc_mrr_at_100_std value: 23.315552275404077 - type: nauc_mrr_at_10_diff1 value: 21.63149126393632 - type: nauc_mrr_at_10_max value: 30.19460995864985 - type: nauc_mrr_at_10_std value: 23.162647549161143 - type: nauc_mrr_at_1_diff1 value: 23.364434113790995 - type: nauc_mrr_at_1_max value: 29.16236827328641 - type: nauc_mrr_at_1_std value: 20.444573577612672 - type: nauc_mrr_at_20_diff1 value: 21.500850583557057 - type: nauc_mrr_at_20_max value: 30.20831775659985 - type: nauc_mrr_at_20_std value: 23.200255998287243 - type: nauc_mrr_at_3_diff1 value: 21.12636914240847 - type: nauc_mrr_at_3_max value: 28.8554344421751 - type: nauc_mrr_at_3_std value: 22.971981931510907 - type: nauc_mrr_at_5_diff1 value: 21.25759448565056 - type: nauc_mrr_at_5_max value: 29.949582847543653 - type: nauc_mrr_at_5_std value: 22.60218450418408 - type: nauc_ndcg_at_1000_diff1 value: 18.808237293933672 - type: nauc_ndcg_at_1000_max value: 21.383496457619863 - type: nauc_ndcg_at_1000_std value: 41.576194502603904 - type: nauc_ndcg_at_100_diff1 value: 17.221887092074635 - type: nauc_ndcg_at_100_max value: 17.701739166467814 - type: nauc_ndcg_at_100_std value: 32.68960425363178 - type: nauc_ndcg_at_10_diff1 value: 18.532709672848732 - type: nauc_ndcg_at_10_max value: 17.09971249017414 - type: nauc_ndcg_at_10_std value: 24.640964891301568 - type: nauc_ndcg_at_1_diff1 value: 20.909544791732714 - type: nauc_ndcg_at_1_max value: 19.966081278133522 - type: nauc_ndcg_at_1_std value: 16.467816838901918 - type: nauc_ndcg_at_20_diff1 value: 17.17581137257012 - type: nauc_ndcg_at_20_max value: 15.286085887063514 - type: nauc_ndcg_at_20_std value: 24.382832522939328 - type: nauc_ndcg_at_3_diff1 value: 16.33752617073797 - type: nauc_ndcg_at_3_max value: 17.80070987939365 - type: nauc_ndcg_at_3_std value: 21.901487508713668 - type: nauc_ndcg_at_5_diff1 value: 17.66213503429926 - type: nauc_ndcg_at_5_max value: 18.315036078788523 - type: nauc_ndcg_at_5_std value: 22.196869148981882 - type: nauc_precision_at_1000_diff1 value: 3.153755654841115 - type: nauc_precision_at_1000_max value: 23.826422759712194 - type: nauc_precision_at_1000_std value: 38.32310024626058 - type: nauc_precision_at_100_diff1 value: 5.254703196587399 - type: nauc_precision_at_100_max value: 31.23694387267914 - type: nauc_precision_at_100_std value: 46.615222544239785 - type: nauc_precision_at_10_diff1 value: 9.171988505302384 - type: nauc_precision_at_10_max value: 26.89906129794692 - type: nauc_precision_at_10_std value: 36.25236215404761 - type: nauc_precision_at_1_diff1 value: 23.364434113790995 - type: nauc_precision_at_1_max value: 29.16236827328641 - type: nauc_precision_at_1_std value: 20.444573577612672 - type: nauc_precision_at_20_diff1 value: 6.816222235055836 - type: nauc_precision_at_20_max value: 28.05552431582458 - type: nauc_precision_at_20_std value: 39.041946684417596 - type: nauc_precision_at_3_diff1 value: 12.440898759477614 - type: nauc_precision_at_3_max value: 25.53095697663368 - type: nauc_precision_at_3_std value: 26.29306114437138 - type: 
nauc_precision_at_5_diff1 value: 12.961933144163579 - type: nauc_precision_at_5_max value: 28.8551662840494 - type: nauc_precision_at_5_std value: 28.98920116163561 - type: nauc_recall_at_1000_diff1 value: 10.46665439274001 - type: nauc_recall_at_1000_max value: 9.12732640867415 - type: nauc_recall_at_1000_std value: 42.420396816639986 - type: nauc_recall_at_100_diff1 value: 7.630795440733252 - type: nauc_recall_at_100_max value: 9.497703777492731 - type: nauc_recall_at_100_std value: 30.3239668986987 - type: nauc_recall_at_10_diff1 value: 15.472483341738865 - type: nauc_recall_at_10_max value: 3.6641891638054798 - type: nauc_recall_at_10_std value: 4.57953087809313 - type: nauc_recall_at_1_diff1 value: 16.446542483531125 - type: nauc_recall_at_1_max value: 4.979934347581338 - type: nauc_recall_at_1_std value: 0.8028896220717383 - type: nauc_recall_at_20_diff1 value: 9.043285621876421 - type: nauc_recall_at_20_max value: 2.799814278881547 - type: nauc_recall_at_20_std value: 9.488589268742839 - type: nauc_recall_at_3_diff1 value: 11.070041224495936 - type: nauc_recall_at_3_max value: 3.058997523275269 - type: nauc_recall_at_3_std value: -0.31088660397764756 - type: nauc_recall_at_5_diff1 value: 15.280147039490439 - type: nauc_recall_at_5_max value: 3.8735984736389604 - type: nauc_recall_at_5_std value: -0.03652249815937461 - type: ndcg_at_1 value: 26.0 - type: ndcg_at_10 value: 18.714 - type: ndcg_at_100 value: 21.972 - type: ndcg_at_1000 value: 27.908 - type: ndcg_at_20 value: 18.666 - type: ndcg_at_3 value: 21.593 - type: ndcg_at_5 value: 19.89 - type: precision_at_1 value: 37.0 - type: precision_at_10 value: 16.175 - type: precision_at_100 value: 5.405 - type: precision_at_1000 value: 1.1119999999999999 - type: precision_at_20 value: 12.45 - type: precision_at_3 value: 26.25 - type: precision_at_5 value: 21.3 - type: recall_at_1 value: 3.6290000000000004 - type: recall_at_10 value: 11.074 - type: recall_at_100 value: 27.508 - type: recall_at_1000 value: 48.478 - type: recall_at_20 value: 15.765 - type: recall_at_3 value: 6.679 - type: recall_at_5 value: 8.272 task: type: Retrieval - dataset: config: default name: MTEB EmotionClassification (default) revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 split: test type: mteb/emotion metrics: - type: accuracy value: 37.085 - type: f1 value: 33.85927583699898 - type: f1_weighted value: 39.200474117393966 - type: main_score value: 37.085 task: type: Classification - dataset: config: default name: MTEB FEVER (default) revision: bea83ef9e8fb933d90a2f1d5515737465d613e12 split: test type: mteb/fever metrics: - type: main_score value: 22.016 - type: map_at_1 value: 12.193 - type: map_at_10 value: 18.082 - type: map_at_100 value: 19.041 - type: map_at_1000 value: 19.127 - type: map_at_20 value: 18.614 - type: map_at_3 value: 15.791 - type: map_at_5 value: 17.074 - type: mrr_at_1 value: 12.946294629462946 - type: mrr_at_10 value: 19.172619642916665 - type: mrr_at_100 value: 20.154909631396883 - type: mrr_at_1000 value: 20.23555740317628 - type: mrr_at_20 value: 19.71354143370259 - type: mrr_at_3 value: 16.76167616761678 - type: mrr_at_5 value: 18.12756275627569 - type: nauc_map_at_1000_diff1 value: 20.290997144547806 - type: nauc_map_at_1000_max value: 11.450991275708125 - type: nauc_map_at_1000_std value: -10.04517962568564 - type: nauc_map_at_100_diff1 value: 20.286419962395446 - type: nauc_map_at_100_max value: 11.425096874032468 - type: nauc_map_at_100_std value: -10.065217561013961 - type: nauc_map_at_10_diff1 value: 20.352678660604802 - type: 
nauc_map_at_10_max value: 11.01767996890229 - type: nauc_map_at_10_std value: -10.707087936088575 - type: nauc_map_at_1_diff1 value: 25.032419107186094 - type: nauc_map_at_1_max value: 12.369813614872736 - type: nauc_map_at_1_std value: -14.118939916139569 - type: nauc_map_at_20_diff1 value: 20.389612922681682 - type: nauc_map_at_20_max value: 11.353929159428661 - type: nauc_map_at_20_std value: -10.297859728424513 - type: nauc_map_at_3_diff1 value: 21.10704599787224 - type: nauc_map_at_3_max value: 10.930500499862571 - type: nauc_map_at_3_std value: -12.150965535677678 - type: nauc_map_at_5_diff1 value: 20.842777278284128 - type: nauc_map_at_5_max value: 10.827383306142737 - type: nauc_map_at_5_std value: -11.221709408333618 - type: nauc_mrr_at_1000_diff1 value: 20.318256054389476 - type: nauc_mrr_at_1000_max value: 11.796117937558172 - type: nauc_mrr_at_1000_std value: -10.287039413450211 - type: nauc_mrr_at_100_diff1 value: 20.30841620615174 - type: nauc_mrr_at_100_max value: 11.779189553888532 - type: nauc_mrr_at_100_std value: -10.294866807046127 - type: nauc_mrr_at_10_diff1 value: 20.374243995449877 - type: nauc_mrr_at_10_max value: 11.378404399185833 - type: nauc_mrr_at_10_std value: -10.875685274480453 - type: nauc_mrr_at_1_diff1 value: 25.100637371748824 - type: nauc_mrr_at_1_max value: 12.75349173425225 - type: nauc_mrr_at_1_std value: -14.395108761279237 - type: nauc_mrr_at_20_diff1 value: 20.39503308580974 - type: nauc_mrr_at_20_max value: 11.68589575755117 - type: nauc_mrr_at_20_std value: -10.492915215640092 - type: nauc_mrr_at_3_diff1 value: 21.15981004354575 - type: nauc_mrr_at_3_max value: 11.28231678901033 - type: nauc_mrr_at_3_std value: -12.354174511822121 - type: nauc_mrr_at_5_diff1 value: 20.799863945954275 - type: nauc_mrr_at_5_max value: 11.185632335820825 - type: nauc_mrr_at_5_std value: -11.469723683281297 - type: nauc_ndcg_at_1000_diff1 value: 18.464587317922547 - type: nauc_ndcg_at_1000_max value: 13.008062904816914 - type: nauc_ndcg_at_1000_std value: -5.664914582345968 - type: nauc_ndcg_at_100_diff1 value: 18.16644191513211 - type: nauc_ndcg_at_100_max value: 12.562444143891966 - type: nauc_ndcg_at_100_std value: -6.1441260439999 - type: nauc_ndcg_at_10_diff1 value: 18.686352401538496 - type: nauc_ndcg_at_10_max value: 10.869744096886084 - type: nauc_ndcg_at_10_std value: -8.944207877220036 - type: nauc_ndcg_at_1_diff1 value: 25.100637371748824 - type: nauc_ndcg_at_1_max value: 12.75349173425225 - type: nauc_ndcg_at_1_std value: -14.395108761279237 - type: nauc_ndcg_at_20_diff1 value: 18.771980400862198 - type: nauc_ndcg_at_20_max value: 11.905846688294329 - type: nauc_ndcg_at_20_std value: -7.692989490709515 - type: nauc_ndcg_at_3_diff1 value: 20.08654674967674 - type: nauc_ndcg_at_3_max value: 10.663033509421721 - type: nauc_ndcg_at_3_std value: -11.574039012307594 - type: nauc_ndcg_at_5_diff1 value: 19.6605128392337 - type: nauc_ndcg_at_5_max value: 10.508598217516415 - type: nauc_ndcg_at_5_std value: -10.065510128768713 - type: nauc_precision_at_1000_diff1 value: 7.843686893129402 - type: nauc_precision_at_1000_max value: 21.12867481889994 - type: nauc_precision_at_1000_std value: 17.397771341896146 - type: nauc_precision_at_100_diff1 value: 10.964367718664041 - type: nauc_precision_at_100_max value: 18.134742533867346 - type: nauc_precision_at_100_std value: 7.826000941250076 - type: nauc_precision_at_10_diff1 value: 15.105380802537063 - type: nauc_precision_at_10_max value: 11.285261334237703 - type: nauc_precision_at_10_std value: -4.37944714089422 - 
type: nauc_precision_at_1_diff1 value: 25.100637371748824 - type: nauc_precision_at_1_max value: 12.75349173425225 - type: nauc_precision_at_1_std value: -14.395108761279237 - type: nauc_precision_at_20_diff1 value: 15.077505620030765 - type: nauc_precision_at_20_max value: 14.539549230107863 - type: nauc_precision_at_20_std value: -0.3542706803956202 - type: nauc_precision_at_3_diff1 value: 17.885365023585084 - type: nauc_precision_at_3_max value: 10.292960240507334 - type: nauc_precision_at_3_std value: -10.022232347175288 - type: nauc_precision_at_5_diff1 value: 17.139957329877934 - type: nauc_precision_at_5_max value: 10.26986709887834 - type: nauc_precision_at_5_std value: -7.222300752002702 - type: nauc_recall_at_1000_diff1 value: 10.939852794630156 - type: nauc_recall_at_1000_max value: 20.445200176227928 - type: nauc_recall_at_1000_std value: 17.423637451714775 - type: nauc_recall_at_100_diff1 value: 11.453503005311378 - type: nauc_recall_at_100_max value: 15.652758603853172 - type: nauc_recall_at_100_std value: 6.527801869334319 - type: nauc_recall_at_10_diff1 value: 14.432828795666774 - type: nauc_recall_at_10_max value: 9.917611920139953 - type: nauc_recall_at_10_std value: -4.640402932242214 - type: nauc_recall_at_1_diff1 value: 25.032419107186094 - type: nauc_recall_at_1_max value: 12.369813614872736 - type: nauc_recall_at_1_std value: -14.118939916139569 - type: nauc_recall_at_20_diff1 value: 14.649940175342705 - type: nauc_recall_at_20_max value: 12.839139966470082 - type: nauc_recall_at_20_std value: -1.1007068094900396 - type: nauc_recall_at_3_diff1 value: 17.369984220537575 - type: nauc_recall_at_3_max value: 9.706157288728694 - type: nauc_recall_at_3_std value: -9.933996418659476 - type: nauc_recall_at_5_diff1 value: 16.73461655268465 - type: nauc_recall_at_5_max value: 9.307482112802237 - type: nauc_recall_at_5_std value: -7.03240216549824 - type: ndcg_at_1 value: 12.946 - type: ndcg_at_10 value: 22.016 - type: ndcg_at_100 value: 27.1 - type: ndcg_at_1000 value: 29.608 - type: ndcg_at_20 value: 23.949 - type: ndcg_at_3 value: 17.254 - type: ndcg_at_5 value: 19.572 - type: precision_at_1 value: 12.946 - type: precision_at_10 value: 3.614 - type: precision_at_100 value: 0.632 - type: precision_at_1000 value: 0.086 - type: precision_at_20 value: 2.2190000000000003 - type: precision_at_3 value: 7.335999999999999 - type: precision_at_5 value: 5.62 - type: recall_at_1 value: 12.193 - type: recall_at_10 value: 33.477000000000004 - type: recall_at_100 value: 57.653 - type: recall_at_1000 value: 77.331 - type: recall_at_20 value: 40.967 - type: recall_at_3 value: 20.524 - type: recall_at_5 value: 26.049 task: type: Retrieval - dataset: config: default name: MTEB FiQA2018 (default) revision: 27a168819829fe9bcd655c2df245fb19452e8e06 split: test type: mteb/fiqa metrics: - type: main_score value: 10.489999999999998 - type: map_at_1 value: 4.324999999999999 - type: map_at_10 value: 7.2620000000000005 - type: map_at_100 value: 8.049000000000001 - type: map_at_1000 value: 8.219999999999999 - type: map_at_20 value: 7.61 - type: map_at_3 value: 5.973 - type: map_at_5 value: 6.691 - type: mrr_at_1 value: 8.950617283950617 - type: mrr_at_10 value: 13.708602782676858 - type: mrr_at_100 value: 14.590661251603459 - type: mrr_at_1000 value: 14.700261572617254 - type: mrr_at_20 value: 14.11123716025319 - type: mrr_at_3 value: 12.062757201646086 - type: mrr_at_5 value: 13.127572016460906 - type: nauc_map_at_1000_diff1 value: 29.868612329177928 - type: nauc_map_at_1000_max value: 
1.8204575427341532 - type: nauc_map_at_1000_std value: -4.185357333535049 - type: nauc_map_at_100_diff1 value: 29.946178213759282 - type: nauc_map_at_100_max value: 1.610360929666458 - type: nauc_map_at_100_std value: -4.324079540013444 - type: nauc_map_at_10_diff1 value: 30.399813155198824 - type: nauc_map_at_10_max value: 1.8115464824069072 - type: nauc_map_at_10_std value: -4.737607209968629 - type: nauc_map_at_1_diff1 value: 37.53493767190502 - type: nauc_map_at_1_max value: 6.343933558239079 - type: nauc_map_at_1_std value: -8.230966082922905 - type: nauc_map_at_20_diff1 value: 30.308094557427058 - type: nauc_map_at_20_max value: 1.7031539908608901 - type: nauc_map_at_20_std value: -4.596734035205173 - type: nauc_map_at_3_diff1 value: 32.8951312020134 - type: nauc_map_at_3_max value: 1.5535854126023998 - type: nauc_map_at_3_std value: -4.539910426062374 - type: nauc_map_at_5_diff1 value: 30.438220232065543 - type: nauc_map_at_5_max value: 2.0380362092746083 - type: nauc_map_at_5_std value: -4.716253038875689 - type: nauc_mrr_at_1000_diff1 value: 26.097087362103995 - type: nauc_mrr_at_1000_max value: 6.377351302196768 - type: nauc_mrr_at_1000_std value: -8.980609641309028 - type: nauc_mrr_at_100_diff1 value: 26.0420700495144 - type: nauc_mrr_at_100_max value: 6.3133809175339755 - type: nauc_mrr_at_100_std value: -9.000162649179808 - type: nauc_mrr_at_10_diff1 value: 26.535507660887507 - type: nauc_mrr_at_10_max value: 6.381465133195606 - type: nauc_mrr_at_10_std value: -9.191571489530038 - type: nauc_mrr_at_1_diff1 value: 33.21219729698373 - type: nauc_mrr_at_1_max value: 8.117452072894173 - type: nauc_mrr_at_1_std value: -12.844056505931412 - type: nauc_mrr_at_20_diff1 value: 26.119432629408944 - type: nauc_mrr_at_20_max value: 6.142130397600541 - type: nauc_mrr_at_20_std value: -8.969120848763918 - type: nauc_mrr_at_3_diff1 value: 29.213633065227913 - type: nauc_mrr_at_3_max value: 6.158454748584739 - type: nauc_mrr_at_3_std value: -9.312167992788329 - type: nauc_mrr_at_5_diff1 value: 26.853690010476384 - type: nauc_mrr_at_5_max value: 6.607630323087147 - type: nauc_mrr_at_5_std value: -9.16727089175747 - type: nauc_ndcg_at_1000_diff1 value: 24.608991804968696 - type: nauc_ndcg_at_1000_max value: 5.359080584203262 - type: nauc_ndcg_at_1000_std value: -1.4847472953357936 - type: nauc_ndcg_at_100_diff1 value: 24.648632317746273 - type: nauc_ndcg_at_100_max value: 2.1712898966851113 - type: nauc_ndcg_at_100_std value: -3.5369260708070107 - type: nauc_ndcg_at_10_diff1 value: 27.014604913486856 - type: nauc_ndcg_at_10_max value: 2.4695161721048713 - type: nauc_ndcg_at_10_std value: -5.3598766328112735 - type: nauc_ndcg_at_1_diff1 value: 33.21219729698373 - type: nauc_ndcg_at_1_max value: 8.117452072894173 - type: nauc_ndcg_at_1_std value: -12.844056505931412 - type: nauc_ndcg_at_20_diff1 value: 26.348030975637954 - type: nauc_ndcg_at_20_max value: 1.76798660214836 - type: nauc_ndcg_at_20_std value: -4.752973355036493 - type: nauc_ndcg_at_3_diff1 value: 30.08569857797367 - type: nauc_ndcg_at_3_max value: 3.8922869178252917 - type: nauc_ndcg_at_3_std value: -5.983540710713673 - type: nauc_ndcg_at_5_diff1 value: 27.00404833916418 - type: nauc_ndcg_at_5_max value: 3.5093481086647174 - type: nauc_ndcg_at_5_std value: -5.594177739447796 - type: nauc_precision_at_1000_diff1 value: 6.90213731255884 - type: nauc_precision_at_1000_max value: 22.546962761447155 - type: nauc_precision_at_1000_std value: -4.411259743880491 - type: nauc_precision_at_100_diff1 value: 14.110688584366798 - type: 
nauc_precision_at_100_max value: 10.545246972283675 - type: nauc_precision_at_100_std value: -5.013842584740609 - type: nauc_precision_at_10_diff1 value: 20.259939679291286 - type: nauc_precision_at_10_max value: 6.864599576255598 - type: nauc_precision_at_10_std value: -6.629146983652406 - type: nauc_precision_at_1_diff1 value: 33.21219729698373 - type: nauc_precision_at_1_max value: 8.117452072894173 - type: nauc_precision_at_1_std value: -12.844056505931412 - type: nauc_precision_at_20_diff1 value: 19.290649186490967 - type: nauc_precision_at_20_max value: 5.972515078212738 - type: nauc_precision_at_20_std value: -5.429565238738726 - type: nauc_precision_at_3_diff1 value: 26.615348561686524 - type: nauc_precision_at_3_max value: 4.303529688113032 - type: nauc_precision_at_3_std value: -6.3859133152717575 - type: nauc_precision_at_5_diff1 value: 20.15741104687489 - type: nauc_precision_at_5_max value: 5.829980153393318 - type: nauc_precision_at_5_std value: -7.303750048891929 - type: nauc_recall_at_1000_diff1 value: 12.433553342367036 - type: nauc_recall_at_1000_max value: 4.468200721496133 - type: nauc_recall_at_1000_std value: 14.900182633571784 - type: nauc_recall_at_100_diff1 value: 14.0062702129626 - type: nauc_recall_at_100_max value: -1.7131702012948224 - type: nauc_recall_at_100_std value: 2.2633308267962704 - type: nauc_recall_at_10_diff1 value: 21.690668515787653 - type: nauc_recall_at_10_max value: -0.6937364802491892 - type: nauc_recall_at_10_std value: -3.082925088768182 - type: nauc_recall_at_1_diff1 value: 37.53493767190502 - type: nauc_recall_at_1_max value: 6.343933558239079 - type: nauc_recall_at_1_std value: -8.230966082922905 - type: nauc_recall_at_20_diff1 value: 19.77931628522879 - type: nauc_recall_at_20_max value: -1.8891310482328967 - type: nauc_recall_at_20_std value: -2.116148089873719 - type: nauc_recall_at_3_diff1 value: 29.51744746509749 - type: nauc_recall_at_3_max value: -1.5430112189485936 - type: nauc_recall_at_3_std value: -1.655207409284257 - type: nauc_recall_at_5_diff1 value: 21.71469884887553 - type: nauc_recall_at_5_max value: 0.7546577860370985 - type: nauc_recall_at_5_std value: -1.8445545818566638 - type: ndcg_at_1 value: 8.951 - type: ndcg_at_10 value: 10.489999999999998 - type: ndcg_at_100 value: 15.051 - type: ndcg_at_1000 value: 19.479 - type: ndcg_at_20 value: 11.73 - type: ndcg_at_3 value: 8.407 - type: ndcg_at_5 value: 9.382 - type: precision_at_1 value: 8.951 - type: precision_at_10 value: 3.056 - type: precision_at_100 value: 0.761 - type: precision_at_1000 value: 0.151 - type: precision_at_20 value: 1.991 - type: precision_at_3 value: 5.813 - type: precision_at_5 value: 4.7219999999999995 - type: recall_at_1 value: 4.324999999999999 - type: recall_at_10 value: 13.963999999999999 - type: recall_at_100 value: 32.568999999999996 - type: recall_at_1000 value: 60.873999999999995 - type: recall_at_20 value: 18.044 - type: recall_at_3 value: 7.863 - type: recall_at_5 value: 10.741 task: type: Retrieval - dataset: config: default name: MTEB HotpotQA (default) revision: ab518f4d6fcca38d87c25209f94beba119d02014 split: test type: mteb/hotpotqa metrics: - type: main_score value: 28.296 - type: map_at_1 value: 16.124 - type: map_at_10 value: 22.006999999999998 - type: map_at_100 value: 22.739 - type: map_at_1000 value: 22.831000000000003 - type: map_at_20 value: 22.397 - type: map_at_3 value: 20.343 - type: map_at_5 value: 21.273 - type: mrr_at_1 value: 32.248480756245776 - type: mrr_at_10 value: 38.63598169405064 - type: mrr_at_100 value: 
39.30912106800413 - type: mrr_at_1000 value: 39.36706737124047 - type: mrr_at_20 value: 39.01889362753551 - type: mrr_at_3 value: 36.90524420436645 - type: mrr_at_5 value: 37.876884987621 - type: nauc_map_at_1000_diff1 value: 52.56275733949851 - type: nauc_map_at_1000_max value: 15.678119273683258 - type: nauc_map_at_1000_std value: 21.94442763793275 - type: nauc_map_at_100_diff1 value: 52.57779873054535 - type: nauc_map_at_100_max value: 15.675547534713088 - type: nauc_map_at_100_std value: 21.86210645684129 - type: nauc_map_at_10_diff1 value: 53.016128486745004 - type: nauc_map_at_10_max value: 15.782677582200714 - type: nauc_map_at_10_std value: 20.895601911314472 - type: nauc_map_at_1_diff1 value: 62.39324742344811 - type: nauc_map_at_1_max value: 18.922278332305293 - type: nauc_map_at_1_std value: 15.431990044458088 - type: nauc_map_at_20_diff1 value: 52.66735350527932 - type: nauc_map_at_20_max value: 15.720152193572472 - type: nauc_map_at_20_std value: 21.43058845996297 - type: nauc_map_at_3_diff1 value: 54.6666892102859 - type: nauc_map_at_3_max value: 16.731046525278487 - type: nauc_map_at_3_std value: 19.200351760472845 - type: nauc_map_at_5_diff1 value: 53.67302712440124 - type: nauc_map_at_5_max value: 16.14212699563179 - type: nauc_map_at_5_std value: 20.109580390507958 - type: nauc_mrr_at_1000_diff1 value: 57.590587384091286 - type: nauc_mrr_at_1000_max value: 16.955585029521554 - type: nauc_mrr_at_1000_std value: 18.940765599942846 - type: nauc_mrr_at_100_diff1 value: 57.57727053172551 - type: nauc_mrr_at_100_max value: 16.95237066457576 - type: nauc_mrr_at_100_std value: 18.940796857284766 - type: nauc_mrr_at_10_diff1 value: 57.71480130493494 - type: nauc_mrr_at_10_max value: 17.047197537035274 - type: nauc_mrr_at_10_std value: 18.60310516808845 - type: nauc_mrr_at_1_diff1 value: 62.39324742344811 - type: nauc_mrr_at_1_max value: 18.922278332305293 - type: nauc_mrr_at_1_std value: 15.431990044458088 - type: nauc_mrr_at_20_diff1 value: 57.59068015425055 - type: nauc_mrr_at_20_max value: 16.98394919583758 - type: nauc_mrr_at_20_std value: 18.81315221111426 - type: nauc_mrr_at_3_diff1 value: 58.67948717756185 - type: nauc_mrr_at_3_max value: 17.68777692655858 - type: nauc_mrr_at_3_std value: 17.53265364680353 - type: nauc_mrr_at_5_diff1 value: 58.139101763281666 - type: nauc_mrr_at_5_max value: 17.270925196457462 - type: nauc_mrr_at_5_std value: 18.056055685643045 - type: nauc_ndcg_at_1000_diff1 value: 50.592269072101516 - type: nauc_ndcg_at_1000_max value: 14.524760647752915 - type: nauc_ndcg_at_1000_std value: 26.838335704567463 - type: nauc_ndcg_at_100_diff1 value: 50.77465151278066 - type: nauc_ndcg_at_100_max value: 14.54429816135242 - type: nauc_ndcg_at_100_std value: 25.550144005876646 - type: nauc_ndcg_at_10_diff1 value: 52.196099719654995 - type: nauc_ndcg_at_10_max value: 15.021941288342521 - type: nauc_ndcg_at_10_std value: 22.17407528719642 - type: nauc_ndcg_at_1_diff1 value: 62.39324742344811 - type: nauc_ndcg_at_1_max value: 18.922278332305293 - type: nauc_ndcg_at_1_std value: 15.431990044458088 - type: nauc_ndcg_at_20_diff1 value: 51.30002836393829 - type: nauc_ndcg_at_20_max value: 14.814680820356232 - type: nauc_ndcg_at_20_std value: 23.506479941769733 - type: nauc_ndcg_at_3_diff1 value: 54.90780405878355 - type: nauc_ndcg_at_3_max value: 16.648637328318923 - type: nauc_ndcg_at_3_std value: 19.30934390416425 - type: nauc_ndcg_at_5_diff1 value: 53.479799880106086 - type: nauc_ndcg_at_5_max value: 15.738363325622498 - type: nauc_ndcg_at_5_std value: 
20.58963012081015 - type: nauc_precision_at_1000_diff1 value: 24.304482939944215 - type: nauc_precision_at_1000_max value: 5.650518835490494 - type: nauc_precision_at_1000_std value: 41.977320321177345 - type: nauc_precision_at_100_diff1 value: 31.210792569116773 - type: nauc_precision_at_100_max value: 7.568305897193786 - type: nauc_precision_at_100_std value: 35.39707853767338 - type: nauc_precision_at_10_diff1 value: 41.43987014969449 - type: nauc_precision_at_10_max value: 10.60950763673837 - type: nauc_precision_at_10_std value: 26.62624496899695 - type: nauc_precision_at_1_diff1 value: 62.39324742344811 - type: nauc_precision_at_1_max value: 18.922278332305293 - type: nauc_precision_at_1_std value: 15.431990044458088 - type: nauc_precision_at_20_diff1 value: 37.555981094379796 - type: nauc_precision_at_20_max value: 9.733917395724056 - type: nauc_precision_at_20_std value: 29.976963378218098 - type: nauc_precision_at_3_diff1 value: 50.27466251846394 - type: nauc_precision_at_3_max value: 15.137975562897834 - type: nauc_precision_at_3_std value: 21.385116394323468 - type: nauc_precision_at_5_diff1 value: 46.22016922464899 - type: nauc_precision_at_5_max value: 12.884011400229156 - type: nauc_precision_at_5_std value: 23.551280371239656 - type: nauc_recall_at_1000_diff1 value: 24.30448293994435 - type: nauc_recall_at_1000_max value: 5.650518835490617 - type: nauc_recall_at_1000_std value: 41.97732032117746 - type: nauc_recall_at_100_diff1 value: 31.21079256911678 - type: nauc_recall_at_100_max value: 7.56830589719377 - type: nauc_recall_at_100_std value: 35.397078537673345 - type: nauc_recall_at_10_diff1 value: 41.43987014969447 - type: nauc_recall_at_10_max value: 10.609507636738407 - type: nauc_recall_at_10_std value: 26.626244968996925 - type: nauc_recall_at_1_diff1 value: 62.39324742344811 - type: nauc_recall_at_1_max value: 18.922278332305293 - type: nauc_recall_at_1_std value: 15.431990044458088 - type: nauc_recall_at_20_diff1 value: 37.5559810943798 - type: nauc_recall_at_20_max value: 9.733917395724083 - type: nauc_recall_at_20_std value: 29.976963378218112 - type: nauc_recall_at_3_diff1 value: 50.27466251846396 - type: nauc_recall_at_3_max value: 15.13797556289784 - type: nauc_recall_at_3_std value: 21.38511639432347 - type: nauc_recall_at_5_diff1 value: 46.220169224649 - type: nauc_recall_at_5_max value: 12.88401140022913 - type: nauc_recall_at_5_std value: 23.551280371239613 - type: ndcg_at_1 value: 32.248 - type: ndcg_at_10 value: 28.296 - type: ndcg_at_100 value: 31.830000000000002 - type: ndcg_at_1000 value: 34.182 - type: ndcg_at_20 value: 29.593000000000004 - type: ndcg_at_3 value: 25.080000000000002 - type: ndcg_at_5 value: 26.641 - type: precision_at_1 value: 32.248 - type: precision_at_10 value: 6.151 - type: precision_at_100 value: 0.898 - type: precision_at_1000 value: 0.121 - type: precision_at_20 value: 3.4939999999999998 - type: precision_at_3 value: 15.665000000000001 - type: precision_at_5 value: 10.633 - type: recall_at_1 value: 16.124 - type: recall_at_10 value: 30.756 - type: recall_at_100 value: 44.895 - type: recall_at_1000 value: 60.655 - type: recall_at_20 value: 34.936 - type: recall_at_3 value: 23.498 - type: recall_at_5 value: 26.583000000000002 task: type: Retrieval - dataset: config: default name: MTEB ImdbClassification (default) revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 split: test type: mteb/imdb metrics: - type: accuracy value: 65.11479999999999 - type: ap value: 60.16054663114752 - type: ap_weighted value: 60.16054663114752 - type: f1 
value: 64.58602077899722 - type: f1_weighted value: 64.58602077899724 - type: main_score value: 65.11479999999999 task: type: Classification - dataset: config: default name: MTEB MSMARCO (default) revision: c5a29a104738b98a9e76336939199e264163d4a0 split: test type: mteb/msmarco metrics: - type: main_score value: 27.705000000000002 - type: map_at_1 value: 0.777 - type: map_at_10 value: 4.274 - type: map_at_100 value: 10.459 - type: map_at_1000 value: 12.995000000000001 - type: map_at_20 value: 6.47 - type: map_at_3 value: 1.8610000000000002 - type: map_at_5 value: 2.606 - type: mrr_at_1 value: 46.51162790697674 - type: mrr_at_10 value: 58.708010335917315 - type: mrr_at_100 value: 59.00751703077284 - type: mrr_at_1000 value: 59.02276496514652 - type: mrr_at_20 value: 58.90180878552971 - type: mrr_at_3 value: 55.81395348837209 - type: mrr_at_5 value: 57.44186046511628 - type: nauc_map_at_1000_diff1 value: 37.892094182998136 - type: nauc_map_at_1000_max value: 61.74117112323522 - type: nauc_map_at_1000_std value: 58.58032442470286 - type: nauc_map_at_100_diff1 value: 40.49245812562701 - type: nauc_map_at_100_max value: 57.01499706917439 - type: nauc_map_at_100_std value: 51.72298891596721 - type: nauc_map_at_10_diff1 value: 38.194743917917116 - type: nauc_map_at_10_max value: 28.735417026530364 - type: nauc_map_at_10_std value: 31.023879510246598 - type: nauc_map_at_1_diff1 value: 32.49931114906685 - type: nauc_map_at_1_max value: 17.671517789719864 - type: nauc_map_at_1_std value: 16.99861035727389 - type: nauc_map_at_20_diff1 value: 36.32556775140449 - type: nauc_map_at_20_max value: 34.68159609940747 - type: nauc_map_at_20_std value: 38.40576232270393 - type: nauc_map_at_3_diff1 value: 28.749285903216972 - type: nauc_map_at_3_max value: 22.471665405120152 - type: nauc_map_at_3_std value: 24.69853700687298 - type: nauc_map_at_5_diff1 value: 31.853910704413547 - type: nauc_map_at_5_max value: 24.263061493565555 - type: nauc_map_at_5_std value: 28.612970147886262 - type: nauc_mrr_at_1000_diff1 value: 38.28674723804615 - type: nauc_mrr_at_1000_max value: 65.31128352347841 - type: nauc_mrr_at_1000_std value: 60.74832369191216 - type: nauc_mrr_at_100_diff1 value: 38.31302530772531 - type: nauc_mrr_at_100_max value: 65.33138728948728 - type: nauc_mrr_at_100_std value: 60.756072020421946 - type: nauc_mrr_at_10_diff1 value: 38.407877536524715 - type: nauc_mrr_at_10_max value: 64.69187029537487 - type: nauc_mrr_at_10_std value: 60.99973125836723 - type: nauc_mrr_at_1_diff1 value: 33.86818356255958 - type: nauc_mrr_at_1_max value: 63.497988338553334 - type: nauc_mrr_at_1_std value: 57.319330794169545 - type: nauc_mrr_at_20_diff1 value: 38.548064176888836 - type: nauc_mrr_at_20_max value: 65.17230095066438 - type: nauc_mrr_at_20_std value: 60.876500917878865 - type: nauc_mrr_at_3_diff1 value: 33.6890627338303 - type: nauc_mrr_at_3_max value: 64.82321215840447 - type: nauc_mrr_at_3_std value: 61.26157086058862 - type: nauc_mrr_at_5_diff1 value: 37.49455502289622 - type: nauc_mrr_at_5_max value: 65.53530465417907 - type: nauc_mrr_at_5_std value: 61.02287299328536 - type: nauc_ndcg_at_1000_diff1 value: 49.55226865832326 - type: nauc_ndcg_at_1000_max value: 61.12649206783223 - type: nauc_ndcg_at_1000_std value: 57.53286905675567 - type: nauc_ndcg_at_100_diff1 value: 45.73981167442622 - type: nauc_ndcg_at_100_max value: 64.82900696367803 - type: nauc_ndcg_at_100_std value: 48.49824360353255 - type: nauc_ndcg_at_10_diff1 value: 44.58241602640944 - type: nauc_ndcg_at_10_max value: 62.58045432730028 - type: 
nauc_ndcg_at_10_std value: 44.00810752260865 - type: nauc_ndcg_at_1_diff1 value: 35.224578682142635 - type: nauc_ndcg_at_1_max value: 44.63222303780071 - type: nauc_ndcg_at_1_std value: 22.087936224074618 - type: nauc_ndcg_at_20_diff1 value: 41.64314419662495 - type: nauc_ndcg_at_20_max value: 65.3789962064312 - type: nauc_ndcg_at_20_std value: 47.213428209069924 - type: nauc_ndcg_at_3_diff1 value: 36.95443124125196 - type: nauc_ndcg_at_3_max value: 56.10236595509034 - type: nauc_ndcg_at_3_std value: 38.53747582748712 - type: nauc_ndcg_at_5_diff1 value: 39.85878950415295 - type: nauc_ndcg_at_5_max value: 61.567975785495534 - type: nauc_ndcg_at_5_std value: 42.480532442232764 - type: nauc_precision_at_1000_diff1 value: 9.463162430234085 - type: nauc_precision_at_1000_max value: 61.7012187403225 - type: nauc_precision_at_1000_std value: 53.356643761687806 - type: nauc_precision_at_100_diff1 value: 22.507457849227073 - type: nauc_precision_at_100_max value: 74.14227941923573 - type: nauc_precision_at_100_std value: 56.66415918103874 - type: nauc_precision_at_10_diff1 value: 37.11634706297281 - type: nauc_precision_at_10_max value: 64.70246260978291 - type: nauc_precision_at_10_std value: 52.076370670842195 - type: nauc_precision_at_1_diff1 value: 33.86818356255958 - type: nauc_precision_at_1_max value: 63.497988338553334 - type: nauc_precision_at_1_std value: 57.319330794169545 - type: nauc_precision_at_20_diff1 value: 30.464024743782335 - type: nauc_precision_at_20_max value: 67.25613806762661 - type: nauc_precision_at_20_std value: 52.950474527983495 - type: nauc_precision_at_3_diff1 value: 25.67014245501591 - type: nauc_precision_at_3_max value: 64.64109190221811 - type: nauc_precision_at_3_std value: 61.79128083613472 - type: nauc_precision_at_5_diff1 value: 30.728206847540683 - type: nauc_precision_at_5_max value: 63.132851485096175 - type: nauc_precision_at_5_std value: 53.934810596223336 - type: nauc_recall_at_1000_diff1 value: 44.772142334722375 - type: nauc_recall_at_1000_max value: 52.83460479783461 - type: nauc_recall_at_1000_std value: 58.70222029972984 - type: nauc_recall_at_100_diff1 value: 48.17949191462816 - type: nauc_recall_at_100_max value: 51.837404933039686 - type: nauc_recall_at_100_std value: 46.57038195442946 - type: nauc_recall_at_10_diff1 value: 44.70152550284119 - type: nauc_recall_at_10_max value: 25.41255284271965 - type: nauc_recall_at_10_std value: 26.05400058770887 - type: nauc_recall_at_1_diff1 value: 32.49931114906685 - type: nauc_recall_at_1_max value: 17.671517789719864 - type: nauc_recall_at_1_std value: 16.99861035727389 - type: nauc_recall_at_20_diff1 value: 41.61632802348345 - type: nauc_recall_at_20_max value: 29.22885033770648 - type: nauc_recall_at_20_std value: 29.70591175740895 - type: nauc_recall_at_3_diff1 value: 25.408832214219373 - type: nauc_recall_at_3_max value: 20.110088341846414 - type: nauc_recall_at_3_std value: 27.5814549517511 - type: nauc_recall_at_5_diff1 value: 33.87726583518953 - type: nauc_recall_at_5_max value: 21.44640652682217 - type: nauc_recall_at_5_std value: 28.68467500448753 - type: ndcg_at_1 value: 31.008000000000003 - type: ndcg_at_10 value: 27.705000000000002 - type: ndcg_at_100 value: 25.61 - type: ndcg_at_1000 value: 32.81 - type: ndcg_at_20 value: 26.617 - type: ndcg_at_3 value: 29.476000000000003 - type: ndcg_at_5 value: 27.461999999999996 - type: precision_at_1 value: 46.512 - type: precision_at_10 value: 36.047000000000004 - type: precision_at_100 value: 15.86 - type: precision_at_1000 value: 3.519 - type: 
precision_at_20 value: 31.163 - type: precision_at_3 value: 43.411 - type: precision_at_5 value: 39.07 - type: recall_at_1 value: 0.777 - type: recall_at_10 value: 5.749 - type: recall_at_100 value: 20.636 - type: recall_at_1000 value: 41.509 - type: recall_at_20 value: 9.689 - type: recall_at_3 value: 2.125 - type: recall_at_5 value: 3.1809999999999996 task: type: Retrieval - dataset: config: en name: MTEB MTOPDomainClassification (en) revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf split: test type: mteb/mtop_domain metrics: - type: accuracy value: 84.75604195166439 - type: f1 value: 83.95972384901661 - type: f1_weighted value: 84.89916018023138 - type: main_score value: 84.75604195166439 task: type: Classification - dataset: config: en name: MTEB MTOPIntentClassification (en) revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba split: test type: mteb/mtop_intent metrics: - type: accuracy value: 63.25809393524852 - type: f1 value: 45.891660110133806 - type: f1_weighted value: 67.20838453908303 - type: main_score value: 63.25809393524852 task: type: Classification - dataset: config: en name: MTEB MassiveIntentClassification (en) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 62.66980497646267 - type: f1 value: 60.96054297925082 - type: f1_weighted value: 62.97616683347667 - type: main_score value: 62.66980497646267 task: type: Classification - dataset: config: en name: MTEB MassiveScenarioClassification (en) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 66.69804976462676 - type: f1 value: 65.66281437950263 - type: f1_weighted value: 66.80017206918848 - type: main_score value: 66.69804976462676 task: type: Classification - dataset: config: default name: MTEB MedrxivClusteringP2P (default) revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 split: test type: mteb/medrxiv-clustering-p2p metrics: - type: main_score value: 24.995363084202875 - type: v_measure value: 24.995363084202875 - type: v_measure_std value: 1.5274247452970715 task: type: Clustering - dataset: config: default name: MTEB MedrxivClusteringS2S (default) revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 split: test type: mteb/medrxiv-clustering-s2s metrics: - type: main_score value: 20.260962789850833 - type: v_measure value: 20.260962789850833 - type: v_measure_std value: 1.5612389984116821 task: type: Clustering - dataset: config: default name: MTEB MindSmallReranking (default) revision: 59042f120c80e8afa9cdbb224f67076cec0fc9a7 split: test type: mteb/mind_small metrics: - type: main_score value: 26.982693878333546 - type: map value: 26.982693878333546 - type: mrr value: 27.234304648772216 - type: nAUC_map_diff1 value: 15.483599146095642 - type: nAUC_map_max value: -31.954865506309687 - type: nAUC_map_std value: -19.352114548188798 - type: nAUC_mrr_diff1 value: 14.897752061307749 - type: nAUC_mrr_max value: -25.96940014108176 - type: nAUC_mrr_std value: -16.128495128181108 task: type: Reranking - dataset: config: default name: MTEB NFCorpus (default) revision: ec0fa4fe99da2ff19ca1214b7966684033a58814 split: test type: mteb/nfcorpus metrics: - type: main_score value: 19.475 - type: map_at_1 value: 2.673 - type: map_at_10 value: 5.7860000000000005 - type: map_at_100 value: 7.434 - type: map_at_1000 value: 8.429 - type: map_at_20 value: 6.394 - type: map_at_3 value: 4.352 - type: map_at_5 value: 5.013999999999999 - type: mrr_at_1 value: 27.24458204334365 - type: 
mrr_at_10 value: 36.41702294953069 - type: mrr_at_100 value: 37.2489840100607 - type: mrr_at_1000 value: 37.3170804962274 - type: mrr_at_20 value: 36.81253770554204 - type: mrr_at_3 value: 33.797729618163046 - type: mrr_at_5 value: 35.577915376676984 - type: nauc_map_at_1000_diff1 value: 29.712895586376238 - type: nauc_map_at_1000_max value: 26.118684880596003 - type: nauc_map_at_1000_std value: 24.766880316423457 - type: nauc_map_at_100_diff1 value: 31.159834051695544 - type: nauc_map_at_100_max value: 26.800206575448644 - type: nauc_map_at_100_std value: 20.993328557808237 - type: nauc_map_at_10_diff1 value: 34.34909074479394 - type: nauc_map_at_10_max value: 25.23888585073763 - type: nauc_map_at_10_std value: 15.666191671894675 - type: nauc_map_at_1_diff1 value: 54.40851531063473 - type: nauc_map_at_1_max value: 25.79812290419997 - type: nauc_map_at_1_std value: 9.490593216131844 - type: nauc_map_at_20_diff1 value: 32.98428104841538 - type: nauc_map_at_20_max value: 26.274463522342213 - type: nauc_map_at_20_std value: 17.768552660498734 - type: nauc_map_at_3_diff1 value: 40.97296071677192 - type: nauc_map_at_3_max value: 24.256933079739213 - type: nauc_map_at_3_std value: 12.605367264265299 - type: nauc_map_at_5_diff1 value: 39.35136745378991 - type: nauc_map_at_5_max value: 25.24732157901422 - type: nauc_map_at_5_std value: 14.346530622570702 - type: nauc_mrr_at_1000_diff1 value: 25.381479004777763 - type: nauc_mrr_at_1000_max value: 23.575087021020536 - type: nauc_mrr_at_1000_std value: 23.472005406321436 - type: nauc_mrr_at_100_diff1 value: 25.3574395673177 - type: nauc_mrr_at_100_max value: 23.583049296879377 - type: nauc_mrr_at_100_std value: 23.456570812574856 - type: nauc_mrr_at_10_diff1 value: 25.689849758337413 - type: nauc_mrr_at_10_max value: 23.617681843801964 - type: nauc_mrr_at_10_std value: 24.075405363094195 - type: nauc_mrr_at_1_diff1 value: 26.641133846014746 - type: nauc_mrr_at_1_max value: 19.62245594877117 - type: nauc_mrr_at_1_std value: 15.81592525325739 - type: nauc_mrr_at_20_diff1 value: 25.156433096912977 - type: nauc_mrr_at_20_max value: 23.580922123726676 - type: nauc_mrr_at_20_std value: 23.553425708985458 - type: nauc_mrr_at_3_diff1 value: 25.92080426032495 - type: nauc_mrr_at_3_max value: 22.38972437925532 - type: nauc_mrr_at_3_std value: 23.868512198894585 - type: nauc_mrr_at_5_diff1 value: 26.231411975409568 - type: nauc_mrr_at_5_max value: 22.763533805080037 - type: nauc_mrr_at_5_std value: 23.774766628068885 - type: nauc_ndcg_at_1000_diff1 value: 23.768885727339356 - type: nauc_ndcg_at_1000_max value: 29.247599007631937 - type: nauc_ndcg_at_1000_std value: 28.022344377335152 - type: nauc_ndcg_at_100_diff1 value: 23.85335949897677 - type: nauc_ndcg_at_100_max value: 25.697407111528147 - type: nauc_ndcg_at_100_std value: 27.07625187183171 - type: nauc_ndcg_at_10_diff1 value: 20.50707532119363 - type: nauc_ndcg_at_10_max value: 20.857784625493622 - type: nauc_ndcg_at_10_std value: 31.239220591583607 - type: nauc_ndcg_at_1_diff1 value: 26.802222119437737 - type: nauc_ndcg_at_1_max value: 17.38626435465188 - type: nauc_ndcg_at_1_std value: 18.543036819776866 - type: nauc_ndcg_at_20_diff1 value: 22.68036110236631 - type: nauc_ndcg_at_20_max value: 22.127685695906415 - type: nauc_ndcg_at_20_std value: 31.38065283673992 - type: nauc_ndcg_at_3_diff1 value: 21.126779548662377 - type: nauc_ndcg_at_3_max value: 21.257256258583762 - type: nauc_ndcg_at_3_std value: 30.38520412268269 - type: nauc_ndcg_at_5_diff1 value: 20.728997790365923 - type: nauc_ndcg_at_5_max 
value: 21.136871113511706 - type: nauc_ndcg_at_5_std value: 30.103036943833878 - type: nauc_precision_at_1000_diff1 value: -0.9684850979991009 - type: nauc_precision_at_1000_max value: -1.910073925377927 - type: nauc_precision_at_1000_std value: 42.445075721709244 - type: nauc_precision_at_100_diff1 value: 2.553047959683974 - type: nauc_precision_at_100_max value: 6.706578335145517 - type: nauc_precision_at_100_std value: 42.677614016114795 - type: nauc_precision_at_10_diff1 value: 6.908721977279816 - type: nauc_precision_at_10_max value: 18.524181494610247 - type: nauc_precision_at_10_std value: 38.513766049365444 - type: nauc_precision_at_1_diff1 value: 26.641133846014746 - type: nauc_precision_at_1_max value: 19.62245594877117 - type: nauc_precision_at_1_std value: 15.81592525325739 - type: nauc_precision_at_20_diff1 value: 6.5698441504079135 - type: nauc_precision_at_20_max value: 16.36401526243144 - type: nauc_precision_at_20_std value: 42.15246597563734 - type: nauc_precision_at_3_diff1 value: 13.746590558925318 - type: nauc_precision_at_3_max value: 24.471712487836307 - type: nauc_precision_at_3_std value: 35.07796641303652 - type: nauc_precision_at_5_diff1 value: 10.024055178218116 - type: nauc_precision_at_5_max value: 21.70563811077537 - type: nauc_precision_at_5_std value: 33.549334119957294 - type: nauc_recall_at_1000_diff1 value: 15.516112454483574 - type: nauc_recall_at_1000_max value: 12.812602971232662 - type: nauc_recall_at_1000_std value: 4.9745377100353645 - type: nauc_recall_at_100_diff1 value: 15.727471787207076 - type: nauc_recall_at_100_max value: 14.07072041204842 - type: nauc_recall_at_100_std value: 5.280256534913133 - type: nauc_recall_at_10_diff1 value: 23.54021143821257 - type: nauc_recall_at_10_max value: 16.21143367909769 - type: nauc_recall_at_10_std value: 10.742397069751759 - type: nauc_recall_at_1_diff1 value: 54.40851531063473 - type: nauc_recall_at_1_max value: 25.79812290419997 - type: nauc_recall_at_1_std value: 9.490593216131844 - type: nauc_recall_at_20_diff1 value: 20.56588979224455 - type: nauc_recall_at_20_max value: 19.004784742942014 - type: nauc_recall_at_20_std value: 9.966568259612574 - type: nauc_recall_at_3_diff1 value: 33.468878145564304 - type: nauc_recall_at_3_max value: 18.73787633759768 - type: nauc_recall_at_3_std value: 12.353055019568094 - type: nauc_recall_at_5_diff1 value: 32.89494204767019 - type: nauc_recall_at_5_max value: 19.01998117178556 - type: nauc_recall_at_5_std value: 13.737801318037624 - type: ndcg_at_1 value: 25.541999999999998 - type: ndcg_at_10 value: 19.475 - type: ndcg_at_100 value: 18.815 - type: ndcg_at_1000 value: 27.71 - type: ndcg_at_20 value: 18.212999999999997 - type: ndcg_at_3 value: 22.651 - type: ndcg_at_5 value: 21.516 - type: precision_at_1 value: 27.245 - type: precision_at_10 value: 14.365 - type: precision_at_100 value: 5.384 - type: precision_at_1000 value: 1.772 - type: precision_at_20 value: 11.006 - type: precision_at_3 value: 21.569 - type: precision_at_5 value: 18.947 - type: recall_at_1 value: 2.673 - type: recall_at_10 value: 9.212 - type: recall_at_100 value: 21.549 - type: recall_at_1000 value: 52.617999999999995 - type: recall_at_20 value: 11.705 - type: recall_at_3 value: 5.313 - type: recall_at_5 value: 6.869 task: type: Retrieval - dataset: config: default name: MTEB NQ (default) revision: b774495ed302d8c44a3a7ea25c90dbce03968f31 split: test type: mteb/nq metrics: - type: main_score value: 16.991 - type: map_at_1 value: 7.414 - type: map_at_10 value: 13.291 - type: map_at_100 value: 
14.295 - type: map_at_1000 value: 14.389 - type: map_at_20 value: 13.876 - type: map_at_3 value: 11.262 - type: map_at_5 value: 12.339 - type: mrr_at_1 value: 8.516801853997682 - type: mrr_at_10 value: 14.731154242307184 - type: mrr_at_100 value: 15.694198665655856 - type: mrr_at_1000 value: 15.77486181874144 - type: mrr_at_20 value: 15.298086694879798 - type: mrr_at_3 value: 12.659327925840078 - type: mrr_at_5 value: 13.768829663962883 - type: nauc_map_at_1000_diff1 value: 20.28889762069646 - type: nauc_map_at_1000_max value: 11.368502727824952 - type: nauc_map_at_1000_std value: 10.077176659068975 - type: nauc_map_at_100_diff1 value: 20.285666016924328 - type: nauc_map_at_100_max value: 11.352497499093694 - type: nauc_map_at_100_std value: 9.98136423017311 - type: nauc_map_at_10_diff1 value: 20.335416558539237 - type: nauc_map_at_10_max value: 11.091563979136637 - type: nauc_map_at_10_std value: 8.745901277549152 - type: nauc_map_at_1_diff1 value: 24.979719230754476 - type: nauc_map_at_1_max value: 10.972032990843237 - type: nauc_map_at_1_std value: 4.7964267266650955 - type: nauc_map_at_20_diff1 value: 20.302803697684848 - type: nauc_map_at_20_max value: 11.159589961608782 - type: nauc_map_at_20_std value: 9.360825884036176 - type: nauc_map_at_3_diff1 value: 19.863972188782967 - type: nauc_map_at_3_max value: 10.898818486894147 - type: nauc_map_at_3_std value: 6.97496787073755 - type: nauc_map_at_5_diff1 value: 20.44569321324553 - type: nauc_map_at_5_max value: 10.722482919334105 - type: nauc_map_at_5_std value: 7.787226185137379 - type: nauc_mrr_at_1000_diff1 value: 19.746039395864496 - type: nauc_mrr_at_1000_max value: 10.495187770800463 - type: nauc_mrr_at_1000_std value: 10.284862758352 - type: nauc_mrr_at_100_diff1 value: 19.743060052871396 - type: nauc_mrr_at_100_max value: 10.484702853211761 - type: nauc_mrr_at_100_std value: 10.220220019367744 - type: nauc_mrr_at_10_diff1 value: 19.747518214214974 - type: nauc_mrr_at_10_max value: 10.1823356525796 - type: nauc_mrr_at_10_std value: 9.25568601945109 - type: nauc_mrr_at_1_diff1 value: 24.040270890346534 - type: nauc_mrr_at_1_max value: 9.900172534036168 - type: nauc_mrr_at_1_std value: 5.7354869310700245 - type: nauc_mrr_at_20_diff1 value: 19.75060956163397 - type: nauc_mrr_at_20_max value: 10.31776046090269 - type: nauc_mrr_at_20_std value: 9.770741755791374 - type: nauc_mrr_at_3_diff1 value: 19.4775451565507 - type: nauc_mrr_at_3_max value: 9.804429146930495 - type: nauc_mrr_at_3_std value: 7.931570036855481 - type: nauc_mrr_at_5_diff1 value: 19.806308832458882 - type: nauc_mrr_at_5_max value: 9.77292617618666 - type: nauc_mrr_at_5_std value: 8.55195259630072 - type: nauc_ndcg_at_1000_diff1 value: 19.375648509077983 - type: nauc_ndcg_at_1000_max value: 12.688796294165622 - type: nauc_ndcg_at_1000_std value: 17.80793230435146 - type: nauc_ndcg_at_100_diff1 value: 19.343394443678996 - type: nauc_ndcg_at_100_max value: 12.520511876585841 - type: nauc_ndcg_at_100_std value: 15.978861606925918 - type: nauc_ndcg_at_10_diff1 value: 19.42682468753324 - type: nauc_ndcg_at_10_max value: 11.10087572901484 - type: nauc_ndcg_at_10_std value: 10.54992883803028 - type: nauc_ndcg_at_1_diff1 value: 24.318414546738026 - type: nauc_ndcg_at_1_max value: 9.82349827107002 - type: nauc_ndcg_at_1_std value: 5.951156922071484 - type: nauc_ndcg_at_20_diff1 value: 19.41464830610135 - type: nauc_ndcg_at_20_max value: 11.344469897954262 - type: nauc_ndcg_at_20_std value: 12.221787446241533 - type: nauc_ndcg_at_3_diff1 value: 18.641316759283264 - type: 
nauc_ndcg_at_3_max value: 10.543844267142214 - type: nauc_ndcg_at_3_std value: 7.687890803254003 - type: nauc_ndcg_at_5_diff1 value: 19.45986949428097 - type: nauc_ndcg_at_5_max value: 10.375727437812799 - type: nauc_ndcg_at_5_std value: 8.85624541644588 - type: nauc_precision_at_1000_diff1 value: 11.066860853955465 - type: nauc_precision_at_1000_max value: 12.190880720909412 - type: nauc_precision_at_1000_std value: 35.834721766648705 - type: nauc_precision_at_100_diff1 value: 15.633579933121927 - type: nauc_precision_at_100_max value: 13.900393333698496 - type: nauc_precision_at_100_std value: 30.435998605665272 - type: nauc_precision_at_10_diff1 value: 18.321561255328813 - type: nauc_precision_at_10_max value: 10.71704151142003 - type: nauc_precision_at_10_std value: 14.681070391575767 - type: nauc_precision_at_1_diff1 value: 24.318414546738026 - type: nauc_precision_at_1_max value: 9.82349827107002 - type: nauc_precision_at_1_std value: 5.951156922071484 - type: nauc_precision_at_20_diff1 value: 17.897250659867172 - type: nauc_precision_at_20_max value: 11.178073596260878 - type: nauc_precision_at_20_std value: 18.922339798822485 - type: nauc_precision_at_3_diff1 value: 16.247029796437438 - type: nauc_precision_at_3_max value: 9.403033789602311 - type: nauc_precision_at_3_std value: 9.396827994803164 - type: nauc_precision_at_5_diff1 value: 18.40723036139704 - type: nauc_precision_at_5_max value: 8.984724544333158 - type: nauc_precision_at_5_std value: 11.190725807701849 - type: nauc_recall_at_1000_diff1 value: 17.125181724831485 - type: nauc_recall_at_1000_max value: 17.738235803420288 - type: nauc_recall_at_1000_std value: 47.4670421060216 - type: nauc_recall_at_100_diff1 value: 17.27215401019124 - type: nauc_recall_at_100_max value: 16.00490577182562 - type: nauc_recall_at_100_std value: 30.65356324274426 - type: nauc_recall_at_10_diff1 value: 17.554785599875217 - type: nauc_recall_at_10_max value: 11.381345798386317 - type: nauc_recall_at_10_std value: 13.34173170828859 - type: nauc_recall_at_1_diff1 value: 24.979719230754476 - type: nauc_recall_at_1_max value: 10.972032990843237 - type: nauc_recall_at_1_std value: 4.7964267266650955 - type: nauc_recall_at_20_diff1 value: 17.507273879317893 - type: nauc_recall_at_20_max value: 11.772238504003177 - type: nauc_recall_at_20_std value: 17.00496015114505 - type: nauc_recall_at_3_diff1 value: 15.718069166841971 - type: nauc_recall_at_3_max value: 10.507841411541175 - type: nauc_recall_at_3_std value: 8.362642856838368 - type: nauc_recall_at_5_diff1 value: 17.39920934041924 - type: nauc_recall_at_5_max value: 10.10162321958792 - type: nauc_recall_at_5_std value: 10.260318695226664 - type: ndcg_at_1 value: 8.488 - type: ndcg_at_10 value: 16.991 - type: ndcg_at_100 value: 22.103 - type: ndcg_at_1000 value: 24.708 - type: ndcg_at_20 value: 19.086 - type: ndcg_at_3 value: 12.803999999999998 - type: ndcg_at_5 value: 14.727 - type: precision_at_1 value: 8.488 - type: precision_at_10 value: 3.1780000000000004 - type: precision_at_100 value: 0.607 - type: precision_at_1000 value: 0.086 - type: precision_at_20 value: 2.0650000000000004 - type: precision_at_3 value: 6.151 - type: precision_at_5 value: 4.7620000000000005 - type: recall_at_1 value: 7.414 - type: recall_at_10 value: 27.105 - type: recall_at_100 value: 50.782000000000004 - type: recall_at_1000 value: 70.77799999999999 - type: recall_at_20 value: 35.105 - type: recall_at_3 value: 15.901000000000002 - type: recall_at_5 value: 20.399 task: type: Retrieval - dataset: config: default name: 
MTEB QuoraRetrieval (default) revision: e4e08e0b7dbe3c8700f0daef558ff32256715259 split: test type: mteb/quora metrics: - type: main_score value: 74.388 - type: map_at_1 value: 57.594 - type: map_at_10 value: 69.411 - type: map_at_100 value: 70.197 - type: map_at_1000 value: 70.23899999999999 - type: map_at_20 value: 69.896 - type: map_at_3 value: 66.50500000000001 - type: map_at_5 value: 68.199 - type: mrr_at_1 value: 66.34 - type: mrr_at_10 value: 74.12798015872983 - type: mrr_at_100 value: 74.45813156051709 - type: mrr_at_1000 value: 74.47054611594581 - type: mrr_at_20 value: 74.34983075339647 - type: mrr_at_3 value: 72.47666666666632 - type: mrr_at_5 value: 73.4861666666661 - type: nauc_map_at_1000_diff1 value: 69.23574495855162 - type: nauc_map_at_1000_max value: 38.326344115314825 - type: nauc_map_at_1000_std value: -9.69190621889919 - type: nauc_map_at_100_diff1 value: 69.23018899929654 - type: nauc_map_at_100_max value: 38.32200052980655 - type: nauc_map_at_100_std value: -9.709873607585722 - type: nauc_map_at_10_diff1 value: 69.11881416442584 - type: nauc_map_at_10_max value: 37.80595474994142 - type: nauc_map_at_10_std value: -10.460350770888079 - type: nauc_map_at_1_diff1 value: 71.29617122119095 - type: nauc_map_at_1_max value: 32.80205937689043 - type: nauc_map_at_1_std value: -13.444125573046852 - type: nauc_map_at_20_diff1 value: 69.19096974069583 - type: nauc_map_at_20_max value: 38.15987972416603 - type: nauc_map_at_20_std value: -10.020269369800706 - type: nauc_map_at_3_diff1 value: 69.12951153560108 - type: nauc_map_at_3_max value: 36.52459750894883 - type: nauc_map_at_3_std value: -12.174854661737818 - type: nauc_map_at_5_diff1 value: 69.0264228661453 - type: nauc_map_at_5_max value: 37.166727350784164 - type: nauc_map_at_5_std value: -11.493776844406158 - type: nauc_mrr_at_1000_diff1 value: 70.68150057700754 - type: nauc_mrr_at_1000_max value: 41.0178466695076 - type: nauc_mrr_at_1000_std value: -8.021358816489824 - type: nauc_mrr_at_100_diff1 value: 70.67856380420632 - type: nauc_mrr_at_100_max value: 41.02236359207632 - type: nauc_mrr_at_100_std value: -8.004727052332067 - type: nauc_mrr_at_10_diff1 value: 70.57476646749362 - type: nauc_mrr_at_10_max value: 40.98353008138954 - type: nauc_mrr_at_10_std value: -8.035083785813892 - type: nauc_mrr_at_1_diff1 value: 72.83106243448691 - type: nauc_mrr_at_1_max value: 40.497226437078496 - type: nauc_mrr_at_1_std value: -10.545921253601675 - type: nauc_mrr_at_20_diff1 value: 70.64698930715971 - type: nauc_mrr_at_20_max value: 41.01991026936206 - type: nauc_mrr_at_20_std value: -8.019248560369828 - type: nauc_mrr_at_3_diff1 value: 70.48136695574067 - type: nauc_mrr_at_3_max value: 40.83575836332353 - type: nauc_mrr_at_3_std value: -8.80652589242081 - type: nauc_mrr_at_5_diff1 value: 70.52447208499292 - type: nauc_mrr_at_5_max value: 40.95085309489185 - type: nauc_mrr_at_5_std value: -8.35502569521486 - type: nauc_ndcg_at_1000_diff1 value: 69.2418574551877 - type: nauc_ndcg_at_1000_max value: 39.85962706323504 - type: nauc_ndcg_at_1000_std value: -6.479667269089863 - type: nauc_ndcg_at_100_diff1 value: 69.13381091149564 - type: nauc_ndcg_at_100_max value: 39.902530291451974 - type: nauc_ndcg_at_100_std value: -6.19261331168395 - type: nauc_ndcg_at_10_diff1 value: 68.49804618931282 - type: nauc_ndcg_at_10_max value: 38.95870794043419 - type: nauc_ndcg_at_10_std value: -7.9554943741526465 - type: nauc_ndcg_at_1_diff1 value: 72.74562116035368 - type: nauc_ndcg_at_1_max value: 40.59003854736593 - type: nauc_ndcg_at_1_std value: 
-10.371154250660494 - type: nauc_ndcg_at_20_diff1 value: 68.81744480185341 - type: nauc_ndcg_at_20_max value: 39.48036257511071 - type: nauc_ndcg_at_20_std value: -7.288863470178731 - type: nauc_ndcg_at_3_diff1 value: 68.31977162714793 - type: nauc_ndcg_at_3_max value: 38.31785051573491 - type: nauc_ndcg_at_3_std value: -10.002238766651905 - type: nauc_ndcg_at_5_diff1 value: 68.34693163150705 - type: nauc_ndcg_at_5_max value: 38.384529237292085 - type: nauc_ndcg_at_5_std value: -9.504613414918412 - type: nauc_precision_at_1000_diff1 value: -27.886662167224248 - type: nauc_precision_at_1000_max value: -1.2099912726932696 - type: nauc_precision_at_1000_std value: 22.918146835627798 - type: nauc_precision_at_100_diff1 value: -22.32582293591269 - type: nauc_precision_at_100_max value: 4.238909760244244 - type: nauc_precision_at_100_std value: 23.62131900536325 - type: nauc_precision_at_10_diff1 value: -4.400459668224666 - type: nauc_precision_at_10_max value: 14.825184001294167 - type: nauc_precision_at_10_std value: 15.417646122517157 - type: nauc_precision_at_1_diff1 value: 72.74562116035368 - type: nauc_precision_at_1_max value: 40.59003854736593 - type: nauc_precision_at_1_std value: -10.371154250660494 - type: nauc_precision_at_20_diff1 value: -12.423098453024796 - type: nauc_precision_at_20_max value: 11.415547902904635 - type: nauc_precision_at_20_std value: 19.489921263698616 - type: nauc_precision_at_3_diff1 value: 22.682624176435127 - type: nauc_precision_at_3_max value: 25.682155720802452 - type: nauc_precision_at_3_std value: 2.6084400354215935 - type: nauc_precision_at_5_diff1 value: 9.272509130152006 - type: nauc_precision_at_5_max value: 20.36818990716189 - type: nauc_precision_at_5_std value: 8.054265889323238 - type: nauc_recall_at_1000_diff1 value: 60.88815464763635 - type: nauc_recall_at_1000_max value: 43.112146232617725 - type: nauc_recall_at_1000_std value: 50.36464338810094 - type: nauc_recall_at_100_diff1 value: 59.928500788144376 - type: nauc_recall_at_100_max value: 41.21981278373438 - type: nauc_recall_at_100_std value: 24.89653567034821 - type: nauc_recall_at_10_diff1 value: 60.89345811958783 - type: nauc_recall_at_10_max value: 36.2662873716048 - type: nauc_recall_at_10_std value: -1.7478273979841499 - type: nauc_recall_at_1_diff1 value: 71.29617122119095 - type: nauc_recall_at_1_max value: 32.80205937689043 - type: nauc_recall_at_1_std value: -13.444125573046852 - type: nauc_recall_at_20_diff1 value: 60.72735270299192 - type: nauc_recall_at_20_max value: 38.02822016647552 - type: nauc_recall_at_20_std value: 3.7019564772205054 - type: nauc_recall_at_3_diff1 value: 64.16899635037826 - type: nauc_recall_at_3_max value: 34.697022598257874 - type: nauc_recall_at_3_std value: -10.894218643842715 - type: nauc_recall_at_5_diff1 value: 62.56790753908123 - type: nauc_recall_at_5_max value: 35.18512660768109 - type: nauc_recall_at_5_std value: -8.518825484008714 - type: ndcg_at_1 value: 66.38 - type: ndcg_at_10 value: 74.388 - type: ndcg_at_100 value: 76.889 - type: ndcg_at_1000 value: 77.518 - type: ndcg_at_20 value: 75.548 - type: ndcg_at_3 value: 70.513 - type: ndcg_at_5 value: 72.406 - type: precision_at_1 value: 66.38 - type: precision_at_10 value: 11.274000000000001 - type: precision_at_100 value: 1.373 - type: precision_at_1000 value: 0.149 - type: precision_at_20 value: 6.095 - type: precision_at_3 value: 30.42 - type: precision_at_5 value: 20.174 - type: recall_at_1 value: 57.594 - type: recall_at_10 value: 84.09 - type: recall_at_100 value: 94.035 - type: 
recall_at_1000 value: 97.914 - type: recall_at_20 value: 88.13600000000001 - type: recall_at_3 value: 73.074 - type: recall_at_5 value: 78.29599999999999 task: type: Retrieval - dataset: config: default name: MTEB RedditClustering (default) revision: 24640382cdbf8abc73003fb0fa6d111a705499eb split: test type: mteb/reddit-clustering metrics: - type: main_score value: 23.878842199856606 - type: v_measure value: 23.878842199856606 - type: v_measure_std value: 4.578743173985467 task: type: Clustering - dataset: config: default name: MTEB RedditClusteringP2P (default) revision: 385e3cb46b4cfa89021f56c4380204149d0efe33 split: test type: mteb/reddit-clustering-p2p metrics: - type: main_score value: 37.76655625558288 - type: v_measure value: 37.76655625558288 - type: v_measure_std value: 9.302167236222553 task: type: Clustering - dataset: config: default name: MTEB SCIDOCS (default) revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88 split: test type: mteb/scidocs metrics: - type: main_score value: 9.668000000000001 - type: map_at_1 value: 2.395 - type: map_at_10 value: 5.237 - type: map_at_100 value: 6.311999999999999 - type: map_at_1000 value: 6.529 - type: map_at_20 value: 5.742 - type: map_at_3 value: 3.827 - type: map_at_5 value: 4.54 - type: mrr_at_1 value: 11.799999999999999 - type: mrr_at_10 value: 18.01527777777777 - type: mrr_at_100 value: 19.170155944203785 - type: mrr_at_1000 value: 19.281296973485173 - type: mrr_at_20 value: 18.67572073480355 - type: mrr_at_3 value: 15.549999999999988 - type: mrr_at_5 value: 16.92999999999999 - type: nauc_map_at_1000_diff1 value: 15.362749019317306 - type: nauc_map_at_1000_max value: 13.84696529256478 - type: nauc_map_at_1000_std value: 11.013607523301609 - type: nauc_map_at_100_diff1 value: 15.41591399608084 - type: nauc_map_at_100_max value: 13.730140090589293 - type: nauc_map_at_100_std value: 10.455348719140309 - type: nauc_map_at_10_diff1 value: 15.834686627354852 - type: nauc_map_at_10_max value: 13.28911184808523 - type: nauc_map_at_10_std value: 7.254487702527721 - type: nauc_map_at_1_diff1 value: 20.822383776341656 - type: nauc_map_at_1_max value: 9.583343414892674 - type: nauc_map_at_1_std value: 2.8889126256334383 - type: nauc_map_at_20_diff1 value: 15.522358238447422 - type: nauc_map_at_20_max value: 13.479963494201828 - type: nauc_map_at_20_std value: 8.76740668066124 - type: nauc_map_at_3_diff1 value: 18.748084536735927 - type: nauc_map_at_3_max value: 10.620059279509105 - type: nauc_map_at_3_std value: 4.337679139867589 - type: nauc_map_at_5_diff1 value: 17.345202973256 - type: nauc_map_at_5_max value: 12.452658321525504 - type: nauc_map_at_5_std value: 5.549910657395744 - type: nauc_mrr_at_1000_diff1 value: 15.377808587249769 - type: nauc_mrr_at_1000_max value: 10.04139543851182 - type: nauc_mrr_at_1000_std value: 5.4677890792436274 - type: nauc_mrr_at_100_diff1 value: 15.362987006646186 - type: nauc_mrr_at_100_max value: 10.041646833263774 - type: nauc_mrr_at_100_std value: 5.45421536846783 - type: nauc_mrr_at_10_diff1 value: 15.195360862950183 - type: nauc_mrr_at_10_max value: 9.93445070582588 - type: nauc_mrr_at_10_std value: 5.052925884003134 - type: nauc_mrr_at_1_diff1 value: 20.78440492344873 - type: nauc_mrr_at_1_max value: 9.65366117965217 - type: nauc_mrr_at_1_std value: 3.4370160103187177 - type: nauc_mrr_at_20_diff1 value: 15.367072076987753 - type: nauc_mrr_at_20_max value: 9.944084606452824 - type: nauc_mrr_at_20_std value: 5.1697642130127885 - type: nauc_mrr_at_3_diff1 value: 17.1065083677322 - type: nauc_mrr_at_3_max 
value: 9.730529319874428 - type: nauc_mrr_at_3_std value: 4.274768582707443 - type: nauc_mrr_at_5_diff1 value: 15.781360738081599 - type: nauc_mrr_at_5_max value: 10.189809550324469 - type: nauc_mrr_at_5_std value: 4.45427477219345 - type: nauc_ndcg_at_1000_diff1 value: 12.133137994513579 - type: nauc_ndcg_at_1000_max value: 14.593507049508561 - type: nauc_ndcg_at_1000_std value: 17.11300477285902 - type: nauc_ndcg_at_100_diff1 value: 12.768847933024317 - type: nauc_ndcg_at_100_max value: 13.62157103798925 - type: nauc_ndcg_at_100_std value: 13.97874886533375 - type: nauc_ndcg_at_10_diff1 value: 13.192522371369787 - type: nauc_ndcg_at_10_max value: 12.795709547611608 - type: nauc_ndcg_at_10_std value: 8.102799683454048 - type: nauc_ndcg_at_1_diff1 value: 20.78440492344873 - type: nauc_ndcg_at_1_max value: 9.65366117965217 - type: nauc_ndcg_at_1_std value: 3.4370160103187177 - type: nauc_ndcg_at_20_diff1 value: 13.10893336294196 - type: nauc_ndcg_at_20_max value: 12.87552853654183 - type: nauc_ndcg_at_20_std value: 10.673587471258529 - type: nauc_ndcg_at_3_diff1 value: 17.44757983297746 - type: nauc_ndcg_at_3_max value: 10.4479529428812 - type: nauc_ndcg_at_3_std value: 4.926065165471736 - type: nauc_ndcg_at_5_diff1 value: 15.131431597511005 - type: nauc_ndcg_at_5_max value: 12.138370476656045 - type: nauc_ndcg_at_5_std value: 5.747804810875746 - type: nauc_precision_at_1000_diff1 value: 4.651545309113199 - type: nauc_precision_at_1000_max value: 14.534556833197726 - type: nauc_precision_at_1000_std value: 25.883957300866957 - type: nauc_precision_at_100_diff1 value: 8.103597756413784 - type: nauc_precision_at_100_max value: 13.914816649477062 - type: nauc_precision_at_100_std value: 20.148598895345536 - type: nauc_precision_at_10_diff1 value: 8.606065646275212 - type: nauc_precision_at_10_max value: 14.068776248492663 - type: nauc_precision_at_10_std value: 11.140890379112346 - type: nauc_precision_at_1_diff1 value: 20.78440492344873 - type: nauc_precision_at_1_max value: 9.65366117965217 - type: nauc_precision_at_1_std value: 3.4370160103187177 - type: nauc_precision_at_20_diff1 value: 8.704973032555928 - type: nauc_precision_at_20_max value: 13.437392449115665 - type: nauc_precision_at_20_std value: 15.65525714739556 - type: nauc_precision_at_3_diff1 value: 15.796711189581933 - type: nauc_precision_at_3_max value: 10.514163928603118 - type: nauc_precision_at_3_std value: 5.788980186693269 - type: nauc_precision_at_5_diff1 value: 11.878373012657411 - type: nauc_precision_at_5_max value: 13.465410920052506 - type: nauc_precision_at_5_std value: 7.369374260570812 - type: nauc_recall_at_1000_diff1 value: 4.54914455375335 - type: nauc_recall_at_1000_max value: 15.398087677716521 - type: nauc_recall_at_1000_std value: 25.99787873557512 - type: nauc_recall_at_100_diff1 value: 7.937303192890431 - type: nauc_recall_at_100_max value: 14.280466786048457 - type: nauc_recall_at_100_std value: 19.989053944649168 - type: nauc_recall_at_10_diff1 value: 8.569047949172177 - type: nauc_recall_at_10_max value: 13.885951056418197 - type: nauc_recall_at_10_std value: 10.963367786952073 - type: nauc_recall_at_1_diff1 value: 20.822383776341656 - type: nauc_recall_at_1_max value: 9.583343414892674 - type: nauc_recall_at_1_std value: 2.8889126256334383 - type: nauc_recall_at_20_diff1 value: 8.683232232799698 - type: nauc_recall_at_20_max value: 13.336768111236735 - type: nauc_recall_at_20_std value: 15.457170894067298 - type: nauc_recall_at_3_diff1 value: 15.745448840185977 - type: nauc_recall_at_3_max value: 
10.317079087586992 - type: nauc_recall_at_3_std value: 5.450728079255462 - type: nauc_recall_at_5_diff1 value: 11.800239024102154 - type: nauc_recall_at_5_max value: 13.175274608964674 - type: nauc_recall_at_5_std value: 7.016480519402965 - type: ndcg_at_1 value: 11.799999999999999 - type: ndcg_at_10 value: 9.668000000000001 - type: ndcg_at_100 value: 15.015999999999998 - type: ndcg_at_1000 value: 20.015 - type: ndcg_at_20 value: 11.436 - type: ndcg_at_3 value: 8.924 - type: ndcg_at_5 value: 7.911 - type: precision_at_1 value: 11.799999999999999 - type: precision_at_10 value: 5.050000000000001 - type: precision_at_100 value: 1.291 - type: precision_at_1000 value: 0.251 - type: precision_at_20 value: 3.56 - type: precision_at_3 value: 8.133 - type: precision_at_5 value: 6.88 - type: recall_at_1 value: 2.395 - type: recall_at_10 value: 10.232 - type: recall_at_100 value: 26.172 - type: recall_at_1000 value: 50.917 - type: recall_at_20 value: 14.421999999999999 - type: recall_at_3 value: 4.935 - type: recall_at_5 value: 6.973 task: type: Retrieval - dataset: config: default name: MTEB SICK-R (default) revision: 20a6d6f312dd54037fe07a32d58e5e168867909d split: test type: mteb/sickr-sts metrics: - type: cosine_pearson value: 73.8523071648734 - type: cosine_spearman value: 65.43442849067297 - type: euclidean_pearson value: 66.70464173822097 - type: euclidean_spearman value: 60.82604439637834 - type: main_score value: 65.43442849067297 - type: manhattan_pearson value: 66.58172841322595 - type: manhattan_spearman value: 61.202424661616796 - type: pearson value: 73.8523071648734 - type: spearman value: 65.43442849067297 task: type: STS - dataset: config: default name: MTEB STS12 (default) revision: a0d554a64d88156834ff5ae9920b964011b16384 split: test type: mteb/sts12-sts metrics: - type: cosine_pearson value: 66.23949905692108 - type: cosine_spearman value: 59.97334423570035 - type: euclidean_pearson value: 53.93367474754671 - type: euclidean_spearman value: 49.65643891073131 - type: main_score value: 59.97334423570035 - type: manhattan_pearson value: 52.50090747870868 - type: manhattan_spearman value: 48.726772969833064 - type: pearson value: 66.23949905692108 - type: spearman value: 59.97334423570035 task: type: STS - dataset: config: default name: MTEB STS13 (default) revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca split: test type: mteb/sts13-sts metrics: - type: cosine_pearson value: 70.87351220452432 - type: cosine_spearman value: 71.81863685179427 - type: euclidean_pearson value: 59.249945757203946 - type: euclidean_spearman value: 60.053057494316796 - type: main_score value: 71.81863685179427 - type: manhattan_pearson value: 59.798731614026714 - type: manhattan_spearman value: 60.31075071097369 - type: pearson value: 70.87351220452432 - type: spearman value: 71.81863685179427 task: type: STS - dataset: config: default name: MTEB STS14 (default) revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 split: test type: mteb/sts14-sts metrics: - type: cosine_pearson value: 69.03600787240593 - type: cosine_spearman value: 66.99860396187162 - type: euclidean_pearson value: 58.61094669791067 - type: euclidean_spearman value: 58.286341788544995 - type: main_score value: 66.99860396187162 - type: manhattan_pearson value: 58.665872206618964 - type: manhattan_spearman value: 58.30408154246083 - type: pearson value: 69.03600787240593 - type: spearman value: 66.99860396187162 task: type: STS - dataset: config: default name: MTEB STS15 (default) revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 split: 
test type: mteb/sts15-sts metrics: - type: cosine_pearson value: 74.45269985909863 - type: cosine_spearman value: 75.4907813361932 - type: euclidean_pearson value: 58.68237542933832 - type: euclidean_spearman value: 61.08891047408572 - type: main_score value: 75.4907813361932 - type: manhattan_pearson value: 59.32028954908928 - type: manhattan_spearman value: 61.38980243849822 - type: pearson value: 74.45269985909863 - type: spearman value: 75.4907813361932 task: type: STS - dataset: config: default name: MTEB STS16 (default) revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 split: test type: mteb/sts16-sts metrics: - type: cosine_pearson value: 64.2309456558779 - type: cosine_spearman value: 66.97205823920407 - type: euclidean_pearson value: 52.471209393825134 - type: euclidean_spearman value: 55.05667213079255 - type: main_score value: 66.97205823920407 - type: manhattan_pearson value: 52.4566691722933 - type: manhattan_spearman value: 54.98149865449457 - type: pearson value: 64.2309456558779 - type: spearman value: 66.97205823920407 task: type: STS - dataset: config: en-de name: MTEB STS17 (en-de) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 21.06202710190164 - type: cosine_spearman value: 18.26963771909619 - type: euclidean_pearson value: -10.937704538162821 - type: euclidean_spearman value: -13.838045200730331 - type: main_score value: 18.26963771909619 - type: manhattan_pearson value: -9.194548970239005 - type: manhattan_spearman value: -12.642533487235347 - type: pearson value: 21.06202710190164 - type: spearman value: 18.26963771909619 task: type: STS - dataset: config: es-en name: MTEB STS17 (es-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 9.974655940103192 - type: cosine_spearman value: 6.625332823012507 - type: euclidean_pearson value: -6.193994464373409 - type: euclidean_spearman value: -13.09777719442545 - type: main_score value: 6.625332823012507 - type: manhattan_pearson value: -7.596649200902214 - type: manhattan_spearman value: -14.341067466786914 - type: pearson value: 9.974655940103192 - type: spearman value: 6.625332823012507 task: type: STS - dataset: config: en-ar name: MTEB STS17 (en-ar) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 3.939829923076509 - type: cosine_spearman value: 1.5988688581594497 - type: euclidean_pearson value: -10.456279294578557 - type: euclidean_spearman value: -9.811244215059508 - type: main_score value: 1.5988688581594497 - type: manhattan_pearson value: -10.913654400994407 - type: manhattan_spearman value: -8.604616012491228 - type: pearson value: 3.939829923076509 - type: spearman value: 1.5988688581594497 task: type: STS - dataset: config: it-en name: MTEB STS17 (it-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 17.28499679216241 - type: cosine_spearman value: 14.621483811474079 - type: euclidean_pearson value: -16.874097134885233 - type: euclidean_spearman value: -16.68311783384881 - type: main_score value: 14.621483811474079 - type: manhattan_pearson value: -17.639738926102574 - type: manhattan_spearman value: -16.66416708388087 - type: pearson value: 17.28499679216241 - type: spearman value: 14.621483811474079 task: type: STS - dataset: config: en-en name: MTEB STS17 
(en-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 78.99251283215277 - type: cosine_spearman value: 80.61049377743727 - type: euclidean_pearson value: 66.17827666954877 - type: euclidean_spearman value: 67.45271515314245 - type: main_score value: 80.61049377743727 - type: manhattan_pearson value: 66.23284409257823 - type: manhattan_spearman value: 67.666247437264 - type: pearson value: 78.99251283215277 - type: spearman value: 80.61049377743727 task: type: STS - dataset: config: en-tr name: MTEB STS17 (en-tr) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: -1.931391285281735 - type: cosine_spearman value: -3.321078837897458 - type: euclidean_pearson value: -21.683857378409378 - type: euclidean_spearman value: -24.244038106560804 - type: main_score value: -3.321078837897458 - type: manhattan_pearson value: -22.19415161015049 - type: manhattan_spearman value: -22.71872700697092 - type: pearson value: -1.931391285281735 - type: spearman value: -3.321078837897458 task: type: STS - dataset: config: nl-en name: MTEB STS17 (nl-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 21.215714201927316 - type: cosine_spearman value: 16.647983989080657 - type: euclidean_pearson value: -17.529579365480654 - type: euclidean_spearman value: -17.98599150405874 - type: main_score value: 16.647983989080657 - type: manhattan_pearson value: -17.041217222851987 - type: manhattan_spearman value: -17.099688376247617 - type: pearson value: 21.215714201927316 - type: spearman value: 16.647983989080657 task: type: STS - dataset: config: fr-en name: MTEB STS17 (fr-en) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 25.55717236376004 - type: cosine_spearman value: 21.120437860825668 - type: euclidean_pearson value: -13.532867255677811 - type: euclidean_spearman value: -14.067414622756136 - type: main_score value: 21.120437860825668 - type: manhattan_pearson value: -14.812251264524642 - type: manhattan_spearman value: -14.777202854314126 - type: pearson value: 25.55717236376004 - type: spearman value: 21.120437860825668 task: type: STS - dataset: config: en name: MTEB STS22 (en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 45.445485581559176 - type: cosine_spearman value: 57.81995941896327 - type: euclidean_pearson value: 46.45758835829159 - type: euclidean_spearman value: 57.15291591278634 - type: main_score value: 57.81995941896327 - type: manhattan_pearson value: 45.38976415067536 - type: manhattan_spearman value: 56.412461810883244 - type: pearson value: 45.445485581559176 - type: spearman value: 57.81995941896327 task: type: STS - dataset: config: es-en name: MTEB STS22 (es-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 9.618696238808342 - type: cosine_spearman value: 11.05047267189447 - type: euclidean_pearson value: 10.475166065910297 - type: euclidean_spearman value: 11.515497306325212 - type: main_score value: 11.05047267189447 - type: manhattan_pearson value: 11.677707905016238 - type: manhattan_spearman value: 13.47068609853333 - type: pearson value: 9.618696238808342 - type: 
spearman value: 11.05047267189447 task: type: STS - dataset: config: pl-en name: MTEB STS22 (pl-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 9.219640350559175 - type: cosine_spearman value: 15.424812621979203 - type: euclidean_pearson value: 27.079648075136692 - type: euclidean_spearman value: 15.127881072012025 - type: main_score value: 15.424812621979203 - type: manhattan_pearson value: 29.948405026370768 - type: manhattan_spearman value: 11.450097312769431 - type: pearson value: 9.219640350559175 - type: spearman value: 15.424812621979203 task: type: STS - dataset: config: zh-en name: MTEB STS22 (zh-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 2.016891027432069 - type: cosine_spearman value: 9.065694923749145 - type: euclidean_pearson value: -0.2317575485284492 - type: euclidean_spearman value: 1.478447144326562 - type: main_score value: 9.065694923749145 - type: manhattan_pearson value: 1.2210552984769953 - type: manhattan_spearman value: 1.0797490938939034 - type: pearson value: 2.016891027432069 - type: spearman value: 9.065694923749145 task: type: STS - dataset: config: de-en name: MTEB STS22 (de-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 20.30265778022666 - type: cosine_spearman value: 27.04088495025885 - type: euclidean_pearson value: 21.92624711333554 - type: euclidean_spearman value: 30.314966090982715 - type: main_score value: 27.04088495025885 - type: manhattan_pearson value: 22.449954374970556 - type: manhattan_spearman value: 33.98792612061501 - type: pearson value: 20.30265778022666 - type: spearman value: 27.04088495025885 task: type: STS - dataset: config: default name: MTEB STSBenchmark (default) revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 split: test type: mteb/stsbenchmark-sts metrics: - type: cosine_pearson value: 67.58098869120114 - type: cosine_spearman value: 67.2453123773366 - type: euclidean_pearson value: 58.23603604808463 - type: euclidean_spearman value: 58.623631847217 - type: main_score value: 67.2453123773366 - type: manhattan_pearson value: 58.368136302971195 - type: manhattan_spearman value: 58.837841919175105 - type: pearson value: 67.58098869120114 - type: spearman value: 67.2453123773366 task: type: STS - dataset: config: default name: MTEB SciDocsRR (default) revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab split: test type: mteb/scidocs-reranking metrics: - type: main_score value: 68.53428785087402 - type: map value: 68.53428785087402 - type: mrr value: 88.53875880836665 - type: nAUC_map_diff1 value: 11.778449408360105 - type: nAUC_map_max value: 55.710378394122195 - type: nAUC_map_std value: 66.15614923206279 - type: nAUC_mrr_diff1 value: 47.35327285304558 - type: nAUC_mrr_max value: 74.15113781105075 - type: nAUC_mrr_std value: 70.40747046150474 task: type: Reranking - dataset: config: default name: MTEB SciFact (default) revision: 0228b52cf27578f30900b9e5271d331663a030d7 split: test type: mteb/scifact metrics: - type: main_score value: 44.018 - type: map_at_1 value: 30.778 - type: map_at_10 value: 39.095 - type: map_at_100 value: 40.136 - type: map_at_1000 value: 40.19 - type: map_at_20 value: 39.695 - type: map_at_3 value: 36.25 - type: map_at_5 value: 37.942 - type: mrr_at_1 value: 32.33333333333333 - type: mrr_at_10 value: 40.46640211640211 - type: 
mrr_at_100 value: 41.3527413808237 - type: mrr_at_1000 value: 41.402308015811776 - type: mrr_at_20 value: 40.9920777608471 - type: mrr_at_3 value: 37.999999999999986 - type: mrr_at_5 value: 39.46666666666666 - type: nauc_map_at_1000_diff1 value: 51.57525678345129 - type: nauc_map_at_1000_max value: 35.72906391653508 - type: nauc_map_at_1000_std value: -1.672862325664642 - type: nauc_map_at_100_diff1 value: 51.57482414972323 - type: nauc_map_at_100_max value: 35.714681767398474 - type: nauc_map_at_100_std value: -1.6459806802624475 - type: nauc_map_at_10_diff1 value: 51.142890340689064 - type: nauc_map_at_10_max value: 35.78128552943207 - type: nauc_map_at_10_std value: -2.1957957240897907 - type: nauc_map_at_1_diff1 value: 57.59762900453854 - type: nauc_map_at_1_max value: 36.479602157030534 - type: nauc_map_at_1_std value: -4.834289532948042 - type: nauc_map_at_20_diff1 value: 51.47980323079124 - type: nauc_map_at_20_max value: 35.585900524174406 - type: nauc_map_at_20_std value: -1.7680354064625985 - type: nauc_map_at_3_diff1 value: 51.012766710346625 - type: nauc_map_at_3_max value: 34.8262662118054 - type: nauc_map_at_3_std value: -2.8168593560801045 - type: nauc_map_at_5_diff1 value: 50.836092917622864 - type: nauc_map_at_5_max value: 35.32174769825645 - type: nauc_map_at_5_std value: -3.113242921586995 - type: nauc_mrr_at_1000_diff1 value: 53.10217120766699 - type: nauc_mrr_at_1000_max value: 37.46657201878918 - type: nauc_mrr_at_1000_std value: 1.9085047586195323 - type: nauc_mrr_at_100_diff1 value: 53.10038602820947 - type: nauc_mrr_at_100_max value: 37.461065885458225 - type: nauc_mrr_at_100_std value: 1.9403756850021763 - type: nauc_mrr_at_10_diff1 value: 52.71420660954082 - type: nauc_mrr_at_10_max value: 37.62806428278671 - type: nauc_mrr_at_10_std value: 1.9517437711674281 - type: nauc_mrr_at_1_diff1 value: 59.730007702616675 - type: nauc_mrr_at_1_max value: 38.85146416502298 - type: nauc_mrr_at_1_std value: -0.46260223776596965 - type: nauc_mrr_at_20_diff1 value: 53.041376670418906 - type: nauc_mrr_at_20_max value: 37.45508852907037 - type: nauc_mrr_at_20_std value: 1.9843723810434797 - type: nauc_mrr_at_3_diff1 value: 52.716388196194494 - type: nauc_mrr_at_3_max value: 36.76096106397856 - type: nauc_mrr_at_3_std value: 1.716782555536502 - type: nauc_mrr_at_5_diff1 value: 52.61598345028188 - type: nauc_mrr_at_5_max value: 37.26316036644959 - type: nauc_mrr_at_5_std value: 1.3757366695050894 - type: nauc_ndcg_at_1000_diff1 value: 51.342395628428314 - type: nauc_ndcg_at_1000_max value: 37.22548194348463 - type: nauc_ndcg_at_1000_std value: 1.6360986297119697 - type: nauc_ndcg_at_100_diff1 value: 51.12772923293346 - type: nauc_ndcg_at_100_max value: 37.08162525770745 - type: nauc_ndcg_at_100_std value: 2.1437445417460146 - type: nauc_ndcg_at_10_diff1 value: 49.48104920841383 - type: nauc_ndcg_at_10_max value: 36.98553295749576 - type: nauc_ndcg_at_10_std value: 0.7074029546666143 - type: nauc_ndcg_at_1_diff1 value: 59.730007702616675 - type: nauc_ndcg_at_1_max value: 38.85146416502298 - type: nauc_ndcg_at_1_std value: -0.46260223776596965 - type: nauc_ndcg_at_20_diff1 value: 50.63630218240983 - type: nauc_ndcg_at_20_max value: 36.29047254679528 - type: nauc_ndcg_at_20_std value: 1.3772144888034745 - type: nauc_ndcg_at_3_diff1 value: 49.382153963236625 - type: nauc_ndcg_at_3_max value: 35.22306811742639 - type: nauc_ndcg_at_3_std value: -0.8877334603608296 - type: nauc_ndcg_at_5_diff1 value: 49.05555691688766 - type: nauc_ndcg_at_5_max value: 36.00098364740635 - type: 
nauc_ndcg_at_5_std value: -1.5274960265115565 - type: nauc_precision_at_1000_diff1 value: 12.30933370851068 - type: nauc_precision_at_1000_max value: 24.80977336944425 - type: nauc_precision_at_1000_std value: 42.85052700690557 - type: nauc_precision_at_100_diff1 value: 26.185494481397587 - type: nauc_precision_at_100_max value: 31.155891382208928 - type: nauc_precision_at_100_std value: 35.608690885169295 - type: nauc_precision_at_10_diff1 value: 36.27376093062482 - type: nauc_precision_at_10_max value: 36.42692892209515 - type: nauc_precision_at_10_std value: 16.967432904462893 - type: nauc_precision_at_1_diff1 value: 59.730007702616675 - type: nauc_precision_at_1_max value: 38.85146416502298 - type: nauc_precision_at_1_std value: -0.46260223776596965 - type: nauc_precision_at_20_diff1 value: 37.622482136709785 - type: nauc_precision_at_20_max value: 31.21688679166065 - type: nauc_precision_at_20_std value: 23.221017808713682 - type: nauc_precision_at_3_diff1 value: 42.340206572143984 - type: nauc_precision_at_3_max value: 36.3442813514268 - type: nauc_precision_at_3_std value: 7.592922050055632 - type: nauc_precision_at_5_diff1 value: 38.17808235542409 - type: nauc_precision_at_5_max value: 35.09801657302365 - type: nauc_precision_at_5_std value: 8.398007414457009 - type: nauc_recall_at_1000_diff1 value: 55.841144651529085 - type: nauc_recall_at_1000_max value: 56.572722198749226 - type: nauc_recall_at_1000_std value: 31.84957409406956 - type: nauc_recall_at_100_diff1 value: 48.328441413096336 - type: nauc_recall_at_100_max value: 42.071227967505166 - type: nauc_recall_at_100_std value: 18.845456547380337 - type: nauc_recall_at_10_diff1 value: 42.32690986833832 - type: nauc_recall_at_10_max value: 38.657602228864995 - type: nauc_recall_at_10_std value: 5.742422923256993 - type: nauc_recall_at_1_diff1 value: 57.59762900453854 - type: nauc_recall_at_1_max value: 36.479602157030534 - type: nauc_recall_at_1_std value: -4.834289532948042 - type: nauc_recall_at_20_diff1 value: 46.280085660215995 - type: nauc_recall_at_20_max value: 35.65299771551237 - type: nauc_recall_at_20_std value: 8.057327587598591 - type: nauc_recall_at_3_diff1 value: 42.84012935628984 - type: nauc_recall_at_3_max value: 33.69290527723077 - type: nauc_recall_at_3_std value: -0.9503712670051102 - type: nauc_recall_at_5_diff1 value: 42.1137382698146 - type: nauc_recall_at_5_max value: 36.12494070598603 - type: nauc_recall_at_5_std value: -1.394936950543654 - type: ndcg_at_1 value: 32.333 - type: ndcg_at_10 value: 44.018 - type: ndcg_at_100 value: 49.089 - type: ndcg_at_1000 value: 50.651 - type: ndcg_at_20 value: 46.089 - type: ndcg_at_3 value: 38.499 - type: ndcg_at_5 value: 41.297 - type: precision_at_1 value: 32.333 - type: precision_at_10 value: 6.4 - type: precision_at_100 value: 0.923 - type: precision_at_1000 value: 0.106 - type: precision_at_20 value: 3.6670000000000003 - type: precision_at_3 value: 15.443999999999999 - type: precision_at_5 value: 10.867 - type: recall_at_1 value: 30.778 - type: recall_at_10 value: 57.99999999999999 - type: recall_at_100 value: 81.722 - type: recall_at_1000 value: 94.033 - type: recall_at_20 value: 66.02799999999999 - type: recall_at_3 value: 43.056 - type: recall_at_5 value: 49.694 task: type: Retrieval - dataset: config: default name: MTEB SprintDuplicateQuestions (default) revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 split: test type: mteb/sprintduplicatequestions-pairclassification metrics: - type: cosine_accuracy value: 99.6 - type: cosine_accuracy_threshold value: 
72.43388891220093 - type: cosine_ap value: 85.05626292429993 - type: cosine_f1 value: 78.94211576846308 - type: cosine_f1_threshold value: 70.86913585662842 - type: cosine_precision value: 78.78486055776892 - type: cosine_recall value: 79.10000000000001 - type: dot_accuracy value: 99.06534653465346 - type: dot_accuracy_threshold value: 76633.75244140625 - type: dot_ap value: 35.63520526748108 - type: dot_f1 value: 40.297274979355905 - type: dot_f1_threshold value: 46533.13903808594 - type: dot_precision value: 34.31786216596343 - type: dot_recall value: 48.8 - type: euclidean_accuracy value: 99.38217821782177 - type: euclidean_accuracy_threshold value: 1529.2129516601562 - type: euclidean_ap value: 65.66713048050076 - type: euclidean_f1 value: 63.702056698165656 - type: euclidean_f1_threshold value: 1659.9403381347656 - type: euclidean_precision value: 71.71464330413016 - type: euclidean_recall value: 57.3 - type: main_score value: 85.05626292429993 - type: manhattan_accuracy value: 99.36633663366337 - type: manhattan_accuracy_threshold value: 19134.791564941406 - type: manhattan_ap value: 64.327573756549 - type: manhattan_f1 value: 62.878385554965476 - type: manhattan_f1_threshold value: 20997.62725830078 - type: manhattan_precision value: 67.04416761041902 - type: manhattan_recall value: 59.199999999999996 - type: max_accuracy value: 99.6 - type: max_ap value: 85.05626292429993 - type: max_f1 value: 78.94211576846308 - type: max_precision value: 78.78486055776892 - type: max_recall value: 79.10000000000001 - type: similarity_accuracy value: 99.6 - type: similarity_accuracy_threshold value: 72.43388891220093 - type: similarity_ap value: 85.05626292429993 - type: similarity_f1 value: 78.94211576846308 - type: similarity_f1_threshold value: 70.86913585662842 - type: similarity_precision value: 78.78486055776892 - type: similarity_recall value: 79.10000000000001 task: type: PairClassification - dataset: config: default name: MTEB StackExchangeClustering (default) revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 split: test type: mteb/stackexchange-clustering metrics: - type: main_score value: 33.04088699016667 - type: v_measure value: 33.04088699016667 - type: v_measure_std value: 4.201419342997424 task: type: Clustering - dataset: config: default name: MTEB StackExchangeClusteringP2P (default) revision: 815ca46b2622cec33ccafc3735d572c266efdb44 split: test type: mteb/stackexchange-clustering-p2p metrics: - type: main_score value: 27.79227103935552 - type: v_measure value: 27.79227103935552 - type: v_measure_std value: 1.6306895991356034 task: type: Clustering - dataset: config: default name: MTEB StackOverflowDupQuestions (default) revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 split: test type: mteb/stackoverflowdupquestions-reranking metrics: - type: main_score value: 43.37562407771596 - type: map value: 43.37562407771596 - type: mrr value: 43.95843943638062 - type: nAUC_map_diff1 value: 35.17057785776578 - type: nAUC_map_max value: 16.895292109117968 - type: nAUC_map_std value: 7.566837158800999 - type: nAUC_mrr_diff1 value: 34.529930093774155 - type: nAUC_mrr_max value: 17.875421743140148 - type: nAUC_mrr_std value: 8.16194884246291 task: type: Reranking - dataset: config: default name: MTEB SummEval (default) revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c split: test type: mteb/summeval metrics: - type: cosine_pearson value: 29.667795250962197 - type: cosine_spearman value: 29.280803143378677 - type: dot_pearson value: 17.20848486618972 - type: dot_spearman value: 
19.642791960809518 - type: main_score value: 29.280803143378677 - type: pearson value: 29.667795250962197 - type: spearman value: 29.280803143378677 task: type: Summarization - dataset: config: default name: MTEB TRECCOVID (default) revision: bb9466bac8153a0349341eb1b22e06409e78ef4e split: test type: mteb/trec-covid metrics: - type: main_score value: 47.015 - type: map_at_1 value: 0.11299999999999999 - type: map_at_10 value: 0.924 - type: map_at_100 value: 4.172 - type: map_at_1000 value: 9.794 - type: map_at_20 value: 1.512 - type: map_at_3 value: 0.32299999999999995 - type: map_at_5 value: 0.5349999999999999 - type: mrr_at_1 value: 54.0 - type: mrr_at_10 value: 64.37222222222222 - type: mrr_at_100 value: 64.95440794499618 - type: mrr_at_1000 value: 64.95440794499618 - type: mrr_at_20 value: 64.79285714285714 - type: mrr_at_3 value: 61.0 - type: mrr_at_5 value: 62.9 - type: nauc_map_at_1000_diff1 value: 5.391181504174254 - type: nauc_map_at_1000_max value: 48.53906859573933 - type: nauc_map_at_1000_std value: 58.77913245945572 - type: nauc_map_at_100_diff1 value: 5.602676644566584 - type: nauc_map_at_100_max value: 30.35986103902266 - type: nauc_map_at_100_std value: 43.61342447615204 - type: nauc_map_at_10_diff1 value: 11.168677765044714 - type: nauc_map_at_10_max value: 12.615876642210566 - type: nauc_map_at_10_std value: 15.487673375733934 - type: nauc_map_at_1_diff1 value: 13.856607126355705 - type: nauc_map_at_1_max value: 2.1470727276166315 - type: nauc_map_at_1_std value: 13.755038114656543 - type: nauc_map_at_20_diff1 value: 9.278354233919723 - type: nauc_map_at_20_max value: 14.549895562986578 - type: nauc_map_at_20_std value: 21.58014466138326 - type: nauc_map_at_3_diff1 value: 17.476371244979568 - type: nauc_map_at_3_max value: 5.336749157036172 - type: nauc_map_at_3_std value: 13.60030032869252 - type: nauc_map_at_5_diff1 value: 18.159708091961715 - type: nauc_map_at_5_max value: 5.5023295542724195 - type: nauc_map_at_5_std value: 13.464524190505264 - type: nauc_mrr_at_1000_diff1 value: 24.183591049739295 - type: nauc_mrr_at_1000_max value: 23.244935337421687 - type: nauc_mrr_at_1000_std value: 36.76491491232038 - type: nauc_mrr_at_100_diff1 value: 24.183591049739295 - type: nauc_mrr_at_100_max value: 23.244935337421687 - type: nauc_mrr_at_100_std value: 36.76491491232038 - type: nauc_mrr_at_10_diff1 value: 25.116993699935996 - type: nauc_mrr_at_10_max value: 23.996446760940472 - type: nauc_mrr_at_10_std value: 36.661108373978486 - type: nauc_mrr_at_1_diff1 value: 22.46394932066349 - type: nauc_mrr_at_1_max value: 17.99338723569777 - type: nauc_mrr_at_1_std value: 31.805173515601105 - type: nauc_mrr_at_20_diff1 value: 24.29457665863037 - type: nauc_mrr_at_20_max value: 23.511208714905433 - type: nauc_mrr_at_20_std value: 37.03779743443747 - type: nauc_mrr_at_3_diff1 value: 21.325058136848703 - type: nauc_mrr_at_3_max value: 25.498590855189146 - type: nauc_mrr_at_3_std value: 35.28303533385696 - type: nauc_mrr_at_5_diff1 value: 23.91581725239823 - type: nauc_mrr_at_5_max value: 21.88399789010818 - type: nauc_mrr_at_5_std value: 37.46999023019008 - type: nauc_ndcg_at_1000_diff1 value: 3.7557778508958846 - type: nauc_ndcg_at_1000_max value: 40.346503557806564 - type: nauc_ndcg_at_1000_std value: 50.92180253083818 - type: nauc_ndcg_at_100_diff1 value: 11.758581771303305 - type: nauc_ndcg_at_100_max value: 35.16894818233675 - type: nauc_ndcg_at_100_std value: 47.424485591389114 - type: nauc_ndcg_at_10_diff1 value: 12.849993798661563 - type: nauc_ndcg_at_10_max value: 
30.851313506820976 - type: nauc_ndcg_at_10_std value: 36.943619057267505 - type: nauc_ndcg_at_1_diff1 value: 11.113346207488473 - type: nauc_ndcg_at_1_max value: 15.184797768479774 - type: nauc_ndcg_at_1_std value: 27.52387082931017 - type: nauc_ndcg_at_20_diff1 value: 12.331028684560186 - type: nauc_ndcg_at_20_max value: 28.893165127974708 - type: nauc_ndcg_at_20_std value: 39.097000545114646 - type: nauc_ndcg_at_3_diff1 value: 15.782271186947469 - type: nauc_ndcg_at_3_max value: 23.91790545249963 - type: nauc_ndcg_at_3_std value: 34.87568041720673 - type: nauc_ndcg_at_5_diff1 value: 14.306657014965335 - type: nauc_ndcg_at_5_max value: 24.92679497185896 - type: nauc_ndcg_at_5_std value: 35.14072395767764 - type: nauc_precision_at_1000_diff1 value: 9.698627632231533 - type: nauc_precision_at_1000_max value: 43.62044953565815 - type: nauc_precision_at_1000_std value: 54.089192302090495 - type: nauc_precision_at_100_diff1 value: 11.799461882261514 - type: nauc_precision_at_100_max value: 36.87868882997057 - type: nauc_precision_at_100_std value: 51.09246667126284 - type: nauc_precision_at_10_diff1 value: 13.170655404348533 - type: nauc_precision_at_10_max value: 38.227922901784936 - type: nauc_precision_at_10_std value: 40.51375636546919 - type: nauc_precision_at_1_diff1 value: 22.46394932066349 - type: nauc_precision_at_1_max value: 17.99338723569777 - type: nauc_precision_at_1_std value: 31.805173515601105 - type: nauc_precision_at_20_diff1 value: 13.020942321118012 - type: nauc_precision_at_20_max value: 32.76679746744021 - type: nauc_precision_at_20_std value: 43.375734018262754 - type: nauc_precision_at_3_diff1 value: 22.36277013079758 - type: nauc_precision_at_3_max value: 29.14917970240368 - type: nauc_precision_at_3_std value: 38.40675412594522 - type: nauc_precision_at_5_diff1 value: 20.38016205233649 - type: nauc_precision_at_5_max value: 28.40199750312108 - type: nauc_precision_at_5_std value: 37.658196861765916 - type: nauc_recall_at_1000_diff1 value: -1.8797682238301674 - type: nauc_recall_at_1000_max value: 40.00611463779723 - type: nauc_recall_at_1000_std value: 50.00277798847854 - type: nauc_recall_at_100_diff1 value: 5.570829659209835 - type: nauc_recall_at_100_max value: 21.511683158026184 - type: nauc_recall_at_100_std value: 37.17966017860592 - type: nauc_recall_at_10_diff1 value: 5.649731119631445 - type: nauc_recall_at_10_max value: 12.690473408729572 - type: nauc_recall_at_10_std value: 8.697137776280309 - type: nauc_recall_at_1_diff1 value: 13.856607126355705 - type: nauc_recall_at_1_max value: 2.1470727276166315 - type: nauc_recall_at_1_std value: 13.755038114656543 - type: nauc_recall_at_20_diff1 value: 8.149753992066595 - type: nauc_recall_at_20_max value: 8.365030917145909 - type: nauc_recall_at_20_std value: 15.05385058373975 - type: nauc_recall_at_3_diff1 value: 16.664831204533417 - type: nauc_recall_at_3_max value: 4.9075975386189015 - type: nauc_recall_at_3_std value: 11.436115039116913 - type: nauc_recall_at_5_diff1 value: 17.863326487393323 - type: nauc_recall_at_5_max value: 0.04244496355094046 - type: nauc_recall_at_5_std value: 8.039336595643896 - type: ndcg_at_1 value: 48.0 - type: ndcg_at_10 value: 47.015 - type: ndcg_at_100 value: 31.857999999999997 - type: ndcg_at_1000 value: 27.142 - type: ndcg_at_20 value: 43.162 - type: ndcg_at_3 value: 49.123 - type: ndcg_at_5 value: 49.425999999999995 - type: precision_at_1 value: 54.0 - type: precision_at_10 value: 51.0 - type: precision_at_100 value: 32.56 - type: precision_at_1000 value: 13.072000000000001 - 
type: precision_at_20 value: 45.9 - type: precision_at_3 value: 54.0 - type: precision_at_5 value: 55.2 - type: recall_at_1 value: 0.11299999999999999 - type: recall_at_10 value: 1.162 - type: recall_at_100 value: 6.809 - type: recall_at_1000 value: 25.805 - type: recall_at_20 value: 2.051 - type: recall_at_3 value: 0.35200000000000004 - type: recall_at_5 value: 0.618 task: type: Retrieval - dataset: config: default name: MTEB Touche2020 (default) revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f split: test type: mteb/touche2020 metrics: - type: main_score value: 12.417 - type: map_at_1 value: 1.2 - type: map_at_10 value: 4.376 - type: map_at_100 value: 7.161 - type: map_at_1000 value: 8.405 - type: map_at_20 value: 5.578 - type: map_at_3 value: 2.396 - type: map_at_5 value: 3.044 - type: mrr_at_1 value: 16.3265306122449 - type: mrr_at_10 value: 30.004859086491738 - type: mrr_at_100 value: 31.506819710420675 - type: mrr_at_1000 value: 31.52488003189439 - type: mrr_at_20 value: 31.07992314474907 - type: mrr_at_3 value: 24.489795918367346 - type: mrr_at_5 value: 27.857142857142854 - type: nauc_map_at_1000_diff1 value: -15.240085041163246 - type: nauc_map_at_1000_max value: -34.07491781069546 - type: nauc_map_at_1000_std value: -39.33676134505847 - type: nauc_map_at_100_diff1 value: -17.475590176275173 - type: nauc_map_at_100_max value: -36.27378611366948 - type: nauc_map_at_100_std value: -42.367310265458066 - type: nauc_map_at_10_diff1 value: -17.79313659611791 - type: nauc_map_at_10_max value: -30.930524152161155 - type: nauc_map_at_10_std value: -37.96490423161143 - type: nauc_map_at_1_diff1 value: -20.304167493996196 - type: nauc_map_at_1_max value: -34.39784658467407 - type: nauc_map_at_1_std value: -34.8048180060142 - type: nauc_map_at_20_diff1 value: -19.601011957021058 - type: nauc_map_at_20_max value: -36.19251563365872 - type: nauc_map_at_20_std value: -41.872703350300306 - type: nauc_map_at_3_diff1 value: -18.604827557464603 - type: nauc_map_at_3_max value: -33.87036816368854 - type: nauc_map_at_3_std value: -37.87305582981634 - type: nauc_map_at_5_diff1 value: -19.000407560148222 - type: nauc_map_at_5_max value: -35.88105036080159 - type: nauc_map_at_5_std value: -39.89433800276062 - type: nauc_mrr_at_1000_diff1 value: -10.977908813445096 - type: nauc_mrr_at_1000_max value: -32.70254863800196 - type: nauc_mrr_at_1000_std value: -36.932750949391014 - type: nauc_mrr_at_100_diff1 value: -10.923380877501057 - type: nauc_mrr_at_100_max value: -32.61546764122419 - type: nauc_mrr_at_100_std value: -36.842894043351315 - type: nauc_mrr_at_10_diff1 value: -10.131576305498573 - type: nauc_mrr_at_10_max value: -31.890083580054764 - type: nauc_mrr_at_10_std value: -36.93266622814508 - type: nauc_mrr_at_1_diff1 value: -16.139790526714425 - type: nauc_mrr_at_1_max value: -29.900749975522345 - type: nauc_mrr_at_1_std value: -29.066801658151576 - type: nauc_mrr_at_20_diff1 value: -10.70805724526718 - type: nauc_mrr_at_20_max value: -32.340792705157114 - type: nauc_mrr_at_20_std value: -36.72547772593701 - type: nauc_mrr_at_3_diff1 value: -17.91765468161938 - type: nauc_mrr_at_3_max value: -32.241705526206275 - type: nauc_mrr_at_3_std value: -33.553729892050974 - type: nauc_mrr_at_5_diff1 value: -12.991140385709848 - type: nauc_mrr_at_5_max value: -33.87447283054401 - type: nauc_mrr_at_5_std value: -37.96193128324505 - type: nauc_ndcg_at_1000_diff1 value: 1.4521546341817582 - type: nauc_ndcg_at_1000_max value: -22.463819593958227 - type: nauc_ndcg_at_1000_std value: -27.617648672815875 - type: 
nauc_ndcg_at_100_diff1 value: -11.537693897677832 - type: nauc_ndcg_at_100_max value: -36.160393447246 - type: nauc_ndcg_at_100_std value: -44.05399962086289 - type: nauc_ndcg_at_10_diff1 value: -9.919400208671634 - type: nauc_ndcg_at_10_max value: -22.769115244797316 - type: nauc_ndcg_at_10_std value: -34.034353433778854 - type: nauc_ndcg_at_1_diff1 value: -17.822259770980857 - type: nauc_ndcg_at_1_max value: -26.332806784918134 - type: nauc_ndcg_at_1_std value: -26.435402666146484 - type: nauc_ndcg_at_20_diff1 value: -13.788195267001576 - type: nauc_ndcg_at_20_max value: -32.974957041119055 - type: nauc_ndcg_at_20_std value: -42.33157337528393 - type: nauc_ndcg_at_3_diff1 value: -16.223851866502706 - type: nauc_ndcg_at_3_max value: -26.2902601974522 - type: nauc_ndcg_at_3_std value: -32.304039646610335 - type: nauc_ndcg_at_5_diff1 value: -12.817036231720957 - type: nauc_ndcg_at_5_max value: -28.44642751642767 - type: nauc_ndcg_at_5_std value: -36.58899943553682 - type: nauc_precision_at_1000_diff1 value: 26.935463895508967 - type: nauc_precision_at_1000_max value: 46.72249889198106 - type: nauc_precision_at_1000_std value: 38.53058407998278 - type: nauc_precision_at_100_diff1 value: 4.163340339758862 - type: nauc_precision_at_100_max value: -10.581299020111306 - type: nauc_precision_at_100_std value: -29.038739456237955 - type: nauc_precision_at_10_diff1 value: 0.5857232239199855 - type: nauc_precision_at_10_max value: -12.365623679544461 - type: nauc_precision_at_10_std value: -29.949307140170728 - type: nauc_precision_at_1_diff1 value: -16.139790526714425 - type: nauc_precision_at_1_max value: -29.900749975522345 - type: nauc_precision_at_1_std value: -29.066801658151576 - type: nauc_precision_at_20_diff1 value: -7.74805679959642 - type: nauc_precision_at_20_max value: -25.268356658986903 - type: nauc_precision_at_20_std value: -37.758242471707966 - type: nauc_precision_at_3_diff1 value: -15.634998600034066 - type: nauc_precision_at_3_max value: -28.48849869574053 - type: nauc_precision_at_3_std value: -34.907495608911546 - type: nauc_precision_at_5_diff1 value: -8.48679992836417 - type: nauc_precision_at_5_max value: -29.707555980272975 - type: nauc_precision_at_5_std value: -40.733334704807156 - type: nauc_recall_at_1000_diff1 value: 8.826494916857577 - type: nauc_recall_at_1000_max value: -16.922331971426086 - type: nauc_recall_at_1000_std value: 1.4850859633484936 - type: nauc_recall_at_100_diff1 value: -12.650176624230422 - type: nauc_recall_at_100_max value: -40.574740215148125 - type: nauc_recall_at_100_std value: -40.52283965149714 - type: nauc_recall_at_10_diff1 value: -13.43480673345223 - type: nauc_recall_at_10_max value: -28.6156485981151 - type: nauc_recall_at_10_std value: -35.45555317207978 - type: nauc_recall_at_1_diff1 value: -20.304167493996196 - type: nauc_recall_at_1_max value: -34.39784658467407 - type: nauc_recall_at_1_std value: -34.8048180060142 - type: nauc_recall_at_20_diff1 value: -19.74246524681499 - type: nauc_recall_at_20_max value: -41.057831832815154 - type: nauc_recall_at_20_std value: -43.831099576419234 - type: nauc_recall_at_3_diff1 value: -22.564348397487556 - type: nauc_recall_at_3_max value: -35.421451948002236 - type: nauc_recall_at_3_std value: -36.72882367879091 - type: nauc_recall_at_5_diff1 value: -18.948821357059504 - type: nauc_recall_at_5_max value: -39.22248196683214 - type: nauc_recall_at_5_std value: -39.964758319612635 - type: ndcg_at_1 value: 14.285999999999998 - type: ndcg_at_10 value: 12.417 - type: ndcg_at_100 value: 21.564 - type: 
ndcg_at_1000 value: 34.264 - type: ndcg_at_20 value: 13.932 - type: ndcg_at_3 value: 13.997000000000002 - type: ndcg_at_5 value: 13.161999999999999 - type: precision_at_1 value: 16.326999999999998 - type: precision_at_10 value: 12.245000000000001 - type: precision_at_100 value: 5.163 - type: precision_at_1000 value: 1.304 - type: precision_at_20 value: 10.918 - type: precision_at_3 value: 16.326999999999998 - type: precision_at_5 value: 14.285999999999998 - type: recall_at_1 value: 1.2 - type: recall_at_10 value: 8.763 - type: recall_at_100 value: 31.584 - type: recall_at_1000 value: 70.519 - type: recall_at_20 value: 14.379 - type: recall_at_3 value: 3.229 - type: recall_at_5 value: 5.079000000000001 task: type: Retrieval - dataset: config: default name: MTEB ToxicConversationsClassification (default) revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de split: test type: mteb/toxic_conversations_50k metrics: - type: accuracy value: 63.04199218749999 - type: ap value: 10.379917199607485 - type: ap_weighted value: 10.379917199607485 - type: f1 value: 47.876568123841864 - type: f1_weighted value: 71.2370937104015 - type: main_score value: 63.04199218749999 task: type: Classification - dataset: config: default name: MTEB TweetSentimentExtractionClassification (default) revision: d604517c81ca91fe16a244d1248fc021f9ecee7a split: test type: mteb/tweet_sentiment_extraction metrics: - type: accuracy value: 49.442558007923026 - type: f1 value: 49.60441043943531 - type: f1_weighted value: 48.96898929345838 - type: main_score value: 49.442558007923026 task: type: Classification - dataset: config: default name: MTEB TwentyNewsgroupsClustering (default) revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 split: test type: mteb/twentynewsgroups-clustering metrics: - type: main_score value: 21.127920450161458 - type: v_measure value: 21.127920450161458 - type: v_measure_std value: 1.5027840050520012 task: type: Clustering - dataset: config: default name: MTEB TwitterSemEval2015 (default) revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 split: test type: mteb/twittersemeval2015-pairclassification metrics: - type: cosine_accuracy value: 82.18394230196103 - type: cosine_accuracy_threshold value: 70.92341184616089 - type: cosine_ap value: 59.78262740579837 - type: cosine_f1 value: 56.536101934874935 - type: cosine_f1_threshold value: 63.08426856994629 - type: cosine_precision value: 51.13102859581733 - type: cosine_recall value: 63.21899736147757 - type: dot_accuracy value: 78.2559456398641 - type: dot_accuracy_threshold value: 75122.66235351562 - type: dot_ap value: 42.7554645305854 - type: dot_f1 value: 46.84298752095361 - type: dot_f1_threshold value: 47930.230712890625 - type: dot_precision value: 36.19746689694876 - type: dot_recall value: 66.35883905013192 - type: euclidean_accuracy value: 80.41962210168684 - type: euclidean_accuracy_threshold value: 2041.592025756836 - type: euclidean_ap value: 53.9382918676684 - type: euclidean_f1 value: 53.007111003977336 - type: euclidean_f1_threshold value: 2444.729995727539 - type: euclidean_precision value: 48.79076991346794 - type: euclidean_recall value: 58.02110817941952 - type: main_score value: 59.78262740579837 - type: manhattan_accuracy value: 80.65208320915539 - type: manhattan_accuracy_threshold value: 26017.153930664062 - type: manhattan_ap value: 54.628314460914396 - type: manhattan_f1 value: 53.78151260504202 - type: manhattan_f1_threshold value: 30961.737060546875 - type: manhattan_precision value: 47.208931419457734 - type: manhattan_recall value: 
62.48021108179419 - type: max_accuracy value: 82.18394230196103 - type: max_ap value: 59.78262740579837 - type: max_f1 value: 56.536101934874935 - type: max_precision value: 51.13102859581733 - type: max_recall value: 66.35883905013192 - type: similarity_accuracy value: 82.18394230196103 - type: similarity_accuracy_threshold value: 70.92341184616089 - type: similarity_ap value: 59.78262740579837 - type: similarity_f1 value: 56.536101934874935 - type: similarity_f1_threshold value: 63.08426856994629 - type: similarity_precision value: 51.13102859581733 - type: similarity_recall value: 63.21899736147757 task: type: PairClassification - dataset: config: default name: MTEB TwitterURLCorpus (default) revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf split: test type: mteb/twitterurlcorpus-pairclassification metrics: - type: cosine_accuracy value: 86.35269918888501 - type: cosine_accuracy_threshold value: 65.62063097953796 - type: cosine_ap value: 79.86337146522463 - type: cosine_f1 value: 72.03383314109958 - type: cosine_f1_threshold value: 62.217533588409424 - type: cosine_precision value: 71.93979419444018 - type: cosine_recall value: 72.12811826301201 - type: dot_accuracy value: 82.84045484534482 - type: dot_accuracy_threshold value: 35566.62902832031 - type: dot_ap value: 69.69127356271262 - type: dot_f1 value: 64.93162154619034 - type: dot_f1_threshold value: 28885.244750976562 - type: dot_precision value: 59.36463383516203 - type: dot_recall value: 71.65075454265477 - type: euclidean_accuracy value: 83.63022470601933 - type: euclidean_accuracy_threshold value: 1693.5848236083984 - type: euclidean_ap value: 71.73555972139718 - type: euclidean_f1 value: 63.8556476722812 - type: euclidean_f1_threshold value: 1923.9103317260742 - type: euclidean_precision value: 62.26497914990124 - type: euclidean_recall value: 65.52971974129966 - type: main_score value: 79.86337146522463 - type: manhattan_accuracy value: 83.70978383203322 - type: manhattan_accuracy_threshold value: 21348.568725585938 - type: manhattan_ap value: 72.01847359087003 - type: manhattan_f1 value: 64.34136401773942 - type: manhattan_f1_threshold value: 23113.516235351562 - type: manhattan_precision value: 66.8715222988124 - type: manhattan_recall value: 61.99568832768709 - type: max_accuracy value: 86.35269918888501 - type: max_ap value: 79.86337146522463 - type: max_f1 value: 72.03383314109958 - type: max_precision value: 71.93979419444018 - type: max_recall value: 72.12811826301201 - type: similarity_accuracy value: 86.35269918888501 - type: similarity_accuracy_threshold value: 65.62063097953796 - type: similarity_ap value: 79.86337146522463 - type: similarity_f1 value: 72.03383314109958 - type: similarity_f1_threshold value: 62.217533588409424 - type: similarity_precision value: 71.93979419444018 - type: similarity_recall value: 72.12811826301201 task: type: PairClassification model_name: minishlab/M2V_base_output tags: - embeddings - static-embeddings - mteb - sentence-transformers --- # minishlab/M2V_base_output Model Card This Model2Vec model is a distilled version of the baai/bge-base-en-v1.5 Sentence Transformer. It uses static embeddings, allowing text embeddings to be computed orders of magnitude faster on both GPU and CPU. It is designed for applications where computational resources are limited or where real-time performance is critical. 
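As a quick illustration, a minimal encoding sketch in Python, assuming the model2vec package exposes a `StaticModel.from_pretrained` loader and an `encode` method (names the card itself does not spell out):

```python
# Hedged sketch: load the distilled static-embedding model and encode text.
# `StaticModel.from_pretrained` and `encode` are assumed model2vec APIs,
# not confirmed by this card.
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/M2V_base_output")
embeddings = model.encode([
    "It's dangerous to go alone!",
    "It's a secret to everybody.",
])
print(embeddings.shape)  # one static vector per input sentence
```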
## Installation Install model2vec using pip: ## Usage Load this model using the method: Alternatively, you can distill your own model using the method: ## How it works Model2vec creates a small, fast, and powerful model that outperforms other static embedding models by a large margin on all tasks we could find, while being much faster to create than traditional static embedding models such as GloVe. Best of all, you don't need any data to distill a model using Model2Vec. It works by passing a vocabulary through a sentence transformer model, then reducing the dimensionality of the resulting embeddings using PCA, and finally weighting the embeddings using zipf weighting. During inference, we simply take the mean of all token embeddings occurring in a sentence. ## Additional Resources - All Model2Vec models on the hub - Model2Vec Repo - Model2Vec Results - Model2Vec Tutorials ## Library Authors Model2Vec was developed by the Minish Lab team consisting of Stephan Tulkens and Thomas van Dongen. ## Citation Please cite the Model2Vec repository if you use this model in your work.", + "model_explanation_gemini": "Performs classification tasks on text datasets like Amazon reviews and counterfactual analysis, achieving accuracy scores ranging from 32.59% to 71.10% across different benchmarks. \n\n**Features**: \n- Base model: BAAI/bge-base-en-v1.5 \n- Language: English \n- Tasks: Text classification (e.g., sentiment, counterfactual detection) \n- Benchmarks: MTEB (AmazonCounterfactualClassification, AmazonPolarity, AmazonReviews," +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-7B-Instruct-v0.1.json b/model_data_json/mistralai_Mistral-7B-Instruct-v0.1.json new file mode 100644 index 0000000000000000000000000000000000000000..67e20f359eb8bf04389f36cd9ac60baab24cd588 --- /dev/null +++ b/model_data_json/mistralai_Mistral-7B-Instruct-v0.1.json @@ -0,0 +1,23 @@ +{ + "model_id": "mistralai/Mistral-7B-Instruct-v0.1", + "downloads": 475839, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "mistral", + "text-generation", + "finetuned", + "conversational", + "arxiv:2310.06825", + "base_model:mistralai/Mistral-7B-v0.1", + "base_model:finetune:mistralai/Mistral-7B-v0.1", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - finetuned base_model: mistralai/Mistral-7B-v0.1 pipeline_tag: text-generation inference: true widget: - messages: - role: user content: What is your favorite condiment? extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-7B-Instruct-v0.1 ## Encode and Decode with ## Inference with ## Inference with hugging face > [!TIP] > PRs to correct the tokenizer so that it gives 1-to-1 the same results as the reference implementation are very welcome! --- The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is a instruct fine-tuned version of the Mistral-7B-v0.1 generative text model using a variety of publicly available conversation datasets. For full details of this model please read our paper and release blog post. ## Instruction format In order to leverage instruction fine-tuning, your prompt should be surrounded by and tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id. E.g. 
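For illustration, a hedged sketch of this format driven through the tokenizer's chat template, reusing the condiment exchange from the widget above as sample messages. This is an indicative sketch, not the card's own snippet, and assumes a recent Transformers with `apply_chat_template`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # assumes accelerate

messages = [
    {"role": "user", "content": "What is your favorite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
# The chat template wraps each user turn in [INST] ... [/INST] and prepends
# the begin-of-sentence id only to the first instruction, as described above.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```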
This format is available as a chat template via the method: ## Model Architecture This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices: - Grouped-Query Attention - Sliding-Window Attention - Byte-fallback BPE tokenizer ## Troubleshooting - If you see the following error: Installing transformers from source should solve the issue pip install git+ This should not be required after transformers-v4.33.4. ## Limitations The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. ## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.", + "model_explanation_gemini": "A fine-tuned instruction-following version of Mistral-7B-v0.1 designed for text generation tasks using structured conversational prompts. \n\n**Features:** \n- Instruction fine-tuned for conversational tasks \n- Uses structured prompt formatting with special tokens \n- Based on Mistral-7B-v0.1 with Grouped-Query Attention and Sliding-Window Attention \n- Includes a byte-fallback BPE tokenizer \n\n**Comparison:** \nUnlike the base Mistral-7B" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-7B-Instruct-v0.2.json b/model_data_json/mistralai_Mistral-7B-Instruct-v0.2.json new file mode 100644 index 0000000000000000000000000000000000000000..231856e31ec4bd907fc3a99719434fb3173664a3 --- /dev/null +++ b/model_data_json/mistralai_Mistral-7B-Instruct-v0.2.json @@ -0,0 +1,21 @@ +{ + "model_id": "mistralai/Mistral-7B-Instruct-v0.2", + "downloads": 1277759, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "mistral", + "text-generation", + "finetuned", + "conversational", + "arxiv:2310.06825", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - finetuned pipeline_tag: text-generation new_version: mistralai/Mistral-7B-Instruct-v0.3 inference: true widget: - messages: - role: user content: What is your favorite condiment? extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-7B-Instruct-v0.2 ## Encode and Decode with ## Inference with ## Inference with hugging face > [!TIP] > PRs to correct the tokenizer so that it gives 1-to-1 the same results as the reference implementation are very welcome! --- The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1 - 32k context window (vs 8k context in v0.1) - Rope-theta = 1e6 - No Sliding-Window Attention For full details of this model please read our paper and release blog post. ## Instruction format In order to leverage instruction fine-tuning, your prompt should be surrounded by and tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. 
The assistant generation will be ended by the end-of-sentence token id. E.g. This format is available as a chat template via the method: ## Troubleshooting - If you see the following error: Installing transformers from source should solve the issue pip install git+ This should not be required after transformers-v4.33.4. ## Limitations The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. ## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.", + "model_explanation_gemini": "A fine-tuned instruction-following LLM based on Mistral-7B-v0.2, designed for text generation with a 32k context window and improved attention mechanisms. \n\n**Features:** \n- Instruction fine-tuned for structured prompts (enclosed in special tokens) \n- 32k context window (vs. 8k in v0.1) \n- Adjusted Rope-theta (1e6) and no sliding-window attention \n- Supports chat template" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-7B-Instruct-v0.3.json b/model_data_json/mistralai_Mistral-7B-Instruct-v0.3.json new file mode 100644 index 0000000000000000000000000000000000000000..9991e3e565b9ff6f85df335f66e2f7c16b28bc01 --- /dev/null +++ b/model_data_json/mistralai_Mistral-7B-Instruct-v0.3.json @@ -0,0 +1,20 @@ +{ + "model_id": "mistralai/Mistral-7B-Instruct-v0.3", + "downloads": 701083, + "tags": [ + "transformers", + "safetensors", + "mistral", + "text-generation", + "conversational", + "base_model:mistralai/Mistral-7B-v0.3", + "base_model:finetune:mistralai/Mistral-7B-v0.3", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 base_model: mistralai/Mistral-7B-v0.3 extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-7B-Instruct-v0.3 The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.3. Mistral-7B-v0.3 has the following changes compared to Mistral-7B-v0.2 - Extended vocabulary to 32768 - Supports v3 Tokenizer - Supports function calling ## Installation It is recommended to use with mistral-inference. For HF transformers code snippets, please keep scrolling. ## Download ### Chat After installing , a CLI command should be available in your environment. You can chat with the model using ### Instruct following ### Function calling ## Generate with If you want to use Hugging Face to generate text, you can do something like this. ## Function calling with To use this example, you'll need version 4.42.0 or higher. Please see the function calling guide in the docs for more information. Note that, for reasons of space, this example does not show a complete cycle of calling a tool and adding the tool call and tool results to the chat history so that the model can use them in its next generation. 
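A partial sketch of that cycle; the tool name, arguments, and the 9-character alphanumeric call ID are hypothetical placeholders, not values from this card:

```python
# Hedged sketch of the chat-history bookkeeping for a tool call.
messages = [
    {"role": "user", "content": "What's the weather like in Paris?"},
    {
        "role": "assistant",
        "tool_calls": [{
            "id": "AbCd12345",  # hypothetical; tool-call IDs are 9 alphanumeric characters
            "type": "function",
            "function": {
                "name": "get_current_weather",       # hypothetical tool
                "arguments": '{"location": "Paris"}',
            },
        }],
    },
    # The tool result is appended with the matching call ID so the model can
    # use it in its next generation.
    {"role": "tool", "tool_call_id": "AbCd12345", "name": "get_current_weather", "content": "22.0"},
]
```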
For a full tool calling example, please see the function calling guide, and note that Mistral **does** use tool call IDs, so these must be included in your tool calls and tool results. They should be exactly 9 alphanumeric characters. ## Limitations The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. ## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, William El Sayed, William Marshall", + "model_explanation_gemini": "A fine-tuned instruction-following version of Mistral-7B-v0.3 with extended vocabulary, v3 tokenizer support, and function calling capabilities.\n\nFeatures: \n- Instruction-tuned for task execution \n- Expanded 32,768-token vocabulary \n- Supports v3 tokenizer \n- Function calling functionality \n\nComparison: \nCompared to Mistral-7B-v0.2, this version adds a larger vocabulary, updated tokenizer support, and new function calling abilities while" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-7B-v0.1.json b/model_data_json/mistralai_Mistral-7B-v0.1.json new file mode 100644 index 0000000000000000000000000000000000000000..38ce6d768cc245171ce8954e5deefcf6ead2b935 --- /dev/null +++ b/model_data_json/mistralai_Mistral-7B-v0.1.json @@ -0,0 +1,21 @@ +{ + "model_id": "mistralai/Mistral-7B-v0.1", + "downloads": 549916, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "mistral", + "text-generation", + "pretrained", + "en", + "arxiv:2310.06825", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en license: apache-2.0 tags: - pretrained pipeline_tag: text-generation inference: parameters: temperature: 0.7 extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-7B-v0.1 The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested. For full details of this model please read our paper and release blog post. ## Model Architecture Mistral-7B-v0.1 is a transformer model, with the following architecture choices: - Grouped-Query Attention - Sliding-Window Attention - Byte-fallback BPE tokenizer ## Troubleshooting - If you see the following error: - Or: Ensure you are utilizing a stable version of Transformers, 4.34.0 or newer. ## Notice Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms. 
## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.", + "model_explanation_gemini": "A pretrained generative text model with 7 billion parameters, outperforming Llama 2 13B on benchmarks, designed for text generation tasks.\n\n**Features:** \n- 7B parameter transformer \n- Grouped-Query Attention \n- Sliding-Window Attention \n- Byte-fallback BPE tokenizer \n\n**Comparison:** \nOutperforms Llama 2 13B across tested benchmarks while being smaller in size." +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-7B-v0.3.json b/model_data_json/mistralai_Mistral-7B-v0.3.json new file mode 100644 index 0000000000000000000000000000000000000000..30c522fe8a9ff0fb8ada344956c7ca4b91b550d6 --- /dev/null +++ b/model_data_json/mistralai_Mistral-7B-v0.3.json @@ -0,0 +1,17 @@ +{ + "model_id": "mistralai/Mistral-7B-v0.3", + "downloads": 462267, + "tags": [ + "transformers", + "safetensors", + "mistral", + "text-generation", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-7B-v0.3 The Mistral-7B-v0.3 Large Language Model (LLM) is a Mistral-7B-v0.2 with extended vocabulary. Mistral-7B-v0.3 has the following changes compared to Mistral-7B-v0.2 - Extended vocabulary to 32768 ## Installation It is recommended to use with mistral-inference. For HF transformers code snippets, please keep scrolling. ## Download ### Demo After installing , a CLI command should be available in your environment. Should give something along the following lines: ## Generate with If you want to use Hugging Face to generate text, you can do something like this. ## Limitations The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. 
## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, William El Sayed, William Marshall", + "model_explanation_gemini": "\"Mistral-7B-v0.3 is an upgraded version of Mistral-7B-v0.2 with an extended vocabulary of 32768 tokens, designed for text generation tasks.\"\n\nFeatures: \n- Extended vocabulary (32,768 tokens) \n- Compatible with mistral-inference and Hugging Face transformers \n- No built-in moderation mechanisms \n\nComparison: \nCompared to Mistral-7B-v0.2, this version primarily improves vocabulary size while maintaining the same base" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-Nemo-Instruct-2407.json b/model_data_json/mistralai_Mistral-Nemo-Instruct-2407.json new file mode 100644 index 0000000000000000000000000000000000000000..80eabf237ec52834f1b4854fa5ba9fc6e6de33b2 --- /dev/null +++ b/model_data_json/mistralai_Mistral-Nemo-Instruct-2407.json @@ -0,0 +1,29 @@ +{ + "model_id": "mistralai/Mistral-Nemo-Instruct-2407", + "downloads": 146936, + "tags": [ + "transformers", + "safetensors", + "mistral", + "text-generation", + "conversational", + "en", + "fr", + "de", + "es", + "it", + "pt", + "ru", + "zh", + "ja", + "base_model:mistralai/Mistral-Nemo-Base-2407", + "base_model:finetune:mistralai/Mistral-Nemo-Base-2407", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - en - fr - de - es - it - pt - ru - zh - ja license: apache-2.0 base_model: mistralai/Mistral-Nemo-Base-2407 extra_gated_description: If you want to learn more about how we process your personal data, please read our . --- # Model Card for Mistral-Nemo-Instruct-2407 The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407. Trained jointly by Mistral AI and NVIDIA, it significantly outperforms existing models smaller or similar in size. For more details about this model please refer to our release blog post. 
## Key features - Released under the **Apache 2 License** - Pre-trained and instructed versions - Trained with a **128k context window** - Trained on a large proportion of **multilingual and code data** - Drop-in replacement of Mistral 7B ## Model Architecture Mistral Nemo is a transformer model, with the following architecture choices: - **Layers:** 40 - **Dim:** 5,120 - **Head dim:** 128 - **Hidden dim:** 14,336 - **Activation Function:** SwiGLU - **Number of heads:** 32 - **Number of kv-heads:** 8 (GQA) - **Vocabulary size:** 2**17 ~= 128k - **Rotary embeddings (theta = 1M)** ## Metrics ### Main Benchmarks | Benchmark | Score | | --- | --- | | HellaSwag (0-shot) | 83.5% | | Winogrande (0-shot) | 76.8% | | OpenBookQA (0-shot) | 60.6% | | CommonSenseQA (0-shot) | 70.4% | | TruthfulQA (0-shot) | 50.3% | | MMLU (5-shot) | 68.0% | | TriviaQA (5-shot) | 73.8% | | NaturalQuestions (5-shot) | 31.2% | ### Multilingual Benchmarks (MMLU) | Language | Score | | --- | --- | | French | 62.3% | | German | 62.7% | | Spanish | 64.6% | | Italian | 61.3% | | Portuguese | 63.3% | | Russian | 59.2% | | Chinese | 59.0% | | Japanese | 59.0% | ## Usage The model can be used with three different frameworks - []( See here - []( See here - []( See nvidia/Mistral-NeMo-12B-Instruct ### Mistral Inference #### Install It is recommended to use with mistral-inference. For HF transformers code snippets, please keep scrolling. #### Download #### Chat After installing , a CLI command should be available in your environment. You can chat with the model using *E.g.* Try out something like: #### Instruct following #### Function calling ### Transformers > [!IMPORTANT] > NOTE: Until a new release has been made, you need to install transformers from source: > If you want to use Hugging Face to generate text, you can do something like this. ## Function calling with To use this example, you'll need version 4.42.0 or higher. Please see the function calling guide in the docs for more information. Note that, for reasons of space, this example does not show a complete cycle of calling a tool and adding the tool call and tool results to the chat history so that the model can use them in its next generation. For a full tool calling example, please see the function calling guide, and note that Mistral **does** use tool call IDs, so these must be included in your tool calls and tool results. They should be exactly 9 alphanumeric characters. > [!TIP] > Unlike previous Mistral models, Mistral Nemo requires smaller temperatures. We recommend to use a temperature of 0.3. ## Limitations The Mistral Nemo Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. 
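For illustration, a minimal generation sketch honoring the 0.3 temperature recommendation, assuming a recent Transformers text-generation pipeline that accepts chat-style messages (indicative, not the card's own snippet):

```python
from transformers import pipeline

chatbot = pipeline(
    "text-generation",
    model="mistralai/Mistral-Nemo-Instruct-2407",
    device_map="auto",  # assumes accelerate is installed
)
messages = [{"role": "user", "content": "Who is the best French painter? Answer in one short sentence."}]
out = chatbot(messages, max_new_tokens=128, do_sample=True, temperature=0.3)
print(out[0]["generated_text"])
```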
## The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Alok Kothari, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Augustin Garreau, Austin Birky, Bam4d, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Carole Rambaud, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gaspard Blanchet, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Henri Roussez, Hichem Sattouf, Ian Mack, Jean-Malo Delignon, Jessica Chudnovsky, Justus Murke, Kartik Khandelwal, Lawrence Stewart, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Marjorie Janiewicz, Mickaël Seznec, Nicolas Schuhl, Niklas Muhs, Olivier de Garrigues, Patrick von Platen, Paul Jacob, Pauline Buche, Pavan Kumar Reddy, Perry Savas, Pierre Stock, Romain Sauvestre, Sagar Vaze, Sandeep Subramanian, Saurabh Garg, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibault Schueller, Thibaut Lavril, Thomas Wang, Théophile Gervet, Timothée Lacroix, Valera Nemychnikova, Wendy Shang, William El Sayed, William Marshall", + "model_explanation_gemini": "A multilingual, instruction-tuned LLM with a 128k context window, optimized for performance across languages and code, serving as a drop-in replacement for Mistral 7B. \n\n**Features**: \n- **License**: Apache 2.0 \n- **Base Model**: Mistral-Nemo-Base-2407 \n- **Context Window**: 128k tokens \n- **Multilingual Support**: English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, Japanese" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-Small-24B-Instruct-2501.json b/model_data_json/mistralai_Mistral-Small-24B-Instruct-2501.json new file mode 100644 index 0000000000000000000000000000000000000000..824b0d7cb0fed91a7932affafbfe1ef627d82579 --- /dev/null +++ b/model_data_json/mistralai_Mistral-Small-24B-Instruct-2501.json @@ -0,0 +1,29 @@ +{ + "model_id": "mistralai/Mistral-Small-24B-Instruct-2501", + "downloads": 822996, + "tags": [ + "vllm", + "safetensors", + "mistral", + "text-generation", + "transformers", + "conversational", + "en", + "fr", + "de", + "es", + "it", + "pt", + "zh", + "ja", + "ru", + "ko", + "base_model:mistralai/Mistral-Small-24B-Base-2501", + "base_model:finetune:mistralai/Mistral-Small-24B-Base-2501", + "license:apache-2.0", + "text-generation-inference", + "region:us" + ], + "description": "--- language: - en - fr - de - es - it - pt - zh - ja - ru - ko license: apache-2.0 library_name: vllm inference: false base_model: - mistralai/Mistral-Small-24B-Base-2501 extra_gated_description: >- If you want to learn more about how we process your personal data, please read our . tags: - transformers --- # Model Card for Mistral-Small-24B-Instruct-2501 Mistral Small 3 ( 2501 ) sets a new benchmark in the \"small\" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! This model is an instruction-fine-tuned version of the base model: Mistral-Small-24B-Base-2501. Mistral Small can be deployed locally and is exceptionally \"knowledge-dense\", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for: - Fast response conversational agents. - Low latency function calling. - Subject matter experts via fine-tuning. - Local inference for hobbyists and organizations handling sensitive data. 
For enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community. This release demonstrates our commitment to open source, serving as a strong base model. Learn more about Mistral Small in our blog post. Model developper: Mistral AI Team ## Key Features - **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish. - **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting. - **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities. - **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes. - **Context Window:** A 32k context window. - **System Prompt:** Maintains strong adherence and support for system prompts. - **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size. ## Benchmark results ### Human evaluated benchmarks | Category | Gemma-2-27B | Qwen-2.5-32B | Llama-3.3-70B | Gpt4o-mini | |----------|-------------|--------------|---------------|------------| | Mistral is better | 0.536 | 0.496 | 0.192 | 0.200 | | Mistral is slightly better | 0.196 | 0.184 | 0.164 | 0.204 | | Ties | 0.052 | 0.060 | 0.236 | 0.160 | | Other is slightly better | 0.060 | 0.088 | 0.112 | 0.124 | | Other is better | 0.156 | 0.172 | 0.296 | 0.312 | **Note**: - We conducted side by side evaluations with an external third-party vendor, on a set of over 1k proprietary coding and generalist prompts. - Evaluators were tasked with selecting their preferred model response from anonymized generations produced by Mistral Small 3 vs another model. - We are aware that in some cases the benchmarks on human judgement starkly differ from publicly available benchmarks, but have taken extra caution in verifying a fair evaluation. We are confident that the above benchmarks are valid. ### Publicly accesible benchmarks **Reasoning & Knowledge** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | mmlu_pro_5shot_cot_instruct | 0.663 | 0.536 | 0.666 | 0.683 | 0.617 | | gpqa_main_cot_5shot_instruct | 0.453 | 0.344 | 0.531 | 0.404 | 0.377 | **Math & Coding** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | humaneval_instruct_pass@1 | 0.848 | 0.732 | 0.854 | 0.909 | 0.890 | | math_instruct | 0.706 | 0.535 | 0.743 | 0.819 | 0.761 | **Instruction following** | Evaluation | mistral-small-24B-instruct-2501 | gemma-2b-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 | |------------|---------------|--------------|---------------|---------------|-------------| | mtbench_dev | 8.35 | 7.86 | 7.96 | 8.26 | 8.33 | | wildbench | 52.27 | 48.21 | 50.04 | 52.73 | 56.13 | | arena_hard | 0.873 | 0.788 | 0.840 | 0.860 | 0.897 | | ifeval | 0.829 | 0.8065 | 0.8835 | 0.8401 | 0.8499 | **Note**: - Performance accuracy on all benchmarks were obtained through the same internal evaluation pipeline - as such, numbers may vary slightly from previously reported performance (Qwen2.5-32B-Instruct, Llama-3.3-70B-Instruct, Gemma-2-27B-IT). 
- Judge-based evals such as WildBench, Arena Hard and MTBench were based on gpt-4o-2024-05-13. ### Basic Instruct Template (V7-Tekken) *, and are placeholders.* ***Please make sure to use mistral-common as the source of truth*** ## Usage The model can be used with the following frameworks: - []( See here - []( See here ### vLLM We recommend using this model with the vLLM library to implement production-ready inference pipelines. **Note 1**: We recommend using a relatively low sampling temperature. **Note 2**: Make sure to add a system prompt to the model to best tailor it to your needs. If you want to use the model as a general assistant, we recommend the following system prompt: **_Installation_** Make sure you install []( Also make sure you have []( installed: You can also make use of a ready-to-go docker image or on the docker hub. #### Server We recommend that you use Mistral-Small-24B-Instruct-2501 in a server/client setting. 1. Spin up a server: **Note:** Running Mistral-Small-24B-Instruct-2501 on GPU requires ~55 GB of GPU RAM in bf16 or fp16. 2. Query the server with a simple Python snippet. ### Function calling Mistral-Small-24B-Instruct-2501 is excellent at function / tool calling tasks via vLLM. *E.g.:*
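A hedged client-side sketch, assuming the server above exposes vLLM's OpenAI-compatible API on localhost:8000; the tool definition and temperature value are illustrative, not taken from this card:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed endpoint

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",  # hypothetical tool
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    temperature=0.15,  # illustrative low temperature, per the note above
)
print(response.choices[0].message.tool_calls)
```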
#### Offline # /\\_/\\ # ( o.o ) # > ^ < # ### Transformers If you want to use Hugging Face transformers to generate text, you can do something like this. ### Ollama Ollama can run this model locally on MacOS, Windows and Linux. 4-bit quantization (aliased to default): 8-bit quantization: FP16:", + "model_explanation_gemini": "A 24B-parameter multilingual instruction-tuned model optimized for fast, low-latency conversational agents, function calling, and local inference with advanced reasoning and agentic capabilities. \n\n**Features:** \n- Multilingual (supports 10+ languages) \n- Agent-centric (function calling, JSON output) \n- 32k context window \n- Apache 2.0 licensed \n- Optimized for local deployment (e.g., RTX 4090 or 32GB MacBook" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mistral-Small-3.1-24B-Instruct-2503.json b/model_data_json/mistralai_Mistral-Small-3.1-24B-Instruct-2503.json new file mode 100644 index 0000000000000000000000000000000000000000..476c009bf3a72c0aff74425cb8f743f72843f83d --- /dev/null +++ b/model_data_json/mistralai_Mistral-Small-3.1-24B-Instruct-2503.json @@ -0,0 +1,41 @@ +{ + "model_id": "mistralai/Mistral-Small-3.1-24B-Instruct-2503", + "downloads": 78289, + "tags": [ + "vllm", + "safetensors", + "mistral3", + "image-text-to-text", + "conversational", + "en", + "fr", + "de", + "es", + "pt", + "it", + "ja", + "ko", + "ru", + "zh", + "ar", + "fa", + "id", + "ms", + "ne", + "pl", + "ro", + "sr", + "sv", + "tr", + "uk", + "vi", + "hi", + "bn", + "base_model:mistralai/Mistral-Small-3.1-24B-Base-2503", + "base_model:finetune:mistralai/Mistral-Small-3.1-24B-Base-2503", + "license:apache-2.0", + "region:us" + ], + "description": "--- language: - en - fr - de - es - pt - it - ja - ko - ru - zh - ar - fa - id - ms - ne - pl - ro - sr - sv - tr - uk - vi - hi - bn license: apache-2.0 library_name: vllm inference: false base_model: - mistralai/Mistral-Small-3.1-24B-Base-2503 extra_gated_description: >- If you want to learn more about how we process your personal data, please read our
. pipeline_tag: image-text-to-text --- # Model Card for Mistral-Small-3.1-24B-Instruct-2503 Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) **adds state-of-the-art vision understanding** and enhances **long context capabilities up to 128k tokens** without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks. This model is an instruction-finetuned version of: Mistral-Small-3.1-24B-Base-2503. Mistral Small 3.1 can be deployed locally and is exceptionally \"knowledge-dense,\" fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized. It is ideal for: - Fast-response conversational agents. - Low-latency function calling. - Subject matter experts via fine-tuning. - Local inference for hobbyists and organizations handling sensitive data. - Programming and math reasoning. - Long document understanding. - Visual understanding. For enterprises requiring specialized capabilities (increased context, specific modalities, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community. Learn more about Mistral Small 3.1 in our blog post. ## Key Features - **Vision:** Vision capabilities enable the model to analyze images and provide insights based on visual content in addition to text. - **Multilingual:** Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi. - **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting. - **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities. - **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes. - **Context Window:** A 128k context window. - **System Prompt:** Maintains strong adherence and support for system prompts. - **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size. ## Benchmark Results When available, we report numbers previously published by other model providers, otherwise we re-evaluate them using our own evaluation harness. 
### Pretrain Evals | Model | MMLU (5-shot) | MMLU Pro (5-shot CoT) | TriviaQA | GPQA Main (5-shot CoT)| MMMU | |--------------------------------|---------------|-----------------------|------------|-----------------------|-----------| | **Small 3.1 24B Base** | **81.01%** | **56.03%** | 80.50% | **37.50%** | **59.27%**| | Gemma 3 27B PT | 78.60% | 52.20% | **81.30%** | 24.30% | 56.10% | ### Instruction Evals #### Text | Model | MMLU | MMLU Pro (5-shot CoT) | MATH | GPQA Main (5-shot CoT) | GPQA Diamond (5-shot CoT )| MBPP | HumanEval | SimpleQA (TotalAcc)| |--------------------------------|-----------|-----------------------|------------------------|------------------------|---------------------------|-----------|-----------|--------------------| | **Small 3.1 24B Instruct** | 80.62% | 66.76% | 69.30% | **44.42%** | **45.96%** | 74.71% | **88.41%**| **10.43%** | | Gemma 3 27B IT | 76.90% | **67.50%** | **89.00%** | 36.83% | 42.40% | 74.40% | 87.80% | 10.00% | | GPT4o Mini | **82.00%**| 61.70% | 70.20% | 40.20% | 39.39% | 84.82% | 87.20% | 9.50% | | Claude 3.5 Haiku | 77.60% | 65.00% | 69.20% | 37.05% | 41.60% | **85.60%**| 88.10% | 8.02% | | Cohere Aya-Vision 32B | 72.14% | 47.16% | 41.98% | 34.38% | 33.84% | 70.43% | 62.20% | 7.65% | #### Vision | Model | MMMU | MMMU PRO | Mathvista | ChartQA | DocVQA | AI2D | MM MT Bench | |--------------------------------|------------|-----------|-----------|-----------|-----------|-------------|-------------| | **Small 3.1 24B Instruct** | 64.00% | **49.25%**| **68.91%**| 86.24% | **94.08%**| **93.72%** | **7.3** | | Gemma 3 27B IT | **64.90%** | 48.38% | 67.60% | 76.00% | 86.60% | 84.50% | 7 | | GPT4o Mini | 59.40% | 37.60% | 56.70% | 76.80% | 86.70% | 88.10% | 6.6 | | Claude 3.5 Haiku | 60.50% | 45.03% | 61.60% | **87.20%**| 90.00% | 92.10% | 6.5 | | Cohere Aya-Vision 32B | 48.20% | 31.50% | 50.10% | 63.04% | 72.40% | 82.57% | 4.1 | ### Multilingual Evals | Model | Average | European | East Asian | Middle Eastern | |--------------------------------|------------|------------|------------|----------------| | **Small 3.1 24B Instruct** | **71.18%** | **75.30%** | **69.17%** | 69.08% | | Gemma 3 27B IT | 70.19% | 74.14% | 65.65% | 70.76% | | GPT4o Mini | 70.36% | 74.21% | 65.96% | **70.90%** | | Claude 3.5 Haiku | 70.16% | 73.45% | 67.05% | 70.00% | | Cohere Aya-Vision 32B | 62.15% | 64.70% | 57.61% | 64.12% | ### Long Context Evals | Model | LongBench v2 | RULER 32K | RULER 128K | |--------------------------------|-----------------|-------------|------------| | **Small 3.1 24B Instruct** | **37.18%** | **93.96%** | 81.20% | | Gemma 3 27B IT | 34.59% | 91.10% | 66.00% | | GPT4o Mini | 29.30% | 90.20% | 65.8% | | Claude 3.5 Haiku | 35.19% | 92.60% | **91.90%** | ## Basic Instruct Template (V7-Tekken) *, and are placeholders.* ***Please make sure to use mistral-common as the source of truth*** ## Usage The model can be used with the following frameworks; - []( See here **Note 1**: We recommend using a relatively low temperature, such as . **Note 2**: Make sure to add a system prompt to the model to best tailer it for your needs. If you want to use the model as a general assistant, we recommend the following system prompt: ### vLLM (recommended) We recommend using this model with the vLLM library to implement production-ready inference pipelines. **_Installation_** Make sure you install []( Doing so should automatically install []( To check: You can also make use of a ready-to-go docker image or on the docker hub. 
#### Server We recommend that you use Mistral-Small-3.1-24B-Instruct-2503 in a server/client setting. 1. Spin up a server: **Note:** Running Mistral-Small-3.1-24B-Instruct-2503 on GPU requires ~55 GB of GPU RAM in bf16 or fp16. 2. Query the server with a simple Python snippet. ### Function calling Mistral-Small-3.1-24B-Instruct-2503 is excellent at function / tool calling tasks via vLLM. *E.g.:*
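A hedged sketch combining vision input with a tool call against the OpenAI-compatible endpoint assumed above; the image URL, tool definition, and temperature are placeholders, not values from this card:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed endpoint

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",  # hypothetical tool
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's the weather in the city shown in this picture?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/city.jpg"}},  # placeholder
        ],
    }],
    tools=tools,
    temperature=0.15,  # illustrative low temperature
)
print(response.choices[0].message)
```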
#### Offline ### Transformers (untested) Transformers-compatible model weights are also uploaded (thanks a lot @cyrilvallez). However, the transformers implementation was **not thoroughly tested**, but only on \"vibe-checks\". Hence, we can only ensure 100% correct behavior when using the original weight format with vLLM (see above).", + "model_explanation_gemini": "A 24B-parameter multilingual instruction-tuned model with vision capabilities, 128k context window, and strong reasoning/agentic functions for tasks like conversational AI, document analysis, and image understanding.\n\n**Features:** \n- **Vision:** Image analysis alongside text processing \n- **Multilingual:** Supports 25+ languages \n- **Agent-Centric:** Native function calling & JSON output \n- **Long Context:** 128k token window \n- **Performance:** Top-tier MMLU/" +} \ No newline at end of file diff --git a/model_data_json/mistralai_Mixtral-8x7B-Instruct-v0.1.json b/model_data_json/mistralai_Mixtral-8x7B-Instruct-v0.1.json new file mode 100644 index 0000000000000000000000000000000000000000..98caca3f4bc32052d943171dc0d9a8061a99903f --- /dev/null +++ b/model_data_json/mistralai_Mixtral-8x7B-Instruct-v0.1.json @@ -0,0 +1,25 @@ +{ + "model_id": "mistralai/Mixtral-8x7B-Instruct-v0.1", + "downloads": 477341, + "tags": [ + "transformers", + "safetensors", + "mixtral", + "text-generation", + "conversational", + "fr", + "it", + "de", + "es", + "en", + "base_model:mistralai/Mixtral-8x7B-v0.1", + "base_model:finetune:mistralai/Mixtral-8x7B-v0.1", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - fr - it - de - es - en license: apache-2.0 base_model: mistralai/Mixtral-8x7B-v0.1 inference: parameters: temperature: 0.5 widget: - messages: - role: user content: What is your favorite condiment? extra_gated_description: If you want to learn more about how we process your personal data, please read our
. --- # Model Card for Mixtral-8x7B ### Tokenization with ## Inference with ## Inference with hugging face > [!TIP] > PRs to correct the transformers tokenizer so that it gives 1-to-1 the same results as the mistral-common reference implementation are very welcome! --- The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested. For full details of this model please read our release blog post. ## Warning This repo contains weights that are compatible with vLLM serving of the model as well as the Hugging Face transformers library. It is based on the original Mixtral torrent release, but the file format and parameter names are different. Please note that the model cannot (yet) be instantiated with HF. ## Instruction format This format must be strictly respected, otherwise the model will generate sub-optimal outputs. The template used to build a prompt for the Instruct model is defined as follows: Note that and are special tokens for beginning of string (BOS) and end of string (EOS), while [INST] and [/INST] are regular strings. For reference, here is the pseudo-code used to tokenize instructions during fine-tuning: In the pseudo-code above, note that the method should not add a BOS or EOS token automatically, but should add a prefix space. In the Transformers library, one can use chat templates to make sure the right format is applied. ## Run the model By default, transformers will load the model in full precision. You may therefore want to further reduce the memory requirements for running the model through the optimizations offered in the HF ecosystem: ### In half-precision Note that this precision only works on GPU devices.
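A minimal half-precision loading sketch, assuming the standard Transformers API (indicative, not the card's own snippet):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; GPU only
    device_map="auto",          # assumes accelerate is installed
)
```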
### Lower precision (8-bit & 4-bit)
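A minimal sketch, assuming the bitsandbytes quantization integration in Transformers (the card does not name the library here):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # or load_in_8bit=True
    device_map="auto",
)
```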
### Load the model with Flash Attention 2
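A minimal sketch, assuming the flash-attn package is installed and a Transformers version that accepts `attn_implementation` (indicative only):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```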
## Limitations The Mixtral-8x7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs. # The Mistral AI Team Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.", + "model_explanation_gemini": "A multilingual, instruction-tuned sparse mixture-of-experts model (Mixtral-8x7B-Instruct-v0.1) optimized for generative tasks, outperforming Llama 2 70B on benchmarks while supporting multiple languages (English, French, Italian, German, Spanish). \n\n**Features**: \n- **Sparse Mixture of Experts** architecture \n- **Multilingual** (fr, it, de, es, en) \n- **Instruction-tuned** for conversational" +} \ No newline at end of file diff --git a/model_data_json/mixedbread-ai_mxbai-rerank-large-v1.json b/model_data_json/mixedbread-ai_mxbai-rerank-large-v1.json new file mode 100644 index 0000000000000000000000000000000000000000..41e1c9f1f75055e8160da922dd843403a37e57c2 --- /dev/null +++ b/model_data_json/mixedbread-ai_mxbai-rerank-large-v1.json @@ -0,0 +1,22 @@ +{ + "model_id": "mixedbread-ai/mxbai-rerank-large-v1", + "downloads": 75127, + "tags": [ + "transformers", + "onnx", + "safetensors", + "deberta-v2", + "text-classification", + "reranker", + "transformers.js", + "sentence-transformers", + "text-ranking", + "en", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers tags: - reranker - transformers.js - sentence-transformers license: apache-2.0 language: - en pipeline_tag: text-ranking ---

The crispy rerank family from Mixedbread.

🍞 Looking for a simple end-to-end retrieval solution? Meet Omni, our multimodal and multilingual model.

# mxbai-rerank-large-v1 This is the largest model in our family of powerful reranker models. You can learn more about the models in our blog post. We have three models: - mxbai-rerank-xsmall-v1 - mxbai-rerank-base-v1 - mxbai-rerank-large-v1 (🍞) ## Quickstart Currently, the best way to use our models is with the most recent version of sentence-transformers. Let's say you have a query, and you want to rerank a set of documents. You can do that with only one line of code:
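A sketch of that call, assuming sentence-transformers' `CrossEncoder.rank` API (available from v2.4); the query and documents are illustrative:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("mixedbread-ai/mxbai-rerank-large-v1")
query = "Who wrote 'To Kill a Mockingbird'?"
documents = [
    "'To Kill a Mockingbird' is a novel by Harper Lee published in 1960.",
    "The novel 'Moby-Dick' was written by Herman Melville.",
    "Harper Lee was born in 1926 in Monroeville, Alabama.",
]
# One line: score and sort the documents for the query.
results = model.rank(query, documents, return_documents=True, top_k=3)
print(results)
```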
**JavaScript example:** Install transformers.js. Let's say you have a query and you want to rerank a set of documents; in JavaScript, you need to add a small helper function:
## Using API You can use the model via our API as follows: The API comes with additional features, such as a continous trained reranker! Check out the docs for more information. ## Evaluation Our reranker models are designed to elevate your search. They work extremely well in combination with keyword search and can even outperform semantic search systems in many cases. | Model | NDCG@10 | Accuracy@3 | | ------------------------------------------------------------------------------------- | -------- | ---------- | | Lexical Search (Lucene) | 38.0 | 66.4 | | BAAI/bge-reranker-base | 41.6 | 66.9 | | BAAI/bge-reranker-large | 45.2 | 70.6 | | cohere-embed-v3 (semantic search) | 47.5 | 70.9 | | mxbai-rerank-xsmall-v1 | **43.9** | **70.0** | | mxbai-rerank-base-v1 | **46.9** | **72.3** | | mxbai-rerank-large-v1 | **48.8** | **74.9** | The reported results are aggregated from 11 datasets of BEIR. We used Pyserini to evaluate the models. Find more in our blog-post and on this spreadsheet. ## Community Please join our Discord Community and share your feedback and thoughts! We are here to help and also always happy to chat. ## Citation ## License Apache 2.0", + "model_explanation_gemini": "\"Ranks text passages by relevance for retrieval tasks, optimized for English.\"\n\nFeatures: \n- Reranker model for text-ranking \n- Supports English language \n- Compatible with Transformers.js and sentence-transformers \n\nComparison: \nNo explicit comparison to other models is provided in the description." +} \ No newline at end of file diff --git a/model_data_json/mlx-community_DeepSeek-R1-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..f54d5b312b483e557ef28c61cdba3da721760c29 --- /dev/null +++ b/model_data_json/mlx-community_DeepSeek-R1-4bit.json @@ -0,0 +1,16 @@ +{ + "model_id": "mlx-community/DeepSeek-R1-4bit", + "downloads": 74010, + "tags": [ + "mlx", + "safetensors", + "deepseek_v3", + "custom_code", + "base_model:deepseek-ai/DeepSeek-R1", + "base_model:quantized:deepseek-ai/DeepSeek-R1", + "4-bit", + "region:us" + ], + "description": "--- base_model: deepseek-ai/DeepSeek-R1 tags: - mlx --- # mlx-community/DeepSeek-R1-4bit The Model mlx-community/DeepSeek-R1-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1 using mlx-lm version **0.21.0**. ## Use with mlx", + "model_explanation_gemini": "\"Converts the DeepSeek-R1 model to MLX format for compatibility with Apple's MLX framework.\"\n\nFeatures: \n- 4-bit quantization \n- Converted from deepseek-ai/DeepSeek-R1 \n- Uses mlx-lm version 0.21.0 \n- Compatible with MLX framework \n\nComparison: \nNo explicit comparison provided in the description." 
+} \ No newline at end of file diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..4c8415b1211dac8ba7e5e4924c2b92302a022f0f --- /dev/null +++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-14B-4bit.json @@ -0,0 +1,15 @@ +{ + "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit", + "downloads": 74030, + "tags": [ + "mlx", + "safetensors", + "qwen2", + "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-14B", + "base_model:quantized:deepseek-ai/DeepSeek-R1-Distill-Qwen-14B", + "4-bit", + "region:us" + ], + "description": "--- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit The Model mlx-community/DeepSeek-R1-Distill-Qwen-14B-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-14B using mlx-lm version **0.21.1**. ## Use with mlx", + "model_explanation_gemini": "A 4-bit quantized version of DeepSeek-R1-Distill-Qwen-14B converted to MLX format for efficient inference on Apple hardware.\n\nFeatures: \n- 4-bit quantization \n- Converted from DeepSeek-R1-Distill-Qwen-14B \n- Optimized for MLX framework \n- Uses mlx-lm version 0.21.1 \n\nComparison: \nThis model is a quantized MLX-compatible variant of DeepSeek" +} \ No newline at end of file diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..44d2b5fc45c3924825d74898db7f985934cf739d --- /dev/null +++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-4bit.json @@ -0,0 +1,21 @@ +{ + "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit", + "downloads": 74600, + "tags": [ + "transformers", + "safetensors", + "qwen2", + "text-generation", + "mlx", + "conversational", + "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B", + "base_model:quantized:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "region:us" + ], + "description": "--- library_name: transformers base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit The Model mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B using mlx-lm version **0.20.2**. ## Use with mlx", + "model_explanation_gemini": "A 4-bit quantized version of DeepSeek-R1-Distill-Qwen-32B converted to MLX format for efficient inference on Apple hardware. 
\n\nFeatures: \n- 4-bit quantization \n- Converted from DeepSeek-R1-Distill-Qwen-32B \n- Optimized for MLX (Apple hardware compatibility) \n\nComparison: \nThis model is a quantized MLX-compatible variant of DeepSeek-R1-Distill-Qwen-32" +} \ No newline at end of file diff --git a/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json new file mode 100644 index 0000000000000000000000000000000000000000..fca4c5eecf3be4c12582a372c2bba8eee5894272 --- /dev/null +++ b/model_data_json/mlx-community_DeepSeek-R1-Distill-Qwen-32B-bf16.json @@ -0,0 +1,14 @@ +{ + "model_id": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16", + "downloads": 73493, + "tags": [ + "mlx", + "safetensors", + "qwen2", + "base_model:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B", + "base_model:finetune:deepseek-ai/DeepSeek-R1-Distill-Qwen-32B", + "region:us" + ], + "description": "--- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B tags: - mlx --- # mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16 The Model mlx-community/DeepSeek-R1-Distill-Qwen-32B-bf16 was converted to MLX format from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B using mlx-lm version **0.21.1**. ## Use with mlx", + "model_explanation_gemini": "A distilled version of the Qwen-32B model converted to MLX format for efficient inference on Apple hardware. \n\n**Features:** \n- Converted from `deepseek-ai/DeepSeek-R1-Distill-Qwen-32B` to MLX format \n- Optimized for Apple devices via MLX \n- Uses BF16 precision \n\n**Comparison:** \nThis model is a distilled and MLX-optimized variant of the original Qwen-32B, offering improved" +} \ No newline at end of file diff --git a/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json b/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..d89099af725987c3bef124c1524114673b3ad734 --- /dev/null +++ b/model_data_json/mlx-community_Qwen2.5-Coder-32B-Instruct-4bit.json @@ -0,0 +1,28 @@ +{ + "model_id": "mlx-community/Qwen2.5-Coder-32B-Instruct-4bit", + "downloads": 74138, + "tags": [ + "transformers", + "safetensors", + "qwen2", + "text-generation", + "code", + "codeqwen", + "chat", + "qwen", + "qwen-coder", + "mlx", + "conversational", + "en", + "base_model:Qwen/Qwen2.5-Coder-32B-Instruct", + "base_model:quantized:Qwen/Qwen2.5-Coder-32B-Instruct", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "region:us" + ], + "description": "--- base_model: Qwen/Qwen2.5-Coder-32B-Instruct language: - en library_name: transformers license: apache-2.0 license_link: pipeline_tag: text-generation tags: - code - codeqwen - chat - qwen - qwen-coder - mlx --- # mlx-community/Qwen2.5-Coder-32B-Instruct-4bit The Model mlx-community/Qwen2.5-Coder-32B-Instruct-4bit was converted to MLX format from Qwen/Qwen2.5-Coder-32B-Instruct using mlx-lm version **0.19.3**. ## Use with mlx", + "model_explanation_gemini": "A 4-bit quantized version of Qwen2.5-Coder-32B-Instruct converted to MLX format for efficient text generation, specialized in code-related tasks. 
\n\n**Features:** \n- Base model: Qwen/Qwen2.5-Coder-32B-Instruct \n- Quantization: 4-bit \n- Format: MLX (optimized for Apple hardware) \n- Task: Text generation (focused on coding/chat) \n- License: Apache-" +} \ No newline at end of file diff --git a/model_data_json/mosaicml_mpt-7b-chat.json b/model_data_json/mosaicml_mpt-7b-chat.json new file mode 100644 index 0000000000000000000000000000000000000000..aa037002e962ce4fb43515bf20f96dc2ab1262b1 --- /dev/null +++ b/model_data_json/mosaicml_mpt-7b-chat.json @@ -0,0 +1,28 @@ +{ + "model_id": "mosaicml/mpt-7b-chat", + "downloads": 80028, + "tags": [ + "transformers", + "pytorch", + "mpt", + "text-generation", + "Composer", + "MosaicML", + "llm-foundry", + "custom_code", + "dataset:jeffwan/sharegpt_vicuna", + "dataset:Hello-SimpleAI/HC3", + "dataset:tatsu-lab/alpaca", + "dataset:Anthropic/hh-rlhf", + "dataset:victor123/evol_instruct_70k", + "arxiv:2205.14135", + "arxiv:2108.12409", + "arxiv:2010.04245", + "license:cc-by-nc-sa-4.0", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- license: cc-by-nc-sa-4.0 datasets: - jeffwan/sharegpt_vicuna - Hello-SimpleAI/HC3 - tatsu-lab/alpaca - Anthropic/hh-rlhf - victor123/evol_instruct_70k tags: - Composer - MosaicML - llm-foundry inference: false --- # MPT-7B-Chat MPT-7B-Chat is a chatbot-like model for dialogue generation. It was built by finetuning MPT-7B on the ShareGPT-Vicuna, HC3, Alpaca, HH-RLHF, and Evol-Instruct datasets. * License: _CC-By-NC-SA-4.0_ (non-commercial use only) * Demo on Hugging Face Spaces This model was trained by MosaicML and follows a modified decoder-only transformer architecture. ## Model Date May 5, 2023 ## Model License _CC-By-NC-SA-4.0_ (non-commercial use only) ## Documentation * Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs * Codebase (mosaicml/llm-foundry repo) * Questions: Feel free to contact us via the MosaicML Community Slack! ### Example Dialogue **SamIAm85**: >I want you to come up with a tweet based on this summary of the article: >\"Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. >MPT-7B is a transformer trained from scratch on 1T tokens of text and code. >It is open source and it matches the quality of LLaMA-7B. >MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. >Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. >For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens! **MPT-7B-Instruct**: >Here's my suggestion for a tweet: >\"Exciting news! @MosaicML has launched #MPT_7B - a new standard for open-source LLMs. With training on 1T tokens of text & code, it matches the quality of LLaMA-7B. Now you can train, fine-tune, and deploy your own private MPT models via their platform. Check out the full blog post here: **SamIAm85**: >Is there anything you want to say to MosaicML about this exciting news? **MPT-7B-Instruct**: > As an AI language model myself, I am very impressed with the advancements made by MosaicML in developing such high-quality open-source LLMs. > This new version of MPT-7B is truly impressive and I look forward to seeing what innovative applications developers will create using these powerful tools.
> Thank you for your hard work and dedication to advancing AI research and development. ## How to Use This model is best used with the MosaicML llm-foundry repository for training and finetuning. Note: This model requires that trust_remote_code=True be passed to the from_pretrained method. This is because we use a custom MPT model architecture that is not yet part of the Hugging Face transformers package. MPT includes options for many training efficiency features such as FlashAttention, ALiBi, QK LayerNorm, and more. To use the optimized triton implementation of FlashAttention, you can load the model on GPU (cuda:0) with attn_impl='triton' and with bfloat16 precision: Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or inference. For example: This model was trained with the EleutherAI/gpt-neox-20b tokenizer. The model can then be used, for example, within a text-generation pipeline. Note: when running Torch modules in lower precision, it is best practice to use the torch.autocast context manager. ## Model Description The architecture is a modification of a standard decoder-only transformer. The model has been modified from a standard transformer in the following ways: * It uses FlashAttention * It uses ALiBi (Attention with Linear Biases) and does not use positional embeddings * It does not use biases | Hyperparameter | Value | |----------------|-------| | n_parameters | 6.7B | | n_layers | 32 | | n_heads | 32 | | d_model | 4096 | | vocab size | 50432 | | sequence length | 2048 | ### Training Configuration This model was trained on 8 A100-80GBs for about 8.2 hours, followed by training for 6.7 hours on 32 A100-40GBs using the MosaicML Platform. The model was trained with sharded data parallelism using FSDP and used the AdamW optimizer. ## Limitations and Biases _The following language is modified from EleutherAI's GPT-NeoX-20B_ MPT-7B-Chat can produce factually incorrect output, and should not be relied on to produce factually accurate information. MPT-7B-Chat was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs. ## Acknowledgements This model was finetuned by Sam Havens and the MosaicML NLP team. ## Disclaimer The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes. ## MosaicML Platform If you're interested in training and deploying your own MPT or LLMs on the MosaicML Platform, sign up here. ## Citation Please cite this model using the following format:", + "model_explanation_gemini": "\"MPT-7B-Chat is a dialogue-generation model fine-tuned for chatbot-like interactions using datasets including ShareGPT-Vicuna, Alpaca, and HH-RLHF, featuring FlashAttention and ALiBi for extended context handling.\"\n\n**Features**: \n- Fine-tuned on conversational datasets (ShareGPT-Vicuna, HC3, Alpaca, etc.)
\n- Modified decoder-only transformer architecture \n- Supports FlashAttention and ALiBi for efficiency and longer sequences (" +} \ No newline at end of file diff --git a/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json b/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json new file mode 100644 index 0000000000000000000000000000000000000000..d20a0e0f101b9f42e1d9ec77995b008cd4a709b3 --- /dev/null +++ b/model_data_json/mrm8488_bert-spanish-cased-finetuned-ner.json @@ -0,0 +1,17 @@ +{ + "model_id": "mrm8488/bert-spanish-cased-finetuned-ner", + "downloads": 81872, + "tags": [ + "transformers", + "pytorch", + "jax", + "bert", + "token-classification", + "es", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: es thumbnail: --- # Spanish BERT (BETO) + NER This model is a version of the Spanish BERT cased (BETO) fine-tuned on NER-C for the **NER** downstream task. ## Details of the downstream task (NER) - Dataset - Dataset: CONLL Corpora ES I preprocessed the dataset and split it as train / dev (80/20) | Dataset | # Examples | | ---------------------- | ----- | | Train | 8.7 K | | Dev | 2.2 K | - Fine-tuned with the NER script provided by Huggingface - Labels covered: ## Metrics on evaluation set: | Metric | Score | | :----: | :---: | | F1 | **90.17** | | Precision | **89.86** | | Recall | **90.47** | ## Comparison: | Model | F1 score | Size (MB) | | :---: | :---: | :---: | | bert-base-spanish-wwm-cased (BETO) | 88.43 | 421 | | bert-spanish-cased-finetuned-ner (this one) | **90.17** | 420 | | Best Multilingual BERT | 87.38 | 681 | | TinyBERT-spanish-uncased-finetuned-ner | 70.00 | **55** | ## Model in action Fast usage with **pipelines**: > Created by Manuel Romero/@mrm8488 > Made with in Spain", + "model_explanation_gemini": "Fine-tuned Spanish BERT (BETO) model for named entity recognition (NER) with a 90.17 F1 score on the CONLL Corpora ES dataset. \n\n**Features:** \n- Language: Spanish (es) \n- Task: Named Entity Recognition (NER) \n- Base Model: Spanish BERT cased (BETO) \n- Dataset: CONLL Corpora ES (8.7K train, 2.2K dev) \n- Metrics" +} \ No newline at end of file diff --git a/model_data_json/naufalihsan_indonesian-sbert-large.json b/model_data_json/naufalihsan_indonesian-sbert-large.json new file mode 100644 index 0000000000000000000000000000000000000000..008364931251e2abd5ea11e2f05d8cd7406411d0 --- /dev/null +++ b/model_data_json/naufalihsan_indonesian-sbert-large.json @@ -0,0 +1,18 @@ +{ + "model_id": "naufalihsan/indonesian-sbert-large", + "downloads": 82294, + "tags": [ + "sentence-transformers", + "pytorch", + "bert", + "feature-extraction", + "sentence-similarity", + "transformers", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers --- # {MODEL_NAME} This is a sentence-transformers model: It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for tasks like clustering or semantic search.
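The Usage snippets in this card were stripped from the dump. As a minimal, hedged sketch of the sentence-transformers path described in the section that follows (the Indonesian example sentences are illustrative):

```python
# Hedged sketch of the stripped Usage (Sentence-Transformers) snippet;
# the example sentences are illustrative.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("naufalihsan/indonesian-sbert-large")
embeddings = model.encode(["Ini adalah contoh kalimat.", "Setiap kalimat menjadi satu vektor."])
print(embeddings.shape)  # expected (2, 1024) given the card's stated dimension
```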
## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings. ## Evaluation Results For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: ## Training The model was trained with the parameters: **DataLoader**: of length 360 with parameters: **Loss**: Parameters of the fit()-Method: ## Full Model Architecture ## Citing & Authors ", + "model_explanation_gemini": "Maps sentences and paragraphs to 1024-dimensional vectors for tasks like clustering and semantic search.\n\nModel Features: \n- Sentence/paragraph embeddings in 1024-dimensional space \n- Compatible with sentence-transformers and HuggingFace Transformers \n- Supports clustering and semantic search tasks \n\nComparative Explanation: \nNo explicit comparison to other models is provided in the description." +} \ No newline at end of file diff --git a/model_data_json/ncbi_MedCPT-Query-Encoder.json b/model_data_json/ncbi_MedCPT-Query-Encoder.json new file mode 100644 index 0000000000000000000000000000000000000000..c07957d85dc05e1e730c8ef315f1e1f8b8a9e788 --- /dev/null +++ b/model_data_json/ncbi_MedCPT-Query-Encoder.json @@ -0,0 +1,18 @@ +{ + "model_id": "ncbi/MedCPT-Query-Encoder", + "downloads": 76693, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "bert", + "feature-extraction", + "arxiv:2307.00589", + "license:other", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: other license_name: public-domain license_link: LICENSE --- # MedCPT Introduction **MedCPT generates embeddings of biomedical texts that can be used for semantic search (dense retrieval)**. The model contains two encoders: - MedCPT Query Encoder: computes the embeddings of short texts (e.g., questions, search queries, sentences). - MedCPT Article Encoder: computes the embeddings of articles (e.g., PubMed titles & abstracts). **This repo contains the MedCPT Query Encoder.** **MedCPT has been pre-trained on an unprecedented scale of 255M query-article pairs from PubMed search logs**, and has been shown to achieve state-of-the-art performance on several zero-shot biomedical IR datasets. In general, there are three use cases: 1. Query-to-article search with both encoders. 2. Query representation for clustering or query-to-query search with the query encoder. 3. Article representation for clustering or article-to-article search with the article encoder. For more details, please check out our paper (Bioinformatics, 2023). Please note that the released version is slightly different from the version reported in the paper. # Case 1. Using the MedCPT Query Encoder The output will be: These embeddings are also in the same space as those generated by the MedCPT article encoder. # Case 2. Semantically searching PubMed with your query We have provided the embeddings of all PubMed articles generated by the MedCPT article encoder at You can simply download these embeddings to search PubMed with your query. # Acknowledgments This work was supported by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine.
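The Case 1 snippet above was stripped from the dump. A minimal, hedged sketch of encoding queries with the query encoder, assuming [CLS]-token pooling and a maximum length of 64 (both are assumptions; the queries are illustrative):

```python
# Hedged sketch of "Case 1. Using the MedCPT Query Encoder"; CLS pooling
# and max_length=64 are assumptions, and the queries are illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ncbi/MedCPT-Query-Encoder")
model = AutoModel.from_pretrained("ncbi/MedCPT-Query-Encoder")

queries = ["diabetes treatment", "How to treat diabetes?"]
with torch.no_grad():
    enc = tokenizer(queries, truncation=True, padding=True, max_length=64, return_tensors="pt")
    embeds = model(**enc).last_hidden_state[:, 0, :]  # one [CLS] vector per query
print(embeds.shape)  # (2, 768)
```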
# Disclaimer This tool shows the results of research conducted in the Computational Biology Branch, NCBI/NLM. The information produced on this website is not intended for direct diagnostic use or medical decision-making without review and oversight by a clinical professional. Individuals should not change their health behavior solely on the basis of information produced on this website. NIH does not independently verify the validity or utility of the information produced by this tool. If you have questions about the information produced on this website, please see a health care professional. More information about NCBI's disclaimer policy is available. # Citation If you find this repo helpful, please cite MedCPT by:", + "model_explanation_gemini": "Generates embeddings for short biomedical texts to enable semantic search and clustering, specifically designed for queries and compatible with article embeddings from its counterpart model.\n\nModel Features: \n- Encodes short biomedical texts (queries, questions, sentences) \n- Pre-trained on 255M PubMed query-article pairs \n- Enables query-to-article, query-to-query, and clustering tasks \n- Embeddings align with those from the MedCPT Article Encoder \n- State-of-the-art zero-shot performance" +} \ No newline at end of file diff --git a/model_data_json/nickprock_multi-sentence-BERTino.json b/model_data_json/nickprock_multi-sentence-BERTino.json new file mode 100644 index 0000000000000000000000000000000000000000..ab71d3e69225d828e1992aa66c6a9b1a7615c9da --- /dev/null +++ b/model_data_json/nickprock_multi-sentence-BERTino.json @@ -0,0 +1,23 @@ +{ + "model_id": "nickprock/multi-sentence-BERTino", + "downloads": 82414, + "tags": [ + "sentence-transformers", + "onnx", + "safetensors", + "distilbert", + "feature-extraction", + "sentence-similarity", + "transformers", + "it", + "dataset:stsb_multi_mt", + "dataset:unicamp-dl/mmarco", + "license:mit", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers license: mit datasets: - stsb_multi_mt - unicamp-dl/mmarco language: - it library_name: sentence-transformers --- # {multi-sentence-BERTino} This is a sentence-transformers model: It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. This model is trained from indigo-ai/BERTino using mmarco Italian (200K) and stsb Italian. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (FastEmbed) Using this model becomes easy when you have FastEmbed installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings.
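The transformer-then-pool flow just described can be sketched as follows; mean pooling over the attention mask is assumed here (a common default for sentence-transformers checkpoints, not confirmed by the card), and the Italian input sentence is illustrative:

```python
# Hedged sketch of the transformers + pooling flow described above;
# mean pooling is an assumed choice.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nickprock/multi-sentence-BERTino")
model = AutoModel.from_pretrained("nickprock/multi-sentence-BERTino")

enc = tokenizer(["Una frase di esempio."], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

mask = enc["attention_mask"].unsqueeze(-1).float()         # (batch, seq, 1)
emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1)  # average real tokens only
print(emb.shape)  # (1, 768)
```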
## Evaluation Results For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: ## Training The model was trained with the parameters: **DataLoader**: of length 31250 with parameters: **Loss**: with parameters: **DataLoader**: of length 360 with parameters: **Loss**: **DataLoader**: of length 31250 with parameters: **Loss**: with parameters: Parameters of the fit()-Method: ## Full Model Architecture ## Citing & Authors ", + "model_explanation_gemini": "Maps sentences and paragraphs to 768-dimensional vectors for tasks like clustering and semantic search, trained on Italian datasets (mmarco and stsb) using BERTino.\n\nFeatures: \n- Sentence/paragraph embeddings (768-dimensional) \n- Supports clustering and semantic search \n- Trained on Italian datasets (mmarco, stsb) \n- Based on BERTino architecture \n\nComparison: \nSimilar to other sentence-transformers but specifically fine-tuned for Italian language tasks, leveraging BERT" +} \ No newline at end of file diff --git a/model_data_json/nomic-ai_modernbert-embed-base.json b/model_data_json/nomic-ai_modernbert-embed-base.json new file mode 100644 index 0000000000000000000000000000000000000000..574c8da11e177725bf79c78d7046303c42d564a5 --- /dev/null +++ b/model_data_json/nomic-ai_modernbert-embed-base.json @@ -0,0 +1,25 @@ +{ + "model_id": "nomic-ai/modernbert-embed-base", + "downloads": 77956, + "tags": [ + "sentence-transformers", + "onnx", + "safetensors", + "modernbert", + "feature-extraction", + "sentence-similarity", + "mteb", + "transformers.js", + "en", + "arxiv:2402.01613", + "base_model:answerdotai/ModernBERT-base", + "base_model:finetune:answerdotai/ModernBERT-base", + "license:apache-2.0", + "model-index", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- pipeline_tag: sentence-similarity tags: - sentence-transformers - feature-extraction - sentence-similarity - mteb - transformers.js model-index: - name: binarize_False results: - task: type: Classification dataset: type: None name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 78.13432835820896 - type: ap value: 42.190424731303246 - type: f1 value: 72.34446401534811 - task: type: Classification dataset: type: None name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 93.093825 - type: ap value: 90.03727505544286 - type: f1 value: 93.0874055138833 - task: type: Classification dataset: type: None name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 48.428000000000004 - type: f1 value: 47.74311520203536 - task: type: Retrieval dataset: type: None name: MTEB ArguAna config: default split: test revision: c22ab2a51041ffd869aaddef7af8d8215647e41a metrics: - type: map_at_1 value: 23.898 - type: map_at_10 value: 39.775 - type: map_at_100 value: 40.827000000000005 - type: map_at_1000 value: 40.837 - type: map_at_20 value: 40.604 - type: map_at_3 value: 34.519 - type: map_at_5 value: 37.307 - type: mrr_at_1 value: 24.395 - type: mrr_at_10 value: 39.963 - type: mrr_at_100 value: 41.014 - type: mrr_at_1000 value: 41.024 - type: mrr_at_20 value: 40.791 - type: mrr_at_3 value: 34.732 - type: mrr_at_5 value: 37.480999999999995 - type: ndcg_at_1 value: 23.898 - type: ndcg_at_10 value: 48.962 - type: ndcg_at_100 value: 53.386 - type: 
ndcg_at_1000 value: 53.634 - type: ndcg_at_20 value: 51.898999999999994 - type: ndcg_at_3 value: 38.034 - type: ndcg_at_5 value: 43.036 - type: precision_at_1 value: 23.898 - type: precision_at_10 value: 7.852 - type: precision_at_100 value: 0.9769999999999999 - type: precision_at_1000 value: 0.1 - type: precision_at_20 value: 4.4990000000000006 - type: precision_at_3 value: 16.073999999999998 - type: precision_at_5 value: 12.063 - type: recall_at_1 value: 23.898 - type: recall_at_10 value: 78.521 - type: recall_at_100 value: 97.724 - type: recall_at_1000 value: 99.644 - type: recall_at_20 value: 89.972 - type: recall_at_3 value: 48.222 - type: recall_at_5 value: 60.313 - task: type: Clustering dataset: type: None name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 47.69067314293749 - type: v_measures value: [0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413]
0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 
0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 
0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 
1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 
0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 
0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413, 0.4953006738713271, 0.500982950617211, 0.490168788349858, 0.4924060458428337, 0.475176328561399, 0.47446297663785564, 0.46948807073019405, 0.4772028638329531, 0.48735189935310713, 0.48641173887761663, 0.5575029526712674, 0.5574020390232136, 0.5536066904942645, 0.5536169413675474, 0.5566938602585987, 0.5561143054736898, 0.561846457174852, 0.5511643632282144, 0.5514762015499715, 0.551824471283655, 0.5148077891863135, 0.29015461701593837, 0.4430422977323321, 0.40857527197890686, 0.3479983114229163, 0.27582001934225003, 0.29595564003512503, 0.22528676611734755, 0.3073271865740206, 1.0, 0.2749401557058413] - task: type: Clustering dataset: type: None name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 38.0916537995626 - type: v_measures value: [0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 
0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 
0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 
0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 
0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 
0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 
0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 
0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 
0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377, 0.37814352051854533, 0.39235658929084877, 0.3871170834588581, 0.4042678213739614, 0.3918486409557737, 0.38473003463452093, 0.35622070034791886, 0.3911472272128115, 0.3986923912337426, 0.39040109467533013, 0.4370949482641744, 0.4414023630938724, 0.4351473848532441, 0.4401176389499172, 0.4423731097742471, 0.438309696145818, 0.43410597641884624, 0.43900908630646696, 0.44081346534023286, 0.4386000014888906, 0.4047539306032343, 0.21697191913450847, 0.29241358200068185, 0.3390740154458194, 0.2793967439904601, 0.20383792346854981, 
0.23904022437429004, 0.14733601126565044, 0.22946888289524586, 1.0, 0.19422067034794377] - task: type: Reranking dataset: type: None name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map value: 62.33195643912506 - type: mrr value: 76.43978366970057 - task: type: STS dataset: type: None name: MTEB
BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 81.20285894915236 - type: cos_sim_spearman value: 78.16322678527897 - type: euclidean_pearson value: 80.6118408638417 - type: euclidean_spearman value: 78.19033583671204 - type: manhattan_pearson value: 80.41282660275819 - type: manhattan_spearman value: 77.98611431591628 - task: type: Classification dataset: type: None name: MTEB Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 85.25324675324676 - type: f1 value: 85.19854235582687 - task: type: Clustering dataset: type: None name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: - type: v_measure value: 39.65216461057432 - type: v_measures value: [0.409550367831406, 0.3943451642663655, 0.38843873187080014, 0.40032616646112934, 0.3956833025503425, 0.3842865397042604, 0.3950585966936957, 0.41669832667987455, 0.39790986378306964, 0.3829194012164885] - task: type: Clustering dataset: type: None name: MTEB BiorxivClusteringS2S config: default split: test revision:
258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 33.28787287895752 - type: v_measures value: [0.3235019092705102, 0.34053753555843735, 0.32485572754337366, 0.3149662563474906, 0.3326837187664875, 0.3229632335470733, 0.33078383561261365, 0.35111148393509534, 0.33383133843449825, 0.35355224888017306] - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: f46a197baaae43b4f621051089b82a364682dfeb metrics: - type: map_at_1 value: 32.677 - type: map_at_10 value: 43.739 - type: map_at_100 value: 45.152 - type: map_at_1000 value: 45.279 - type: map_at_20 value: 44.553 - type: map_at_3 value: 40.321 - type: map_at_5 value: 42.201 - type: mrr_at_1 value: 40.2 - type: mrr_at_10 value: 49.755 - type: mrr_at_100 value: 50.468 - type: mrr_at_1000 value: 50.513 - type:
mrr_at_20 value: 50.192 - type: mrr_at_3 value: 47.163 - type: mrr_at_5 value: 48.686 - type: ndcg_at_1 value: 40.2 - type: ndcg_at_10 value: 49.963 - type: ndcg_at_100 value: 54.978 - type: ndcg_at_1000 value: 56.979 - type: ndcg_at_20 value: 51.983000000000004 - type: ndcg_at_3 value: 45.086999999999996 - type: ndcg_at_5 value: 47.309 - type: precision_at_1 value: 40.2 - type: precision_at_10 value: 9.328 - type: precision_at_100 value: 1.443 - type: precision_at_1000 value: 0.19 - type: precision_at_20 value: 5.558 - type: precision_at_3 value: 21.364 - type: precision_at_5 value: 15.222 - type: recall_at_1 value: 32.677 - type: recall_at_10 value: 61.71 - type: recall_at_100 value: 82.431 - type: recall_at_1000 value: 94.896 - type: recall_at_20 value: 68.73700000000001 - type: recall_at_3 value: 47.431 - type: recall_at_5 value: 53.739000000000004 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: ad9991cb51e31e31e430383c75ffb2885547b5f0 metrics: - type: map_at_1 value: 32.71 - type: map_at_10 value: 43.297000000000004 - type: map_at_100 value: 44.607 - type: map_at_1000 value: 44.729 - type: map_at_20 value: 44.013999999999996 - type: map_at_3 value: 40.213 - type: map_at_5 value: 42.004000000000005 - type: mrr_at_1 value: 40.892 - type: mrr_at_10 value: 49.394 - type: mrr_at_100 value: 50.005 - type: mrr_at_1000 value: 50.043000000000006 - type: mrr_at_20 value: 49.764 - type: mrr_at_3 value: 47.134 - type: mrr_at_5 value: 48.522 - type: ndcg_at_1 value: 40.892 - type: ndcg_at_10 value: 49.047000000000004 - type: ndcg_at_100 value: 53.266999999999996 - type: ndcg_at_1000 value: 55.096999999999994 - type: ndcg_at_20 value: 50.707 - type: ndcg_at_3 value: 44.896 - type: ndcg_at_5 value: 46.983000000000004 - type: precision_at_1 value: 40.892 - type: precision_at_10 value: 9.293 - type: precision_at_100 value: 1.473 - type: precision_at_1000 value: 0.192 - type: precision_at_20 value: 5.446 - type: precision_at_3 value: 21.592 - type: precision_at_5 value: 15.540999999999999 - type: recall_at_1 value: 32.71 - type: recall_at_10 value: 58.592999999999996 - type: recall_at_100 value: 76.242 - type: recall_at_1000 value: 87.717 - type: recall_at_20 value: 64.646 - type: recall_at_3 value: 46.253 - type: recall_at_5 value: 51.946999999999996 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: 4885aa143210c98657558c04aaf3dc47cfb54340 metrics: - type: map_at_1 value: 41.644999999999996 - type: map_at_10 value: 53.825 - type: map_at_100 value: 54.82 - type: map_at_1000 value: 54.87499999999999 - type: map_at_20 value: 54.43 - type: map_at_3 value: 50.705 - type: map_at_5 value: 52.501 - type: mrr_at_1 value: 47.524 - type: mrr_at_10 value: 57.260999999999996 - type: mrr_at_100 value: 57.902 - type: mrr_at_1000 value: 57.931999999999995 - type: mrr_at_20 value: 57.689 - type: mrr_at_3 value: 55.089 - type: mrr_at_5 value: 56.38999999999999 - type: ndcg_at_1 value: 47.524 - type: ndcg_at_10 value: 59.41499999999999 - type: ndcg_at_100 value: 63.258 - type: ndcg_at_1000 value: 64.376 - type: ndcg_at_20 value: 61.149 - type: ndcg_at_3 value: 54.381 - type: ndcg_at_5 value: 56.89999999999999 - type: precision_at_1 value: 47.524 - type: precision_at_10 value: 9.386 - type: precision_at_100 value: 1.221 - type: precision_at_1000 value: 0.136 - type: precision_at_20 value: 5.223 - type: precision_at_3 value: 24.096 - type: precision_at_5 
value: 16.364 - type: recall_at_1 value: 41.644999999999996 - type: recall_at_10 value: 72.386 - type: recall_at_100 value: 88.794 - type: recall_at_1000 value: 96.75399999999999 - type: recall_at_20 value: 78.74 - type: recall_at_3 value: 59.028000000000006 - type: recall_at_5 value: 65.197 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: 5003b3064772da1887988e05400cf3806fe491f2 metrics: - type: map_at_1 value: 28.648 - type: map_at_10 value: 36.388999999999996 - type: map_at_100 value: 37.372 - type: map_at_1000 value: 37.457 - type: map_at_20 value: 36.912 - type: map_at_3 value: 34.076 - type: map_at_5 value: 35.415 - type: mrr_at_1 value: 30.508000000000003 - type: mrr_at_10 value: 38.132 - type: mrr_at_100 value: 39.04 - type: mrr_at_1000 value: 39.106 - type: mrr_at_20 value: 38.643 - type: mrr_at_3 value: 35.876000000000005 - type: mrr_at_5 value: 37.208999999999996 - type: ndcg_at_1 value: 30.508000000000003 - type: ndcg_at_10 value: 40.762 - type: ndcg_at_100 value: 45.732 - type: ndcg_at_1000 value: 47.799 - type: ndcg_at_20 value: 42.591 - type: ndcg_at_3 value: 36.266999999999996 - type: ndcg_at_5 value: 38.58 - type: precision_at_1 value: 30.508000000000003 - type: precision_at_10 value: 6.010999999999999 - type: precision_at_100 value: 0.897 - type: precision_at_1000 value: 0.11100000000000002 - type: precision_at_20 value: 3.412 - type: precision_at_3 value: 14.991 - type: precision_at_5 value: 10.328 - type: recall_at_1 value: 28.648 - type: recall_at_10 value: 52.342999999999996 - type: recall_at_100 value: 75.268 - type: recall_at_1000 value: 90.641 - type: recall_at_20 value: 59.303 - type: recall_at_3 value: 40.447 - type: recall_at_5 value: 46.117000000000004 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: 90fceea13679c63fe563ded68f3b6f06e50061de metrics: - type: map_at_1 value: 18.476 - type: map_at_10 value: 27.148 - type: map_at_100 value: 28.317999999999998 - type: map_at_1000 value: 28.427999999999997 - type: map_at_20 value: 27.764 - type: map_at_3 value: 24.801000000000002 - type: map_at_5 value: 26.133 - type: mrr_at_1 value: 22.886 - type: mrr_at_10 value: 31.741000000000003 - type: mrr_at_100 value: 32.708 - type: mrr_at_1000 value: 32.769 - type: mrr_at_20 value: 32.296 - type: mrr_at_3 value: 29.498 - type: mrr_at_5 value: 30.773 - type: ndcg_at_1 value: 22.886 - type: ndcg_at_10 value: 32.265 - type: ndcg_at_100 value: 37.829 - type: ndcg_at_1000 value: 40.558 - type: ndcg_at_20 value: 34.372 - type: ndcg_at_3 value: 28.105000000000004 - type: ndcg_at_5 value: 30.04 - type: precision_at_1 value: 22.886 - type: precision_at_10 value: 5.808 - type: precision_at_100 value: 0.985 - type: precision_at_1000 value: 0.13699999999999998 - type: precision_at_20 value: 3.495 - type: precision_at_3 value: 13.639999999999999 - type: precision_at_5 value: 9.577 - type: recall_at_1 value: 18.476 - type: recall_at_10 value: 43.442 - type: recall_at_100 value: 67.376 - type: recall_at_1000 value: 86.874 - type: recall_at_20 value: 51.038 - type: recall_at_3 value: 31.785999999999998 - type: recall_at_5 value: 36.858999999999995 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4 metrics: - type: map_at_1 value: 29.098000000000003 - type: map_at_10 value: 38.97 - type: map_at_100 value: 
40.293 - type: map_at_1000 value: 40.397 - type: map_at_20 value: 39.778999999999996 - type: map_at_3 value: 35.723 - type: map_at_5 value: 37.519999999999996 - type: mrr_at_1 value: 35.515 - type: mrr_at_10 value: 44.55 - type: mrr_at_100 value: 45.37 - type: mrr_at_1000 value: 45.412 - type: mrr_at_20 value: 45.054 - type: mrr_at_3 value: 41.835 - type: mrr_at_5 value: 43.356 - type: ndcg_at_1 value: 35.515 - type: ndcg_at_10 value: 44.91 - type: ndcg_at_100 value: 50.27700000000001 - type: ndcg_at_1000 value: 52.215 - type: ndcg_at_20 value: 47.235 - type: ndcg_at_3 value: 39.505 - type: ndcg_at_5 value: 42.016 - type: precision_at_1 value: 35.515 - type: precision_at_10 value: 8.152 - type: precision_at_100 value: 1.262 - type: precision_at_1000 value: 0.16 - type: precision_at_20 value: 4.851 - type: precision_at_3 value: 18.447 - type: precision_at_5 value: 13.321 - type: recall_at_1 value: 29.098000000000003 - type: recall_at_10 value: 57.115 - type: recall_at_100 value: 79.467 - type: recall_at_1000 value: 92.162 - type: recall_at_20 value: 65.161 - type: recall_at_3 value: 42.254000000000005 - type: recall_at_5 value: 48.415 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: 6184bc1440d2dbc7612be22b50686b8826d22b32 metrics: - type: map_at_1 value: 27.372000000000003 - type: map_at_10 value: 37.781 - type: map_at_100 value: 39.128 - type: map_at_1000 value: 39.238 - type: map_at_20 value: 38.592 - type: map_at_3 value: 34.782999999999994 - type: map_at_5 value: 36.466 - type: mrr_at_1 value: 33.904 - type: mrr_at_10 value: 43.15 - type: mrr_at_100 value: 44.049 - type: mrr_at_1000 value: 44.107 - type: mrr_at_20 value: 43.721 - type: mrr_at_3 value: 40.677 - type: mrr_at_5 value: 42.19 - type: ndcg_at_1 value: 33.904 - type: ndcg_at_10 value: 43.527 - type: ndcg_at_100 value: 49.004999999999995 - type: ndcg_at_1000 value: 51.276999999999994 - type: ndcg_at_20 value: 45.988 - type: ndcg_at_3 value: 38.824999999999996 - type: ndcg_at_5 value: 41.04 - type: precision_at_1 value: 33.904 - type: precision_at_10 value: 7.854 - type: precision_at_100 value: 1.2309999999999999 - type: precision_at_1000 value: 0.16 - type: precision_at_20 value: 4.692 - type: precision_at_3 value: 18.531 - type: precision_at_5 value: 13.150999999999998 - type: recall_at_1 value: 27.372000000000003 - type: recall_at_10 value: 55.245999999999995 - type: recall_at_100 value: 78.278 - type: recall_at_1000 value: 93.718 - type: recall_at_20 value: 64.095 - type: recall_at_3 value: 41.665 - type: recall_at_5 value: 47.632000000000005 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: f46a197baaae43b4f621051089b82a364682dfeb metrics: - type: map_at_1 value: 27.734166666666667 - type: map_at_10 value: 36.858 - type: map_at_100 value: 38.043833333333325 - type: map_at_1000 value: 38.15541666666667 - type: map_at_20 value: 37.521249999999995 - type: map_at_3 value: 34.07658333333333 - type: map_at_5 value: 35.62683333333333 - type: mrr_at_1 value: 32.676249999999996 - type: mrr_at_10 value: 40.999 - type: mrr_at_100 value: 41.835 - type: mrr_at_1000 value: 41.8895 - type: mrr_at_20 value: 41.4865 - type: mrr_at_3 value: 38.645 - type: mrr_at_5 value: 39.99725000000001 - type: ndcg_at_1 value: 32.676249999999996 - type: ndcg_at_10 value: 42.08016666666666 - type: ndcg_at_100 value: 47.082750000000004 - type: ndcg_at_1000 value: 49.276583333333335 - type: 
ndcg_at_20 value: 44.04808333333334 - type: ndcg_at_3 value: 37.43375 - type: ndcg_at_5 value: 39.623000000000005 - type: precision_at_1 value: 32.676249999999996 - type: precision_at_10 value: 7.271 - type: precision_at_100 value: 1.1458333333333333 - type: precision_at_1000 value: 0.152 - type: precision_at_20 value: 4.282916666666667 - type: precision_at_3 value: 17.061416666666666 - type: precision_at_5 value: 12.05466666666667 - type: recall_at_1 value: 27.734166666666667 - type: recall_at_10 value: 53.33574999999999 - type: recall_at_100 value: 75.16275 - type: recall_at_1000 value: 90.34891666666665 - type: recall_at_20 value: 60.4935 - type: recall_at_3 value: 40.377916666666664 - type: recall_at_5 value: 46.0195 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a metrics: - type: map_at_1 value: 25.653 - type: map_at_10 value: 32.151 - type: map_at_100 value: 33.152 - type: map_at_1000 value: 33.243 - type: map_at_20 value: 32.717 - type: map_at_3 value: 30.287 - type: map_at_5 value: 31.25 - type: mrr_at_1 value: 28.988000000000003 - type: mrr_at_10 value: 35.131 - type: mrr_at_100 value: 36.002 - type: mrr_at_1000 value: 36.069 - type: mrr_at_20 value: 35.61 - type: mrr_at_3 value: 33.308 - type: mrr_at_5 value: 34.259 - type: ndcg_at_1 value: 28.988000000000003 - type: ndcg_at_10 value: 35.988 - type: ndcg_at_100 value: 40.764 - type: ndcg_at_1000 value: 43.112 - type: ndcg_at_20 value: 37.852999999999994 - type: ndcg_at_3 value: 32.562000000000005 - type: ndcg_at_5 value: 33.983000000000004 - type: precision_at_1 value: 28.988000000000003 - type: precision_at_10 value: 5.475 - type: precision_at_100 value: 0.8500000000000001 - type: precision_at_1000 value: 0.11199999999999999 - type: precision_at_20 value: 3.229 - type: precision_at_3 value: 13.905999999999999 - type: precision_at_5 value: 9.386999999999999 - type: recall_at_1 value: 25.653 - type: recall_at_10 value: 44.962 - type: recall_at_100 value: 66.405 - type: recall_at_1000 value: 83.88799999999999 - type: recall_at_20 value: 51.79899999999999 - type: recall_at_3 value: 35.144999999999996 - type: recall_at_5 value: 38.814 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: 46989137a86843e03a6195de44b09deda022eec7 metrics: - type: map_at_1 value: 17.825 - type: map_at_10 value: 25.592 - type: map_at_100 value: 26.613999999999997 - type: map_at_1000 value: 26.734 - type: map_at_20 value: 26.115 - type: map_at_3 value: 23.119 - type: map_at_5 value: 24.54 - type: mrr_at_1 value: 21.335 - type: mrr_at_10 value: 29.165000000000003 - type: mrr_at_100 value: 30.049 - type: mrr_at_1000 value: 30.121 - type: mrr_at_20 value: 29.639 - type: mrr_at_3 value: 26.863999999999997 - type: mrr_at_5 value: 28.185 - type: ndcg_at_1 value: 21.335 - type: ndcg_at_10 value: 30.357 - type: ndcg_at_100 value: 35.410000000000004 - type: ndcg_at_1000 value: 38.24 - type: ndcg_at_20 value: 32.08 - type: ndcg_at_3 value: 25.95 - type: ndcg_at_5 value: 28.081 - type: precision_at_1 value: 21.335 - type: precision_at_10 value: 5.506 - type: precision_at_100 value: 0.928 - type: precision_at_1000 value: 0.135 - type: precision_at_20 value: 3.2550000000000003 - type: precision_at_3 value: 12.239 - type: precision_at_5 value: 8.885 - type: recall_at_1 value: 17.825 - type: recall_at_10 value: 41.105999999999995 - type: recall_at_100 value: 64.17 - type: 
recall_at_1000 value: 84.19200000000001 - type: recall_at_20 value: 47.497 - type: recall_at_3 value: 28.862 - type: recall_at_5 value: 34.348 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53 metrics: - type: map_at_1 value: 29.435 - type: map_at_10 value: 38.261 - type: map_at_100 value: 39.242 - type: map_at_1000 value: 39.347 - type: map_at_20 value: 38.742 - type: map_at_3 value: 35.457 - type: map_at_5 value: 37.043 - type: mrr_at_1 value: 34.235 - type: mrr_at_10 value: 42.24 - type: mrr_at_100 value: 42.988 - type: mrr_at_1000 value: 43.043 - type: mrr_at_20 value: 42.613 - type: mrr_at_3 value: 39.832 - type: mrr_at_5 value: 41.227000000000004 - type: ndcg_at_1 value: 34.235 - type: ndcg_at_10 value: 43.384 - type: ndcg_at_100 value: 48.14 - type: ndcg_at_1000 value: 50.414 - type: ndcg_at_20 value: 44.913 - type: ndcg_at_3 value: 38.454 - type: ndcg_at_5 value: 40.776 - type: precision_at_1 value: 34.235 - type: precision_at_10 value: 7.164 - type: precision_at_100 value: 1.065 - type: precision_at_1000 value: 0.13699999999999998 - type: precision_at_20 value: 4.021 - type: precision_at_3 value: 17.226 - type: precision_at_5 value: 12.071 - type: recall_at_1 value: 29.435 - type: recall_at_10 value: 54.93900000000001 - type: recall_at_100 value: 76.176 - type: recall_at_1000 value: 91.989 - type: recall_at_20 value: 60.451 - type: recall_at_3 value: 41.332 - type: recall_at_5 value: 47.316 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: 160c094312a0e1facb97e55eeddb698c0abe3571 metrics: - type: map_at_1 value: 25.605 - type: map_at_10 value: 34.162 - type: map_at_100 value: 35.827999999999996 - type: map_at_1000 value: 36.04 - type: map_at_20 value: 35.016000000000005 - type: map_at_3 value: 30.984 - type: map_at_5 value: 32.717 - type: mrr_at_1 value: 30.435000000000002 - type: mrr_at_10 value: 38.681 - type: mrr_at_100 value: 39.656000000000006 - type: mrr_at_1000 value: 39.71 - type: mrr_at_20 value: 39.208999999999996 - type: mrr_at_3 value: 35.903 - type: mrr_at_5 value: 37.454 - type: ndcg_at_1 value: 30.435000000000002 - type: ndcg_at_10 value: 39.916000000000004 - type: ndcg_at_100 value: 45.958 - type: ndcg_at_1000 value: 48.449999999999996 - type: ndcg_at_20 value: 42.085 - type: ndcg_at_3 value: 34.696 - type: ndcg_at_5 value: 37.147000000000006 - type: precision_at_1 value: 30.435000000000002 - type: precision_at_10 value: 7.767 - type: precision_at_100 value: 1.547 - type: precision_at_1000 value: 0.23800000000000002 - type: precision_at_20 value: 4.941 - type: precision_at_3 value: 16.073999999999998 - type: precision_at_5 value: 11.937000000000001 - type: recall_at_1 value: 25.605 - type: recall_at_10 value: 50.654999999999994 - type: recall_at_100 value: 77.609 - type: recall_at_1000 value: 93.518 - type: recall_at_20 value: 58.845000000000006 - type: recall_at_3 value: 36.272 - type: recall_at_5 value: 42.596000000000004 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4 metrics: - type: map_at_1 value: 23.666 - type: map_at_10 value: 30.980999999999998 - type: map_at_100 value: 32.0 - type: map_at_1000 value: 32.098 - type: map_at_20 value: 31.621 - type: map_at_3 value: 28.449999999999996 - type: map_at_5 value: 29.731999999999996 - 
type: mrr_at_1 value: 25.692999999999998 - type: mrr_at_10 value: 32.788000000000004 - type: mrr_at_100 value: 33.783 - type: mrr_at_1000 value: 33.849000000000004 - type: mrr_at_20 value: 33.408 - type: mrr_at_3 value: 30.561 - type: mrr_at_5 value: 31.716 - type: ndcg_at_1 value: 25.692999999999998 - type: ndcg_at_10 value: 35.428 - type: ndcg_at_100 value: 40.375 - type: ndcg_at_1000 value: 42.802 - type: ndcg_at_20 value: 37.621 - type: ndcg_at_3 value: 30.476999999999997 - type: ndcg_at_5 value: 32.621 - type: precision_at_1 value: 25.692999999999998 - type: precision_at_10 value: 5.508 - type: precision_at_100 value: 0.848 - type: precision_at_1000 value: 0.116 - type: precision_at_20 value: 3.272 - type: precision_at_3 value: 12.631 - type: precision_at_5 value: 8.872 - type: recall_at_1 value: 23.666 - type: recall_at_10 value: 47.532000000000004 - type: recall_at_100 value: 69.73700000000001 - type: recall_at_1000 value: 87.83800000000001 - type: recall_at_20 value: 55.61000000000001 - type: recall_at_3 value: 34.06 - type: recall_at_5 value: 39.254 - task: type: Retrieval dataset: type: None name: MTEB ClimateFEVER config: default split: test revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380 metrics: - type: map_at_1 value: 16.337 - type: map_at_10 value: 26.488 - type: map_at_100 value: 28.415000000000003 - type: map_at_1000 value: 28.584 - type: map_at_20 value: 27.557 - type: map_at_3 value: 22.665 - type: map_at_5 value: 24.542 - type: mrr_at_1 value: 36.417 - type: mrr_at_10 value: 48.001 - type: mrr_at_100 value: 48.784 - type: mrr_at_1000 value: 48.809000000000005 - type: mrr_at_20 value: 48.507 - type: mrr_at_3 value: 45.103 - type: mrr_at_5 value: 46.843 - type: ndcg_at_1 value: 36.417 - type: ndcg_at_10 value: 35.67 - type: ndcg_at_100 value: 42.716 - type: ndcg_at_1000 value: 45.639 - type: ndcg_at_20 value: 38.471 - type: ndcg_at_3 value: 30.444 - type: ndcg_at_5 value: 32.004 - type: precision_at_1 value: 36.417 - type: precision_at_10 value: 10.73 - type: precision_at_100 value: 1.833 - type: precision_at_1000 value: 0.23800000000000002 - type: precision_at_20 value: 6.596 - type: precision_at_3 value: 22.302 - type: precision_at_5 value: 16.521 - type: recall_at_1 value: 16.337 - type: recall_at_10 value: 40.671 - type: recall_at_100 value: 64.55300000000001 - type: recall_at_1000 value: 80.934 - type: recall_at_20 value: 48.381 - type: recall_at_3 value: 27.279999999999998 - type: recall_at_5 value: 32.621 - task: type: Retrieval dataset: type: None name: MTEB DBPedia config: default split: test revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 metrics: - type: map_at_1 value: 9.056000000000001 - type: map_at_10 value: 19.419 - type: map_at_100 value: 27.069 - type: map_at_1000 value: 28.666000000000004 - type: map_at_20 value: 22.434 - type: map_at_3 value: 13.895 - type: map_at_5 value: 16.121 - type: mrr_at_1 value: 69.0 - type: mrr_at_10 value: 75.804 - type: mrr_at_100 value: 76.117 - type: mrr_at_1000 value: 76.125 - type: mrr_at_20 value: 76.009 - type: mrr_at_3 value: 74.375 - type: mrr_at_5 value: 75.4 - type: ndcg_at_1 value: 57.49999999999999 - type: ndcg_at_10 value: 41.495 - type: ndcg_at_100 value: 45.208 - type: ndcg_at_1000 value: 52.221 - type: ndcg_at_20 value: 40.617999999999995 - type: ndcg_at_3 value: 46.592 - type: ndcg_at_5 value: 43.559 - type: precision_at_1 value: 69.0 - type: precision_at_10 value: 32.574999999999996 - type: precision_at_100 value: 10.205 - type: precision_at_1000 value: 2.036 - type: precision_at_20 value: 24.687 - 
type: precision_at_3 value: 49.75 - type: precision_at_5 value: 42.0 - type: recall_at_1 value: 9.056000000000001 - type: recall_at_10 value: 24.866 - type: recall_at_100 value: 50.097 - type: recall_at_1000 value: 72.038 - type: recall_at_20 value: 31.858999999999998 - type: recall_at_3 value: 15.096000000000002 - type: recall_at_5 value: 18.548000000000002 - task: type: Classification dataset: type: None name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 48.259999999999984 - type: f1 value: 43.1498589523159 - task: type: Retrieval dataset: type: None name: MTEB FEVER config: default split: test revision: bea83ef9e8fb933d90a2f1d5515737465d613e12 metrics: - type: map_at_1 value: 74.798 - type: map_at_10 value: 83.454 - type: map_at_100 value: 83.623 - type: map_at_1000 value: 83.635 - type: map_at_20 value: 83.55 - type: map_at_3 value: 82.392 - type: map_at_5 value: 83.167 - type: mrr_at_1 value: 80.708 - type: mrr_at_10 value: 88.377 - type: mrr_at_100 value: 88.411 - type: mrr_at_1000 value: 88.411 - type: mrr_at_20 value: 88.402 - type: mrr_at_3 value: 87.646 - type: mrr_at_5 value: 88.232 - type: ndcg_at_1 value: 80.708 - type: ndcg_at_10 value: 87.35199999999999 - type: ndcg_at_100 value: 87.91600000000001 - type: ndcg_at_1000 value: 88.12299999999999 - type: ndcg_at_20 value: 87.593 - type: ndcg_at_3 value: 85.738 - type: ndcg_at_5 value: 86.845 - type: precision_at_1 value: 80.708 - type: precision_at_10 value: 10.432 - type: precision_at_100 value: 1.091 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_20 value: 5.296 - type: precision_at_3 value: 32.778 - type: precision_at_5 value: 20.399 - type: recall_at_1 value: 74.798 - type: recall_at_10 value: 94.459 - type: recall_at_100 value: 96.614 - type: recall_at_1000 value: 97.868 - type: recall_at_20 value: 95.254 - type: recall_at_3 value: 90.144 - type: recall_at_5 value: 92.965 - task: type: Retrieval dataset: type: None name: MTEB FiQA2018 config: default split: test revision: 27a168819829fe9bcd655c2df245fb19452e8e06 metrics: - type: map_at_1 value: 20.008 - type: map_at_10 value: 32.731 - type: map_at_100 value: 34.467999999999996 - type: map_at_1000 value: 34.643 - type: map_at_20 value: 33.717000000000006 - type: map_at_3 value: 28.427999999999997 - type: map_at_5 value: 30.788 - type: mrr_at_1 value: 40.586 - type: mrr_at_10 value: 49.056 - type: mrr_at_100 value: 49.887 - type: mrr_at_1000 value: 49.929 - type: mrr_at_20 value: 49.552 - type: mrr_at_3 value: 46.785 - type: mrr_at_5 value: 48.004000000000005 - type: ndcg_at_1 value: 40.586 - type: ndcg_at_10 value: 40.589999999999996 - type: ndcg_at_100 value: 47.03 - type: ndcg_at_1000 value: 49.994 - type: ndcg_at_20 value: 43.229 - type: ndcg_at_3 value: 37.061 - type: ndcg_at_5 value: 37.992 - type: precision_at_1 value: 40.586 - type: precision_at_10 value: 11.219 - type: precision_at_100 value: 1.781 - type: precision_at_1000 value: 0.232 - type: precision_at_20 value: 6.705 - type: precision_at_3 value: 24.743000000000002 - type: precision_at_5 value: 18.086 - type: recall_at_1 value: 20.008 - type: recall_at_10 value: 47.412 - type: recall_at_100 value: 71.274 - type: recall_at_1000 value: 88.898 - type: recall_at_20 value: 55.706999999999994 - type: recall_at_3 value: 33.346 - type: recall_at_5 value: 39.112 - task: type: Retrieval dataset: type: None name: MTEB HotpotQA config: default split: test revision: 
ab518f4d6fcca38d87c25209f94beba119d02014 metrics: - type: map_at_1 value: 41.789 - type: map_at_10 value: 57.898 - type: map_at_100 value: 58.632 - type: map_at_1000 value: 58.693 - type: map_at_20 value: 58.314 - type: map_at_3 value: 55.236 - type: map_at_5 value: 56.852999999999994 - type: mrr_at_1 value: 83.57900000000001 - type: mrr_at_10 value: 87.631 - type: mrr_at_100 value: 87.764 - type: mrr_at_1000 value: 87.77000000000001 - type: mrr_at_20 value: 87.70700000000001 - type: mrr_at_3 value: 87.02499999999999 - type: mrr_at_5 value: 87.34100000000001 - type: ndcg_at_1 value: 83.57900000000001 - type: ndcg_at_10 value: 67.11399999999999 - type: ndcg_at_100 value: 69.686 - type: ndcg_at_1000 value: 70.926 - type: ndcg_at_20 value: 68.119 - type: ndcg_at_3 value: 63.402 - type: ndcg_at_5 value: 65.354 - type: precision_at_1 value: 83.57900000000001 - type: precision_at_10 value: 13.333 - type: precision_at_100 value: 1.537 - type: precision_at_1000 value: 0.16999999999999998 - type: precision_at_20 value: 6.988999999999999 - type: precision_at_3 value: 38.929 - type: precision_at_5 value: 24.897 - type: recall_at_1 value: 41.789 - type: recall_at_10 value: 66.664 - type: recall_at_100 value: 76.833 - type: recall_at_1000 value: 85.14500000000001 - type: recall_at_20 value: 69.892 - type: recall_at_3 value: 58.392999999999994 - type: recall_at_5 value: 62.242 - task: type: Classification dataset: type: None name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 86.6108 - type: ap value: 81.63890253106925 - type: f1 value: 86.54585789538082 - task: type: Retrieval dataset: type: None name: MTEB MSMARCO config: default split: dev revision: c5a29a104738b98a9e76336939199e264163d4a0 metrics: - type: map_at_1 value: 22.407 - type: map_at_10 value: 34.603 - type: map_at_100 value: 35.808 - type: map_at_1000 value: 35.855 - type: map_at_20 value: 35.368 - type: map_at_3 value: 30.764000000000003 - type: map_at_5 value: 32.964 - type: mrr_at_1 value: 23.009 - type: mrr_at_10 value: 35.136 - type: mrr_at_100 value: 36.284 - type: mrr_at_1000 value: 36.325 - type: mrr_at_20 value: 35.869 - type: mrr_at_3 value: 31.351000000000003 - type: mrr_at_5 value: 33.54 - type: ndcg_at_1 value: 23.009 - type: ndcg_at_10 value: 41.471999999999994 - type: ndcg_at_100 value: 47.211999999999996 - type: ndcg_at_1000 value: 48.361 - type: ndcg_at_20 value: 44.169000000000004 - type: ndcg_at_3 value: 33.646 - type: ndcg_at_5 value: 37.580000000000005 - type: precision_at_1 value: 23.009 - type: precision_at_10 value: 6.54 - type: precision_at_100 value: 0.941 - type: precision_at_1000 value: 0.104 - type: precision_at_20 value: 3.832 - type: precision_at_3 value: 14.283999999999999 - type: precision_at_5 value: 10.564 - type: recall_at_1 value: 22.407 - type: recall_at_10 value: 62.678999999999995 - type: recall_at_100 value: 89.09700000000001 - type: recall_at_1000 value: 97.822 - type: recall_at_20 value: 73.116 - type: recall_at_3 value: 41.4 - type: recall_at_5 value: 50.855 - task: type: Classification dataset: type: None name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 92.94573643410853 - type: f1 value: 92.73148878666994 - task: type: Classification dataset: type: None name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 
77.86137710898313 - type: f1 value: 60.360562463738724 - task: type: Classification dataset: type: None name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 73.83322125084062 - type: f1 value: 71.61864304680206 - task: type: Classification dataset: type: None name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 77.50504371217215 - type: f1 value: 77.52039268347185 - task: type: Clustering dataset: type: None name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 34.346952648910225 - type: v_measures value: [0.3246964225451952, 0.33269208719245646, 0.3355911472371345, 0.32978655133380147, 0.3275090874657499, 0.3752583186941529, 0.3494711327267592, 0.36636134409497156, 0.3538734420417993, 0.3394557315590024] - task: type: Clustering dataset: type: None name: MTEB MedrxivClusteringS2S config: default split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value:
32.19992734583148 - type: v_measures value: [0.31100967211136193, 0.31302897733611235, 0.3126922134381441, 0.30243629014133017, 0.31564501718268645, 0.34772968477866795, 0.32522623268021805, 0.3410158265159116, 0.33581770403870503, 0.31539111636001027] - task: type: Reranking dataset: type: None name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 30.62309561205373 - type: mrr value: 31.707879717902554 - task: type: Retrieval dataset: type: None name: MTEB NFCorpus config: default split: test revision: ec0fa4fe99da2ff19ca1214b7966684033a58814 metrics: - 
type: map_at_1 value: 5.668 - type: map_at_10 value: 12.225999999999999 - type: map_at_100 value: 15.122 - type: map_at_1000 value: 16.422 - type: map_at_20 value: 13.361999999999998 - type: map_at_3 value: 9.083 - type: map_at_5 value: 10.5 - type: mrr_at_1 value: 46.44 - type: mrr_at_10 value: 53.553 - type: mrr_at_100 value: 54.15 - type: mrr_at_1000 value: 54.193000000000005 - type: mrr_at_20 value: 53.837 - type: mrr_at_3 value: 51.702999999999996 - type: mrr_at_5 value: 52.647 - type: ndcg_at_1 value: 44.272 - type: ndcg_at_10 value: 33.395 - type: ndcg_at_100 value: 29.976999999999997 - type: ndcg_at_1000 value: 38.388 - type: ndcg_at_20 value: 30.606 - type: ndcg_at_3 value: 39.212 - type: ndcg_at_5 value: 36.611 - type: precision_at_1 value: 46.129999999999995 - type: precision_at_10 value: 24.334 - type: precision_at_100 value: 7.553999999999999 - type: precision_at_1000 value: 1.994 - type: precision_at_20 value: 17.678 - type: precision_at_3 value: 36.326 - type: precision_at_5 value: 31.330999999999996 - type: recall_at_1 value: 5.668 - type: recall_at_10 value: 15.837000000000002 - type: recall_at_100 value: 29.845 - type: recall_at_1000 value: 60.563 - type: recall_at_20 value: 18.587999999999997 - type: recall_at_3 value: 10.096 - type: recall_at_5 value: 12.261 - task: type: Retrieval dataset: type: None name: MTEB NQ config: default split: test revision: b774495ed302d8c44a3a7ea25c90dbce03968f31 metrics: - type: map_at_1 value: 39.335 - type: map_at_10 value: 54.932 - type: map_at_100 value: 55.742000000000004 - type: map_at_1000 value: 55.766000000000005 - type: map_at_20 value: 55.504 - type: map_at_3 value: 50.904 - type: map_at_5 value: 53.388999999999996 - type: mrr_at_1 value: 44.003 - type: mrr_at_10 value: 57.419 - type: mrr_at_100 value: 57.963 - type: mrr_at_1000 value: 57.981 - type: mrr_at_20 value: 57.80499999999999 - type: mrr_at_3 value: 54.30199999999999 - type: mrr_at_5 value: 56.257000000000005 - type: ndcg_at_1 value: 43.974999999999994 - type: ndcg_at_10 value: 62.153999999999996 - type: ndcg_at_100 value: 65.326 - type: ndcg_at_1000 value: 65.862 - type: ndcg_at_20 value: 63.922999999999995 - type: ndcg_at_3 value: 54.834 - type: ndcg_at_5 value: 58.857000000000006 - type: precision_at_1 value: 43.974999999999994 - type: precision_at_10 value: 9.722 - type: precision_at_100 value: 1.153 - type: precision_at_1000 value: 0.12 - type: precision_at_20 value: 5.3 - type: precision_at_3 value: 24.392 - type: precision_at_5 value: 16.993 - type: recall_at_1 value: 39.335 - type: recall_at_10 value: 81.501 - type: recall_at_100 value: 94.851 - type: recall_at_1000 value: 98.817 - type: recall_at_20 value: 87.968 - type: recall_at_3 value: 62.795 - type: recall_at_5 value: 71.985 - task: type: Retrieval dataset: type: None name: MTEB QuoraRetrieval config: default split: test revision: e4e08e0b7dbe3c8700f0daef558ff32256715259 metrics: - type: map_at_1 value: 71.222 - type: map_at_10 value: 85.193 - type: map_at_100 value: 85.802 - type: map_at_1000 value: 85.81800000000001 - type: map_at_20 value: 85.587 - type: map_at_3 value: 82.253 - type: map_at_5 value: 84.142 - type: mrr_at_1 value: 82.04 - type: mrr_at_10 value: 88.101 - type: mrr_at_100 value: 88.196 - type: mrr_at_1000 value: 88.196 - type: mrr_at_20 value: 88.175 - type: mrr_at_3 value: 87.145 - type: mrr_at_5 value: 87.825 - type: ndcg_at_1 value: 82.04 - type: ndcg_at_10 value: 88.849 - type: ndcg_at_100 value: 89.992 - type: ndcg_at_1000 value: 90.089 - type: ndcg_at_20 value: 89.468 - type: 
ndcg_at_3 value: 86.06899999999999 - type: ndcg_at_5 value: 87.669 - type: precision_at_1 value: 82.04 - type: precision_at_10 value: 13.447000000000001 - type: precision_at_100 value: 1.528 - type: precision_at_1000 value: 0.157 - type: precision_at_20 value: 7.116 - type: precision_at_3 value: 37.617 - type: precision_at_5 value: 24.776 - type: recall_at_1 value: 71.222 - type: recall_at_10 value: 95.73899999999999 - type: recall_at_100 value: 99.572 - type: recall_at_1000 value: 99.988 - type: recall_at_20 value: 97.725 - type: recall_at_3 value: 87.742 - type: recall_at_5 value: 92.23400000000001 - task: type: Clustering dataset: type: None name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 56.502005725283524 - type: v_measures value: [0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 
0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237, 0.5845673186673394, 0.648423996059595, 
0.5081078446363154, 0.577059582267051, 0.5449838765447135, 0.5255305026550916, 0.6001776953894321, 0.5075448301528861, 0.5238448212279936, 0.5329001795025329, 0.5112306232092642, 0.6002807353254037, 0.5525285295615835, 0.56281813563348, 0.6722346506108504, 0.5293879728430999, 0.5972632642217942, 0.6345018102197326, 0.515945887049231, 0.5291998092690363, 0.5250323799432043, 0.538426398169316, 0.6954213901632498, 0.580008522375662, 0.5280806756230237] - task: type: Clustering dataset: type: None name: MTEB RedditClusteringP2P config: default split: test revision: 385e3cb46b4cfa89021f56c4380204149d0efe33 metrics: - type: v_measure value: 63.14989421688691 - type: v_measures value: [0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 0.711458797825346, 0.6212317163461291, 0.4113635660304527, 0.7394060043565659, 0.6969073197749642, 0.7513770750973534, 0.673210410652684, 0.6825035243902045, 0.6275126414823813, 0.40001836573261074, 
- task: type: Retrieval dataset: type: None name: MTEB SCIDOCS config: default split: test revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88 metrics: - type: map_at_1 value: 4.4830000000000005 - type: map_at_10 value: 11.04 - type: map_at_100 value: 12.764000000000001 - type: map_at_1000 value: 13.04 - type: map_at_20 value: 11.953 - type: map_at_3 value: 8.125 - type: map_at_5 value: 9.565999999999999 - type: mrr_at_1 value: 22.1 - type: mrr_at_10 value: 32.494 - type: mrr_at_100 value: 33.525 - type: mrr_at_1000 value: 33.596 - type: mrr_at_20 value: 33.089 - type: mrr_at_3 value: 29.416999999999998 - type: mrr_at_5 value: 31.267 - type: ndcg_at_1 value: 22.1 - type:
ndcg_at_10 value: 18.587 - type: ndcg_at_100 value: 25.482 - type: ndcg_at_1000 value: 30.581999999999997 - type: ndcg_at_20 value: 21.077 - type: ndcg_at_3 value: 18.165 - type: ndcg_at_5 value: 15.676000000000002 - type: precision_at_1 value: 22.1 - type: precision_at_10 value: 9.48 - type: precision_at_100 value: 1.942 - type: precision_at_1000 value: 0.316 - type: precision_at_20 value: 6.175 - type: precision_at_3 value: 17.033 - type: precision_at_5 value: 13.719999999999999 - type: recall_at_1 value: 4.4830000000000005 - type: recall_at_10 value: 19.208 - type: recall_at_100 value: 39.417 - type: recall_at_1000 value: 64.235 - type: recall_at_20 value: 25.057000000000002 - type: recall_at_3 value: 10.348 - type: recall_at_5 value: 13.893 - task: type: STS dataset: type: None name: MTEB SICK-R config: default split: test revision: 20a6d6f312dd54037fe07a32d58e5e168867909d metrics: - type: cos_sim_pearson value: 83.50181312649208 - type: cos_sim_spearman value: 79.92900705478993 - type: euclidean_pearson value: 81.13482128094503 - type: euclidean_spearman value: 79.92732266864367 - type: manhattan_pearson value: 81.06702121654993 - type: manhattan_spearman value: 79.86983106619135 - task: type: STS dataset: type: None name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 83.85431681906961 - type: cos_sim_spearman value: 77.61671419416626 - type: euclidean_pearson value: 81.30538320520961 - type: euclidean_spearman value: 77.62096481461272 - type: manhattan_pearson value: 81.2306021173407 - type: manhattan_spearman value: 77.58386300715222 - task: type: STS dataset: type: None name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 84.98057702322754 - type: cos_sim_spearman value: 86.13305071688859 - type: euclidean_pearson value: 85.70903555966376 - type: euclidean_spearman value: 86.13150222328171 - type: manhattan_pearson value: 85.69380834788831 - type: manhattan_spearman value: 86.10784739081191 - task: type: STS dataset: type: None name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 83.43368314724589 - type: cos_sim_spearman value: 81.26767916144169 - type: euclidean_pearson value: 83.23234690932492 - type: euclidean_spearman value: 81.2671726214706 - type: manhattan_pearson value: 83.2381239261109 - type: manhattan_spearman value: 81.27674961470714 - task: type: STS dataset: type: None name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 86.8637546411748 - type: cos_sim_spearman value: 88.25330888676139 - type: euclidean_pearson value: 87.81194589390417 - type: euclidean_spearman value: 88.25258669625579 - type: manhattan_pearson value: 87.8131866998459 - type: manhattan_spearman value: 88.26523268929576 - task: type: STS dataset: type: None name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 83.83129743147286 - type: cos_sim_spearman value: 85.73732687732624 - type: euclidean_pearson value: 85.18051277328075 - type: euclidean_spearman value: 85.73565846174445 - type: manhattan_pearson value: 85.179029651079 - type: manhattan_spearman value: 85.75709685404729 - task: type: STS dataset: type: None name: MTEB STS17 (en-en) config: en-en split: test revision: 
af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 87.04715794253148 - type: cos_sim_spearman value: 87.61577496386343 - type: euclidean_pearson value: 88.34713614361046 - type: euclidean_spearman value: 87.56541901567275 - type: manhattan_pearson value: 88.26010824585985 - type: manhattan_spearman value: 87.35211736948182 - task: type: STS dataset: type: None name: MTEB STS22 (en) config: en split: test revision: eea2b4fe26a775864c896887d910b76a8098ad3f metrics: - type: cos_sim_pearson value: 62.36160793264433 - type: cos_sim_spearman value: 66.07767480051893 - type: euclidean_pearson value: 66.4716471304865 - type: euclidean_spearman value: 66.03999286501872 - type: manhattan_pearson value: 66.46197824372902 - type: manhattan_spearman value: 65.82936468127227 - task: type: STS dataset: type: None name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 85.27768996785856 - type: cos_sim_spearman value: 86.96704639052885 - type: euclidean_pearson value: 86.48753189555983 - type: euclidean_spearman value: 86.96981285751171 - type: manhattan_pearson value: 86.49262465015401 - type: manhattan_spearman value: 86.95378609580054 - task: type: Reranking dataset: type: None name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 81.52012853393428 - type: mrr value: 94.70817671798063 - task: type: Retrieval dataset: type: None name: MTEB SciFact config: default split: test revision: 0228b52cf27578f30900b9e5271d331663a030d7 metrics: - type: map_at_1 value: 55.344 - type: map_at_10 value: 64.82900000000001 - type: map_at_100 value: 65.42 - type: map_at_1000 value: 65.443 - type: map_at_20 value: 65.2 - type: map_at_3 value: 61.8 - type: map_at_5 value: 63.510999999999996 - type: mrr_at_1 value: 58.333 - type: mrr_at_10 value: 66.24600000000001 - type: mrr_at_100 value: 66.742 - type: mrr_at_1000 value: 66.762 - type: mrr_at_20 value: 66.549 - type: mrr_at_3 value: 64.056 - type: mrr_at_5 value: 65.372 - type: ndcg_at_1 value: 58.333 - type: ndcg_at_10 value: 69.626 - type: ndcg_at_100 value: 72.236 - type: ndcg_at_1000 value: 72.872 - type: ndcg_at_20 value: 70.864 - type: ndcg_at_3 value: 64.50399999999999 - type: ndcg_at_5 value: 67.07600000000001 - type: precision_at_1 value: 58.333 - type: precision_at_10 value: 9.4 - type: precision_at_100 value: 1.073 - type: precision_at_1000 value: 0.11299999999999999 - type: precision_at_20 value: 4.983 - type: precision_at_3 value: 25.222 - type: precision_at_5 value: 16.8 - type: recall_at_1 value: 55.344 - type: recall_at_10 value: 82.789 - type: recall_at_100 value: 94.6 - type: recall_at_1000 value: 99.667 - type: recall_at_20 value: 87.533 - type: recall_at_3 value: 69.18299999999999 - type: recall_at_5 value: 75.622 - task: type: PairClassification dataset: type: None name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.69405940594059 - type: cos_sim_ap value: 92.03642221694545 - type: cos_sim_f1 value: 84.06395048994327 - type: cos_sim_precision value: 86.79446219382322 - type: cos_sim_recall value: 81.5 - type: dot_accuracy value: 99.6930693069307 - type: dot_ap value: 91.9971441434875 - type: dot_f1 value: 83.8006230529595 - type: dot_precision value: 87.14902807775377 - type: dot_recall value: 80.7 - type: euclidean_accuracy value: 99.69504950495049 - type: 
euclidean_ap value: 92.03626548389335 - type: euclidean_f1 value: 84.10732714138285 - type: euclidean_precision value: 86.88699360341151 - type: euclidean_recall value: 81.5 - type: manhattan_accuracy value: 99.69504950495049 - type: manhattan_ap value: 92.02049659660081 - type: manhattan_f1 value: 84.34959349593495 - type: manhattan_precision value: 85.74380165289256 - type: manhattan_recall value: 83.0 - type: max_accuracy value: 99.69504950495049 - type: max_ap value: 92.03642221694545 - type: max_f1 value: 84.34959349593495 - task: type: Clustering dataset: type: None name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 67.04916654680977 - type: v_measures value: [0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096,
0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 
0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 
0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 
0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 
0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 
0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 
0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096, 0.707614120277991, 0.694974842783697, 0.5756359888519659, 0.6964499615297283, 0.6547764033608466, 0.6448470247319567, 0.6263766967145058, 0.7139286894225703, 0.6737195749489034, 0.6824504575459811, 0.7667603743275774, 0.7595788549615426, 0.7086156082505461, 0.6624140136843005, 0.6136884209896801, 0.6717953455355791, 0.6494834308652331, 0.6507885275711466, 0.6382769468968572, 0.6556052416453325, 0.6700496626301571, 0.6424264693175464, 0.6400679099051025, 0.7118398877792876, 0.6501271821744096] - task: type: Clustering dataset: type: None name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 33.36641413495258 - type: v_measures value: [0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 
0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 
0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 
0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 
0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 
0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 
0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235, 0.3245963448931168, 0.31882294716748927, 0.31975204745764507, 0.30752650651575314, 0.3191185767616115, 0.35880812225202774, 0.3427515820677152, 0.344097881083346, 0.35390675395072985, 0.3472606513458235] - task: type: Reranking dataset: type: None name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 51.19282080158746 - type: mrr value: 51.871100713012474 - task: type: Summarization dataset: type: None name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 31.437664703708485 - type: cos_sim_spearman value: 31.391119208581575 - type: dot_pearson value: 31.19925970504054 - type: dot_spearman value: 31.38087224016694 - task: type: Retrieval dataset: type: None name: MTEB TRECCOVID config: default split: test revision: bb9466bac8153a0349341eb1b22e06409e78ef4e metrics: - type: map_at_1 value: 0.249 - type: map_at_10 value: 2.163 - type: map_at_100 value: 13.242999999999999 - type: map_at_1000 value: 30.866 - type: map_at_20 value: 3.9539999999999997 - type: map_at_3 value: 0.718 - type: map_at_5 value: 1.169 - type: mrr_at_1 value: 96.0 - type: mrr_at_10 value: 98.0 - type: mrr_at_100 value: 98.0 - type: mrr_at_1000 value: 98.0 - type: mrr_at_20 value: 98.0 - type: mrr_at_3 value: 98.0 - type: mrr_at_5 value: 98.0 - type: ndcg_at_1 value: 92.0 - type: ndcg_at_10 value: 84.147 - type: ndcg_at_100 value: 65.143 - type: ndcg_at_1000 value: 56.038 - type: ndcg_at_20 value: 80.869 - type: ndcg_at_3 value: 89.11200000000001 - type: ndcg_at_5 value: 87.199 - type: precision_at_1 value: 96.0 - type: precision_at_10 value: 87.8 - type: precision_at_100 value: 66.72 - type: precision_at_1000 value: 24.684 - type: precision_at_20 value: 84.3 - type: precision_at_3 value: 94.0 - type: precision_at_5 value: 91.2 - type: recall_at_1 value: 0.249 - type: recall_at_10 value: 2.284 - type: recall_at_100 value: 16.025 - type: recall_at_1000 value: 52.068999999999996 - type: recall_at_20 value: 4.3180000000000005 - type: recall_at_3 value: 0.738 - type: recall_at_5 value: 1.212 - task: type: Retrieval dataset: type: None name: MTEB Touche2020 config: default split: test revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f metrics: - type: map_at_1 value: 3.4520000000000004 - type: map_at_10 value: 13.045000000000002 - type: map_at_100 value: 19.442 - type: map_at_1000 value: 21.09 - type: map_at_20 value: 15.667 - type: map_at_3 value: 7.409000000000001 - type: map_at_5 value: 9.73 - type: mrr_at_1 value: 46.939 - type: mrr_at_10 value: 60.295 - type: mrr_at_100 value: 60.904 - type: mrr_at_1000 value: 60.919000000000004 - type: mrr_at_20 value: 60.77 - type: mrr_at_3 value: 58.50300000000001 - type: mrr_at_5 value: 59.014 - type: ndcg_at_1 value: 44.897999999999996 - type: ndcg_at_10 value: 31.911 - type: ndcg_at_100 value: 41.945 - type: ndcg_at_1000 value: 53.181999999999995 - type: ndcg_at_20 value: 31.505 - type: ndcg_at_3 value: 39.745000000000005 - type: ndcg_at_5 value: 35.528999999999996 - type: precision_at_1 value: 46.939 - type: precision_at_10 value: 26.531 - type: precision_at_100 value: 8.163 - type: precision_at_1000 value: 1.559 - type: precision_at_20 value: 19.387999999999998 - type: precision_at_3 value: 40.136 - type: precision_at_5 value: 33.878 - type: recall_at_1 value: 3.4520000000000004 - type: recall_at_10 value: 18.899 - type: 
recall_at_100 value: 50.207 - type: recall_at_1000 value: 83.871 - type: recall_at_20 value: 26.756999999999998 - type: recall_at_3 value: 8.729000000000001 - type: recall_at_5 value: 12.084999999999999 - task: type: Classification dataset: type: None name: MTEB ToxicConversationsClassification config: default split: test revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de metrics: - type: accuracy value: 67.4560546875 - type: ap value: 12.720403845355294 - type: f1 value: 51.76062666567839 - task: type: Classification dataset: type: None name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 62.36276174306734 - type: f1 value: 62.69956906934332 - task: type: Clustering dataset: type: None name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 49.473492910233965 - type: v_measures value: [0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557,
0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 
0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 
0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281, 0.48829262296803855, 0.49853262011854643, 0.48457750518082765, 0.5020774116970983, 0.5001897357021557, 0.4702417082210781, 0.4763216048226018, 0.49932879417585735, 0.5129628835129124, 0.514824404624281] - 
task: type: PairClassification dataset: type: None name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 85.75430649102938 - type: cos_sim_ap value: 73.62842656477649 - type: cos_sim_f1 value: 67.76023680315738 - type: cos_sim_precision value: 63.61741547012506 - type: cos_sim_recall value: 72.4802110817942 - type: dot_accuracy value: 85.7423854085951 - type: dot_ap value: 73.59147637253723 - type: dot_f1 value: 67.69498693867396 - type: dot_precision value: 64.03859731701577 - type: dot_recall value: 71.79419525065963 - type: euclidean_accuracy value: 85.7423854085951 - type: euclidean_ap value: 73.6288990409654 - type: euclidean_f1 value: 67.80415430267064 - type: euclidean_precision value: 63.79711493718009 - type: euclidean_recall value: 72.34828496042216 - type: manhattan_accuracy value: 85.69470107885796 - type: manhattan_ap value: 73.49219614602531 - type: manhattan_f1 value: 67.60809797550613 - type: manhattan_precision value: 64.22127255460589 - type: manhattan_recall value: 71.37203166226914 - type: max_accuracy value: 85.75430649102938 - type: max_ap value: 73.6288990409654 - type: max_f1 value: 67.80415430267064 - task: type: PairClassification dataset: type: None name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 89.08293553770326 - type: cos_sim_ap value: 86.21246419992926 - type: cos_sim_f1 value: 78.49922526377924 - type: cos_sim_precision value: 75.35769939084857 - type: cos_sim_recall value: 81.9140745303357 - type: dot_accuracy value: 89.08681647067955 - type: dot_ap value: 86.19733517196862 - type: dot_f1 value: 78.51132446157838 - type: dot_precision value: 75.70233755093287 - type: dot_recall value: 81.53680320295658 - type: euclidean_accuracy value: 89.07517367175069 - type: euclidean_ap value: 86.21198725320203 - type: euclidean_f1 value: 78.49867139061116 - type: euclidean_precision value: 75.38276155372839 - type: euclidean_recall value: 81.88327687095781 - type: manhattan_accuracy value: 89.0538285403811 - type: manhattan_ap value: 86.17785515765131 - type: manhattan_f1 value: 78.48184098593084 - type: manhattan_precision value: 74.34396308285694 - type: manhattan_recall value: 83.10748383122882 - type: max_accuracy value: 89.08681647067955 - type: max_ap value: 86.21246419992926 - type: max_f1 value: 78.51132446157838 license: apache-2.0 language: - en base_model: - answerdotai/ModernBERT-base - nomic-ai/modernbert-embed-unsupervised base_model_relation: finetune --- # ModernBERT Embed | Model | Dimensions | Average (56) | Classification (12) | Clustering (11) | Pair Classification (3) | Reranking (4) | Retrieval (15) | STS (10) | Summarization (1) | |-----------------------|------------|--------------|---------------------|-----------------|-------------------------|---------------|----------------|-----------|------------------| | nomic-embed-text-v1 | 768 | 62.4 | 74.1 | 43.9 | **85.2** | 55.7 | 52.8 | 82.1 | 30.1 | | nomic-embed-text-v1.5 | 768 | 62.28 | 73.55 | 43.93 | 84.61 | 55.78 | **53.01** | **81.94** | 30.4 | | modernbert-embed-base | 768 | **62.62** | **74.31** | **44.98** | 83.96 | **56.42** | 52.89 | 81.78 | **31.39** | | nomic-embed-text-v1.5 | 256 | 61.04 | 72.1 | 43.16 | 84.09 | 55.18 | 50.81 | 81.34 | 30.05 | | modernbert-embed-base | 256 | 61.17 | 72.40 | 43.82 | 83.45 | 55.69 | 50.62 | 81.12 | 31.27 | ## Usage You can use these models directly with the latest transformers release; using the Sentence Transformers examples below
requires installing `sentence-transformers`. Reminder: this model is trained similarly to Nomic Embed and **REQUIRES** prefixes to be added to the input. For more information, see the instructions in Nomic Embed. For most use cases, adding `search_query: ` to the query and `search_document: ` to the documents will be sufficient. ### Sentence Transformers
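A minimal Sentence Transformers sketch (assuming the `nomic-ai/modernbert-embed-base` checkpoint id and the prefixes described above; the example strings are hypothetical):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/modernbert-embed-base")

# Nomic-style prefixes: "search_query: " for queries, "search_document: " for documents.
query_embeddings = model.encode(["search_query: What is TSNE?"])
doc_embeddings = model.encode(["search_document: TSNE is a dimensionality reduction technique."])

# Cosine similarity between each query and each document.
similarities = model.similarity(query_embeddings, doc_embeddings)
print(similarities)
```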
Click to see Sentence Transformers usage with Matryoshka Truncation In Sentence Transformers, you can truncate embeddings to a smaller dimension by using the `truncate_dim` parameter when loading the model. Note the small differences compared to the full 768-dimensional similarities.
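As a sketch under the same assumptions, `truncate_dim=256` keeps only the first 256 Matryoshka dimensions:

```python
from sentence_transformers import SentenceTransformer

# Load the model with truncated (Matryoshka) output dimensionality.
model = SentenceTransformer("nomic-ai/modernbert-embed-base", truncate_dim=256)

embeddings = model.encode([
    "search_query: What is TSNE?",
    "search_document: TSNE is a dimensionality reduction technique.",
])
print(embeddings.shape)  # (2, 256) instead of (2, 768)
```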
### Transformers
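A minimal plain-`transformers` sketch of the same pipeline, with mean pooling over non-padding tokens followed by L2 normalization (same assumed checkpoint id):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


def mean_pooling(last_hidden_state, attention_mask):
    # Average token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)


tokenizer = AutoTokenizer.from_pretrained("nomic-ai/modernbert-embed-base")
model = AutoModel.from_pretrained("nomic-ai/modernbert-embed-base")

inputs = tokenizer(
    ["search_query: What is TSNE?"], padding=True, truncation=True, return_tensors="pt"
)
with torch.no_grad():
    outputs = model(**inputs)

embeddings = mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)  # unit-length 768-d vectors
```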
Click to see Transformers usage with Matryoshka Truncation In Transformers, you can truncate embeddings to a smaller dimension by slicing the mean pooled embeddings, prior to normalization. Note the small differences compared to the full 768-dimensional similarities.
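Continuing the previous sketch, truncation then amounts to slicing before normalization:

```python
# Keep only the first 256 dimensions *before* L2 normalization.
pooled = mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])
truncated = F.normalize(pooled[:, :256], p=2, dim=1)
```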
### Transformers.js If you haven't already, you can install the Transformers.js JavaScript library from NPM using: Then, you can compute embeddings as follows: ## Training Click the Nomic Atlas map below to visualize a 5M sample of our contrastive pretraining data! [Nomic Atlas map image] We train our embedder using a multi-stage training pipeline. Starting from a long-context BERT model, the first unsupervised contrastive stage trains on a dataset generated from weakly related text pairs, such as question-answer pairs from forums like StackExchange and Quora, title-body pairs from Amazon reviews, and summarizations from news articles. In the second finetuning stage, higher quality labeled datasets such as search queries and answers from web searches are leveraged. Data curation and hard-example mining are crucial in this stage. For more details, see the Nomic Embed Technical Report and corresponding blog post. The training data is released in its entirety. For more details, see the repository. ## Join the Nomic Community - Nomic: - Discord: - Twitter: ## Citation If you find the model, dataset, or training code useful, please cite our work", + "model_explanation_gemini": "Generates sentence embeddings for tasks like similarity, classification, retrieval, and clustering, with performance metrics provided for various MTEB benchmarks.\n\nFeatures: \n- Sentence similarity \n- Feature extraction \n- Classification (e.g., Amazon reviews, polarity) \n- Retrieval (e.g., ArguAna) \n- Clustering (e.g., ArxivClusteringP2P) \n- Benchmarked on MTEB tasks \n\nComparison: \nOutperforms or matches baseline metrics in MTEB tasks like" +} \ No newline at end of file diff --git a/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json b/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json new file mode 100644 index 0000000000000000000000000000000000000000..3e837a815fe5d2cdd4656ec27566637e738942c0 --- /dev/null +++ b/model_data_json/nvidia_Cosmos-1.0-Diffusion-7B-Video2World.json @@ -0,0 +1,15 @@ +{ + "model_id": "nvidia/Cosmos-1.0-Diffusion-7B-Video2World", + "downloads": 244460, + "tags": [ + "cosmos", + "safetensors", + "nvidia", + "nemo", + "arxiv:2501.03575", + "license:other", + "region:us" + ], + "description": "--- license: other license_name: nvidia-open-model-license license_link: >- library_name: cosmos tags: - nvidia - nemo - cosmos - diffusers extra_gated_prompt: >- # NVIDIA Open Model License Agreement Version Release Date: January 6, 2025 This NVIDIA Open Model License Agreement (the \"Agreement\") is a legal agreement between the Legal Entity You represent, or if no entity is identified, You and NVIDIA Corporation and its Affiliates (\"NVIDIA\") and governs Your use of the Models that NVIDIA provides to You under this Agreement. NVIDIA and You are each a \"party\" and collectively the \"parties.\" NVIDIA models released under this Agreement are intended to be used permissively and enable the further development of AI technologies. Subject to the terms of this Agreement, NVIDIA confirms that: * Models are commercially usable. * You are free to create and distribute Derivative Models. * NVIDIA does not claim ownership to any outputs generated using the Models or Model Derivatives. By using, reproducing, modifying, distributing, performing or displaying any portion or element of the Model or Derivative Model, or otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement. ## 1. 
Definitions The following definitions apply to this Agreement: 1.1. \"NVIDIA Cosmos Model\" means a multimodal Model shared under this Agreement. 1.2. \"Derivative Model\" means all (a) modifications to the Model, (b) works based on the Model, and (c) any other derivative works of the Model. An output is not a Derivative Model. 1.3. \"Legal Entity\" means the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, \"control\" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of fifty percent (50%) or more of the outstanding shares, or (c) beneficial ownership of such entity. 1.4. \"Model\" means the machine learning model, software, checkpoints, learnt weights, algorithms, parameters, configuration files and documentation shared under this Agreement. 1.5. \"You\" or \"Your\" means an individual or Legal Entity exercising permissions granted by this Agreement. ## 2. Conditions for Use, License Grant, AI Ethics and IP Ownership 2.1. Conditions for Use. The Model and any Derivative Model are subject to additional terms as described in Section 2 and Section 3 of this Agreement and govern Your use. If You institute copyright or patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Model or a Derivative Model constitutes direct or contributory copyright or patent infringement, then any licenses granted to You under this Agreement for that Model or Derivative Model will terminate as of the date such litigation is filed. If You bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under this Agreement will automatically terminate. NVIDIA may update this Agreement to comply with legal and regulatory requirements at any time and You agree to either comply with any updated license or cease Your copying, use, and distribution of the Model and any Derivative Model. 2.2. License Grant. The rights granted herein are explicitly conditioned on Your full compliance with the terms of this Agreement. Subject to the terms and conditions of this Agreement, NVIDIA hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, revocable (as stated in Section 2.1) license to publicly perform, publicly display, reproduce, use, create derivative works of, make, have made, sell, offer for sale, distribute (through multiple tiers of distribution) and import the Model. 2.3. AI Ethics. Use of the Models under the Agreement must be consistent with NVIDIA's Trustworthy AI terms found at 2.4. NVIDIA owns the Model and any Model Derivatives created by NVIDIA. Subject to NVIDIA's underlying ownership rights in the Model or its Model Derivatives, You are and will be the owner of Your Model Derivatives. NVIDIA claims no ownership rights in outputs. You are responsible for outputs and their subsequent uses. Except as expressly granted in this Agreement, (a) NVIDIA reserves all rights, interests and remedies in connection with the Model and (b) no other license or right is granted to you by implication, estoppel or otherwise. ## 3. 
Redistribution You may reproduce and distribute copies of the Model or Derivative Models thereof in any medium, with or without modifications, provided that You meet the following conditions: 3.1. If you distribute the Model, You must give any other recipients of the Model a copy of this Agreement and include the following attribution notice within a \"Notice\" text file with such copies: \"Licensed by NVIDIA Corporation under the NVIDIA Open Model License\"; 3.2. If you distribute or make available a NVIDIA Cosmos Model, or a product or service (including an AI model) that contains or uses a NVIDIA Cosmos Model, use a NVIDIA Cosmos Model to create a Derivative Model, or use a NVIDIA Cosmos Model or its outputs to create, train, fine tune, or otherwise improve an AI model, you will include \"Built on NVIDIA Cosmos\" on a related website, user interface, blogpost, about page, or product documentation; and 3.3. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Models as a whole, provided Your use, reproduction, and distribution of the Model otherwise complies with the conditions stated in this Agreement. ## 4. Trademarks This Agreement does not grant permission to use the trade names, trademarks, service marks, or product names of NVIDIA, except as required for reasonable and customary use in describing the origin of the Model and reproducing the content of the \"Notice\" text file. ## **5. Disclaimer of Warranty** **Unless required by applicable law or agreed to in writing, NVIDIA provides the Model on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Model, Derivative Models and outputs and assume any risks associated with Your exercise of permissions under this Agreement.** ## **6. Limitation of Liability** **In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, will NVIDIA be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this Agreement or out of the use or inability to use the Model, Derivative Models or outputs (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if NVIDIA has been advised of the possibility of such damages.** ## 7. Indemnity You will indemnify and hold harmless NVIDIA from and against any claim by any third party arising out of or related to your use or distribution of the Model, Model Derivatives or outputs. ## 8. Feedback NVIDIA appreciates your feedback, and You agree that NVIDIA may use it without restriction or compensation to You. ## 9. Governing Law This Agreement will be governed in all respects by the laws of the United States and the laws of the State of Delaware, without regard to conflict of laws principles or the United Nations Convention on Contracts for the International Sale of Goods. 
The state and federal courts residing in Santa Clara County, California will have exclusive jurisdiction over any dispute or claim arising out of or related to this Agreement, and the parties irrevocably consent to personal jurisdiction and venue in those courts; except that, either party may apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. ## 10. Trade and Compliance You agree to comply with all applicable export, import, trade and economic sanctions laws and regulations, as amended, including without limitation U.S. Export Administration Regulations and Office of Foreign Assets Control regulations. These laws include restrictions on destinations, end-users and end-use. extra_gated_fields: By clicking Submit below, I accept the terms of the NVIDIA Open Model License Agreement and acknowledge that I am an adult of legal age of majority in the country in which the Cosmos Models will be used and have authority to accept this Agreement: checkbox extra_gated_description: >- The information you provide will be collected, stored, processed and shared in accordance with the NVIDIA Privacy Policy. extra_gated_button_content: Submit --- # **Cosmos-1.0-Diffusion**: A Suite of Diffusion-based World Foundation Models **Cosmos** | **Code** | **Paper** | **Paper Website** # Model Overview ## Description: **Cosmos World Foundation Models**: A family of highly performant pre-trained world foundation models purpose-built for generating physics-aware videos and world states for physical AI development. The Cosmos diffusion models are a collection of diffusion-based world foundation models that generate dynamic, high-quality videos from text, image, or video inputs. They can serve as building blocks for various applications or research related to world generation. The models are ready for commercial use under the NVIDIA Open Model License agreement. **Model Developer**: NVIDIA ## Model Versions In the Cosmos 1.0 release, the Cosmos Diffusion WFM family includes the following models: - Cosmos-1.0-Diffusion-7B-Text2World - Given a text description, predict an output video of 121 frames. - Cosmos-1.0-Diffusion-14B-Text2World - Given a text description, predict an output video of 121 frames. - Cosmos-1.0-Diffusion-7B-Video2World - Given a text description and an image as the first frame, predict the future 120 frames. - Cosmos-1.0-Diffusion-14B-Video2World - Given a text description and an image as the first frame, predict the future 120 frames. ### License: This model is released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com. Under the NVIDIA Open Model License, NVIDIA confirms: * Models are commercially usable. * You are free to create and distribute Derivative Models. * NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models. **Important Note**: If you bypass, disable, reduce the efficacy of, or circumvent any technical limitation, **safety guardrail** or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under the NVIDIA Open Model License Agreement will automatically terminate. * Cosmos-1.0-Guardrail is the safety guardrail for this model. ## Model Architecture: Cosmos-1.0-Diffusion-7B-Video2World is a diffusion transformer model designed for video denoising in the latent space. 
The network is composed of interleaved self-attention, cross-attention and feedforward layers as its building blocks. The cross-attention layers allow the model to condition on input text throughout the denoising process. Before each layer, adaptive layer normalization is applied to embed the time information for denoising. When an image or video is provided as input, its latent frames are concatenated with the generated frames along the temporal dimension. Augmented noise is added to the conditional latent frames to bridge the training and inference gap. ## Input/Output Specifications * **Input** * **Input Type(s)**: Text+Image, Text+Video * **Input Format(s)**: * Text: String * Image: jpg, png, jpeg, webp * Video: mp4 * **Input Parameters**: * Text: One-dimensional (1D) * Image: Two-dimensional (2D) * Video: Three-dimensional (3D) * **Other Properties Related to Input**: * The input string should contain fewer than 300 words and should provide descriptive content for world generation, such as a scene description, key objects or characters, background, and any specific actions or motions to be depicted within the 5-second duration. * The input image should be of 1280x704 resolution. * The input video should be of 1280x704 resolution and 9 input frames. * **Output** * **Output Type(s)**: Video * **Output Format(s)**: mp4 * **Output Parameters**: Three-dimensional (3D) * **Other Properties Related to Output**: By default, the generated video is a 5-second clip with a resolution of 1280x704 pixels and a frame rate of 24 frames per second (fps). The video content visualizes the input text description as a short animated scene, capturing key elements within the specified time constraints. Aspect ratios and resolutions are configurable, with options including 1:1 (960x960 pixels), 4:3 (960x704 pixels), 3:4 (704x960 pixels), 16:9 (1280x704 pixels), and 9:16 (704x1280 pixels). The frame rate is also adjustable within a range of 12 to 40 fps. ## Software Integration **Runtime Engine(s):** * Cosmos * Diffusers **Supported Hardware Microarchitecture Compatibility:** * NVIDIA Blackwell * NVIDIA Hopper * NVIDIA Ampere **Note**: We have only tested inference with BF16 precision. **Operating System(s):** * Linux (We have not tested on other operating systems.) # Usage * See Cosmos for details. Cosmos can also be used with Diffusers! # Evaluation Please see our technical paper for detailed evaluations. ## Inference Time and GPU Memory Usage The numbers provided below may vary depending on system specs and are for reference only. 
| Offloading Strategy | 7B Video2World | 14B Video2World | |----------------------------------------------------------------------------------|---------|---------| | Offload prompt upsampler | 76.5 GB | > 80.0 GB | | Offload prompt upsampler & guardrails | 59.9 GB | 73.3 GB | | Offload prompt upsampler & guardrails & T5 encoder | 41.3 GB | 54.8 GB | | Offload prompt upsampler & guardrails & T5 encoder & tokenizer | 41.1 GB | 54.5 GB | | Offload prompt upsampler & guardrails & T5 encoder & tokenizer & diffusion model | 27.3 GB | 39.0 GB | The following table shows the end-to-end inference runtime on a single H100 GPU, excluding model initialization time: | 7B Video2World (offload prompt upsampler) | 14B Video2World (offload prompt upsampler, guardrails) | |---------|---------| | ~383 seconds | ~593 seconds | ## Ethical Considerations NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below. Please report security vulnerabilities or NVIDIA AI Concerns here. ### Plus Plus (++) Promise We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been: * Verified to comply with current applicable disclosure laws, regulations, and industry standards. * Verified to comply with applicable privacy labeling requirements. * Annotated to describe the collector/source (NVIDIA or a third-party). * Characterized for technical limitations. * Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests. * Reviewed before release. * Tagged for known restrictions and potential safety implications. ### Bias Field | Response :---------------------------------------------------------------------------------------------------|:--------------- Participation considerations from adversely impacted groups protected classes in model design and testing: | None Measures taken to mitigate against unwanted bias: | None ### Explainability Field | Response :------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------- Intended Application & Domain: | World Generation Model Type: | Transformer Intended Users: | Physical AI developers Output: | Videos Describe how the model works: | Generates videos based on video inputs Technical Limitations: | The model may not follow the video input accurately. Verified to have met prescribed NVIDIA quality standards: | Yes Performance Metrics: | Quantitative and Qualitative Evaluation Potential Known Risks: | The model's output can generate all forms of videos, including what may be considered toxic, offensive, or indecent. Licensing: | NVIDIA Open Model License ### Privacy Field | Response :----------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------- Generatable or reverse engineerable personal information? 
| None Known Protected class data used to create this model? | None Known Was consent obtained for any personal data used? | None Known How often is dataset reviewed? | Before Release Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Not Applicable If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable Is there provenance for all datasets used in training? | Yes Does data labeling (annotation, metadata) comply with privacy laws? | Yes Is data compliant with data subject requests for data correction or removal, if such a request was made? | Not Applicable ### Safety Field | Response :---------------------------------------------------|:---------------------------------- Model Application(s): | World Generation Describe the life critical impact (if present). | None Known Use Case Restrictions: | NVIDIA Open Model License Model and dataset restrictions: | The Principle of least privilege (PoLP) is applied limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints adhered to. Model checkpoints are made available on Hugging Face, and may become available on cloud providers' model catalog.", + "model_explanation_gemini": "Generates video-to-world simulations using diffusion-based multimodal AI under NVIDIA's permissive open-model license.\n\n**Features:** \n- Video-to-world simulation via diffusion \n- Multimodal capabilities \n- Commercially usable under NVIDIA's license \n- Supports derivative model creation \n- No ownership claims on generated outputs \n\n**Comparison:** \nUnlike standard video generation models, this one focuses on simulating dynamic \"world\" environments from video inputs, leveraging NVIDIA's Cosmos framework for multimodal diffusion. It offers broader commercial" +} \ No newline at end of file diff --git a/model_data_json/nvidia_mit-b0.json b/model_data_json/nvidia_mit-b0.json new file mode 100644 index 0000000000000000000000000000000000000000..8fddd8c34d78748bbb0f76b669b0d5e188b0f1f8 --- /dev/null +++ b/model_data_json/nvidia_mit-b0.json @@ -0,0 +1,20 @@ +{ + "model_id": "nvidia/mit-b0", + "downloads": 77331, + "tags": [ + "transformers", + "pytorch", + "tf", + "segformer", + "image-classification", + "vision", + "dataset:imagenet_1k", + "arxiv:2105.15203", + "license:other", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: other tags: - vision datasets: - imagenet_1k widget: - src: example_title: House - src: example_title: Castle --- # SegFormer (b0-sized) encoder pre-trained-only SegFormer encoder fine-tuned on Imagenet-1k. It was introduced in the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Xie et al. and first released in this repository. Disclaimer: The team releasing SegFormer did not write a model card for this model so this model card has been written by the Hugging Face team. ## Model description SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on semantic segmentation benchmarks such as ADE20K and Cityscapes. 
The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and fine-tuned altogether on a downstream dataset. This repository only contains the pre-trained hierarchical Transformer, hence it can be used for fine-tuning purposes. ## Intended uses & limitations You can use the model for fine-tuning of semantic segmentation. See the model hub to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model to classify an image of the COCO 2017 dataset into one of the 1,000 ImageNet classes: For more code examples, we refer to the documentation. ### License The license for this model can be found here. ### BibTeX entry and citation info", + "model_explanation_gemini": "Pre-trained SegFormer (b0-sized) hierarchical Transformer encoder for fine-tuning on semantic segmentation tasks.\n\nFeatures: \n- Hierarchical Transformer encoder \n- Pre-trained on ImageNet-1k \n- Designed for semantic segmentation (requires adding MLP decode head for downstream use) \n- Lightweight architecture \n\nComparison: \nUnlike full SegFormer models that include both encoder and MLP decode head, this variant provides only the pre-trained encoder component for custom fine-tuning." +} \ No newline at end of file diff --git a/model_data_json/nvidia_parakeet-rnnt-0.6b.json b/model_data_json/nvidia_parakeet-rnnt-0.6b.json new file mode 100644 index 0000000000000000000000000000000000000000..d643c4d0d1a7fc22906fd6e851baa1a0e14816bf --- /dev/null +++ b/model_data_json/nvidia_parakeet-rnnt-0.6b.json @@ -0,0 +1,36 @@ +{ + "model_id": "nvidia/parakeet-rnnt-0.6b", + "downloads": 73324, + "tags": [ + "nemo", + "automatic-speech-recognition", + "speech", + "audio", + "Transducer", + "FastConformer", + "Conformer", + "pytorch", + "NeMo", + "hf-asr-leaderboard", + "en", + "dataset:librispeech_asr", + "dataset:fisher_corpus", + "dataset:Switchboard-1", + "dataset:WSJ-0", + "dataset:WSJ-1", + "dataset:National-Singapore-Corpus-Part-1", + "dataset:National-Singapore-Corpus-Part-6", + "dataset:vctk", + "dataset:voxpopuli", + "dataset:europarl", + "dataset:multilingual_librispeech", + "dataset:mozilla-foundation/common_voice_8_0", + "dataset:MLCommons/peoples_speech", + "arxiv:2305.05084", + "license:cc-by-4.0", + "model-index", + "region:us" + ], + "description": "--- language: - en library_name: nemo datasets: - librispeech_asr - fisher_corpus - Switchboard-1 - WSJ-0 - WSJ-1 - National-Singapore-Corpus-Part-1 - National-Singapore-Corpus-Part-6 - vctk - voxpopuli - europarl - multilingual_librispeech - mozilla-foundation/common_voice_8_0 - MLCommons/peoples_speech thumbnail: null tags: - automatic-speech-recognition - speech - audio - Transducer - FastConformer - Conformer - pytorch - NeMo - hf-asr-leaderboard license: cc-by-4.0 widget: - example_title: Librispeech sample 1 src: - example_title: Librispeech sample 2 src: model-index: - name: parakeet-rnnt-0.6b results: - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: AMI (Meetings test) type: edinburghcstr/ami config: ihm split: test args: language: en metrics: - name: Test WER type: wer value: 17.55 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: Earnings-22 type: revdotcom/earnings22 split: test args: language: en metrics: - name: Test WER type: wer value: 14.78 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: GigaSpeech type: speechcolab/gigaspeech split: test args: 
language: en metrics: - name: Test WER type: wer value: 10.07 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: LibriSpeech (clean) type: librispeech_asr config: clean split: test args: language: en metrics: - name: Test WER type: wer value: 1.63 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: LibriSpeech (other) type: librispeech_asr config: other split: test args: language: en metrics: - name: Test WER type: wer value: 3.06 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: SPGI Speech type: kensho/spgispeech config: test split: test args: language: en metrics: - name: Test WER type: wer value: 3.47 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: tedlium-v3 type: LIUM/tedlium config: release1 split: test args: language: en metrics: - name: Test WER type: wer value: 3.86 - task: name: Automatic Speech Recognition type: automatic-speech-recognition dataset: name: Vox Populi type: facebook/voxpopuli config: en split: test args: language: en metrics: - name: Test WER type: wer value: 6.05 - task: type: Automatic Speech Recognition name: automatic-speech-recognition dataset: name: Mozilla Common Voice 9.0 type: mozilla-foundation/common_voice_9_0 config: en split: test args: language: en metrics: - name: Test WER type: wer value: 8.07 metrics: - wer pipeline_tag: automatic-speech-recognition --- # Parakeet RNNT 0.6B (en) is an ASR model that transcribes speech in the lowercase English alphabet. This model is jointly developed by the NVIDIA NeMo and Suno.ai teams. It is an XL version of the FastConformer Transducer [1] model (around 600M parameters). See the model architecture section and NeMo documentation for complete architecture details. ## NVIDIA NeMo: Training To train, fine-tune or play with the model you will need to install NVIDIA NeMo. We recommend you install it after you've installed the latest PyTorch version. ## How to Use this Model The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset. ### Automatically instantiate the model ### Transcribing using Python First, let's get a sample. Then simply do: ### Transcribing many audio files ### Input This model accepts 16000 Hz mono-channel audio (wav files) as input. ### Output This model provides transcribed speech as a string for a given audio sample. ## Model Architecture FastConformer [1] is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling. The model is trained in a multitask setup with a Transducer decoder (RNNT) loss. You may find more information on the details of FastConformer here: Fast-Conformer Model. ## Training The NeMo toolkit [3] was used for training the models for over several hundred epochs. These models are trained with this example script and this base config. The tokenizers for these models were built using the text transcripts of the train set with this script. ### Datasets The model was trained on 64K hours of English speech collected and prepared by NVIDIA NeMo and Suno teams. 
The training dataset consists of a private subset with 40K hours of English speech plus 24K hours from the following public datasets: - Librispeech 960 hours of English speech - Fisher Corpus - Switchboard-1 Dataset - WSJ-0 and WSJ-1 - National Speech Corpus (Part 1, Part 6) - VCTK - VoxPopuli (EN) - Europarl-ASR (EN) - Multilingual Librispeech (MLS EN) - 2,000 hour subset - Mozilla Common Voice (v7.0) - People's Speech - 12,000 hour subset ## Performance The performance of Automatic Speech Recognition models is measured using Word Error Rate (WER). Since this model is trained on multiple domains and a much larger corpus, it will generally perform better at transcribing audio in general. The following table summarizes the performance of the available models in this collection with the Transducer decoder. Performances of the ASR models are reported in terms of Word Error Rate (WER%) with greedy decoding. |**Version**|**Tokenizer**|**Vocabulary Size**|**AMI**|**Earnings-22**|**Giga Speech**|**LS test-clean**|**LS test-other**|**SPGI Speech**|**TEDLIUM-v3**|**Vox Populi**|**Common Voice**| |---------|-----------------------|-----------------|-------|---------------|---------------|-----------------|-----------------|---------------|--------------|--------------|----------------| | 1.22.0 | SentencePiece Unigram | 1024 | 17.55 | 14.78 | 10.07 | 1.63 | 3.06 | 3.47 | 3.86 | 6.05 | 8.07 | These are greedy WER numbers without an external LM. More details on evaluation can be found at the HuggingFace ASR Leaderboard. ## NVIDIA Riva: Deployment NVIDIA Riva is an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. Additionally, Riva provides: * World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours * Best in class accuracy with run-time word boosting (e.g., brand and product names) and customization of acoustic model, language model, and inverse text normalization * Streaming speech recognition, Kubernetes compatible scaling, and enterprise-grade support. Although this model isn’t supported yet by Riva, the list of supported models is here. Check out the Riva live demo. ## References [1] Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition [2] Google Sentencepiece Tokenizer [3] NVIDIA NeMo Toolkit [4] Suno.ai [5] HuggingFace ASR Leaderboard ## License License to use this model is covered by the CC-BY-4.0. By downloading the public and release version of the model, you accept the terms and conditions of the CC-BY-4.0 license.", + "model_explanation_gemini": "Performs automatic speech recognition (ASR) to transcribe English speech into lowercase text using a FastConformer Transducer architecture with ~600M parameters. 
\n\n**Features**: \n- Language: English \n- Architecture: FastConformer Transducer (RNNT) \n- Parameters: ~600M (XL size) \n- Input: 16kHz mono-channel audio (WAV) \n- Output: Lowercase transcribed text \n- Datasets: Trained on LibriSpeech, Fisher Corpus" +} \ No newline at end of file diff --git a/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json b/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json new file mode 100644 index 0000000000000000000000000000000000000000..7250e46afa039b8e3e8487b210169581a037c506 --- /dev/null +++ b/model_data_json/onnx-community_Kokoro-82M-v1.0-ONNX.json @@ -0,0 +1,17 @@ +{ + "model_id": "onnx-community/Kokoro-82M-v1.0-ONNX", + "downloads": 82437, + "tags": [ + "transformers.js", + "onnx", + "style_text_to_speech_2", + "text-to-speech", + "en", + "base_model:hexgrad/Kokoro-82M", + "base_model:quantized:hexgrad/Kokoro-82M", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 library_name: transformers.js language: - en base_model: - hexgrad/Kokoro-82M pipeline_tag: text-to-speech --- # Kokoro TTS Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). ## Table of contents - Usage - JavaScript - Python - Voices/Samples - Quantizations ## Usage ### JavaScript First, install the library from NPM using: You can then generate speech as follows: ### Python Optionally, save the audio to a file: ## Voices/Samples > Life is like a box of chocolates. You never know what you're gonna get. | Name | Nationality | Gender | Sample | | ------------ | ----------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------- | | **af_heart** | American | Female |

\"Image \"diffusers\" \"Github\" \"Website\" \"arXiv\" \"Social\" \"License\"

This is a model card for the Marigold depth model for monocular depth estimation from a single image. The model is fine-tuned from the Stable Diffusion model as described in our CVPR'2024 paper titled \"Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation\". - Play with the interactive Hugging Face Spaces demo: check out how the model works with example images or upload your own. - Use it with diffusers to compute the results with a few lines of code. - Get to the bottom of things with our official codebase. ## Model Details - **Developed by:** Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler. - **Model type:** Generative latent diffusion-based affine-invariant monocular depth estimation from a single image. - **Language:** English. - **License:** Apache License, Version 2.0. - **Model Description:** This model can be used to generate an estimated depth map of an input image. - **Resolution**: Even though any resolution can be processed, the model inherits the base diffusion model's effective resolution of roughly **768** pixels. This means that for optimal predictions, any larger input image should be resized to make the longer side 768 pixels before feeding it into the model. - **Steps and scheduler**: This model was designed for usage with the **DDIM** scheduler and between **10 and 50** denoising steps. It is possible to obtain good predictions with just **one** step by overriding the corresponding setting, either in the scheduler configuration file or in code after the pipeline is loaded, before the first usage. For compatibility reasons we kept this model identical to the paper setting and provided a newer v1-1 model with optimal settings for all possible step configurations. - **Outputs**: - **Affine-invariant depth map**: The predicted values are between 0 and 1, interpolating between the near and far planes of the model's choice. - **Uncertainty map**: Produced only when multiple predictions are ensembled with ensemble size larger than 2. - **Resources for more information:** Project Website, Paper, Code. - **Cite as:**
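For the diffusers route mentioned above, a minimal sketch (assuming the `prs-eth/marigold-depth-v1-0` checkpoint id and diffusers' `MarigoldDepthPipeline`; the input path is a placeholder):

```python
import torch
from diffusers import MarigoldDepthPipeline
from diffusers.utils import load_image

pipe = MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-v1-0", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = load_image("input.jpg")  # placeholder input image
result = pipe(image)             # affine-invariant depth, values in [0, 1]

# Colorize the prediction for inspection.
vis = pipe.image_processor.visualize_depth(result.prediction)
vis[0].save("depth_colored.png")
```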
\n\n**Features:** \n- Monocular depth estimation (single-image input) \n- Diffusion-based, affine-invariant output (values 0–1) \n- Supports resolutions up to ~768px (longer side) \n- Works with DDIM scheduler (10–50 steps, or 1-step override) \n- Optional uncertainty maps via ensemble predictions \n-" +} \ No newline at end of file diff --git a/model_data_json/pyannote_embedding.json b/model_data_json/pyannote_embedding.json new file mode 100644 index 0000000000000000000000000000000000000000..618ba7a5df92563fc27c9323545827ad1b1638ec --- /dev/null +++ b/model_data_json/pyannote_embedding.json @@ -0,0 +1,24 @@ +{ + "model_id": "pyannote/embedding", + "downloads": 468472, + "tags": [ + "pyannote-audio", + "pytorch", + "tensorboard", + "pyannote", + "pyannote-audio-model", + "audio", + "voice", + "speech", + "speaker", + "speaker-recognition", + "speaker-verification", + "speaker-identification", + "speaker-embedding", + "dataset:voxceleb", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-model - audio - voice - speech - speaker - speaker-recognition - speaker-verification - speaker-identification - speaker-embedding datasets: - voxceleb license: mit inference: false extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications using the model. If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.\" extra_gated_fields: Company/university: text Website: text I plan to use this model for (task, type of audio data, etc): text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Speaker embedding Relies on pyannote.audio 2.1: see installation instructions. This model is based on the canonical x-vector TDNN-based architecture, but with filter banks replaced with trainable SincNet features. See []( architecture for implementation details. ## Basic usage Using cosine distance directly, this model reaches 2.8% equal error rate (EER) on VoxCeleb 1 test set. This is without voice activity detection (VAD) nor probabilistic linear discriminant analysis (PLDA). Expect even better results when adding one of those. 
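A minimal sketch of that basic usage (the access token and wav paths are placeholders):

```python
from pyannote.audio import Inference, Model
from scipy.spatial.distance import cdist

model = Model.from_pretrained("pyannote/embedding", use_auth_token="HUGGINGFACE_TOKEN")
inference = Inference(model, window="whole")  # one embedding per whole file

embedding1 = inference("speaker1.wav")
embedding2 = inference("speaker2.wav")

# Smaller cosine distance means the two files are more likely the same speaker.
distance = cdist([embedding1], [embedding2], metric="cosine")[0, 0]
print(distance)
```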
## Advanced usage ### Running on GPU ### Extract embedding from an excerpt ### Extract embeddings using a sliding window ## Citation", + "model_explanation_gemini": "\"Generates speaker embeddings for tasks like speaker recognition, verification, and identification using a TDNN-based architecture with SincNet features.\"\n\n**Features:** \n- X-vector TDNN architecture with trainable SincNet features \n- 2.8% EER on VoxCeleb 1 (without VAD or PLDA) \n- Supports GPU usage, single-excerpt embedding extraction, and sliding-window embeddings \n\n**Comparison:** \nThe description suggests switching to **pyannote" +} \ No newline at end of file diff --git a/model_data_json/pyannote_overlapped-speech-detection.json b/model_data_json/pyannote_overlapped-speech-detection.json new file mode 100644 index 0000000000000000000000000000000000000000..f17510706fa748319a91a43d5f86ba49b3548950 --- /dev/null +++ b/model_data_json/pyannote_overlapped-speech-detection.json @@ -0,0 +1,22 @@ +{ + "model_id": "pyannote/overlapped-speech-detection", + "downloads": 176956, + "tags": [ + "pyannote-audio", + "pyannote", + "pyannote-audio-pipeline", + "audio", + "voice", + "speech", + "speaker", + "overlapped-speech-detection", + "automatic-speech-recognition", + "dataset:ami", + "dataset:dihard", + "dataset:voxconverse", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - overlapped-speech-detection - automatic-speech-recognition datasets: - ami - dihard - voxconverse license: mit extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications using the model. If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.\" extra_gated_fields: Company/university: text Website: text I plan to use this model for (task, type of audio data, etc): text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Overlapped speech detection Relies on pyannote.audio 2.1: see installation instructions. ## Support For commercial enquiries and scientific consulting, please contact me. For technical questions and bug reports, please check pyannote.audio Github repository. 
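For reference, running the overlapped-speech-detection pipeline typically looks like this sketch (audio path and token placeholder are illustrative):

```python
# Sketch: find regions where two or more speakers are active at once.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/overlapped-speech-detection", use_auth_token="HF_TOKEN"
)
output = pipeline("audio.wav")

for speech in output.get_timeline().support():
    # two or more speakers are active between speech.start and speech.end
    print(f"{speech.start:.1f}s - {speech.end:.1f}s")
```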
## Citation", + "model_explanation_gemini": "Detects overlapping speech segments in audio recordings to identify when multiple speakers talk simultaneously.\n\nFeatures: \n- Detects overlapped speech \n- Built on pyannote.audio 2.1 framework \n- Trained on AMI, DIHARD, and VoxConverse datasets \n- MIT licensed \n- Commercial/scientific support available \n\nComparison: \nThe description suggests switching to \"pyannoteAI\" for production use, implying this model may be slower or less optimized than proprietary alternatives," +} \ No newline at end of file diff --git a/model_data_json/pyannote_segmentation-3.0.json b/model_data_json/pyannote_segmentation-3.0.json new file mode 100644 index 0000000000000000000000000000000000000000..b4fdd2315e1a992e1d6cbe65a2c68c96e7e0bdc0 --- /dev/null +++ b/model_data_json/pyannote_segmentation-3.0.json @@ -0,0 +1,24 @@ +{ + "model_id": "pyannote/segmentation-3.0", + "downloads": 11569174, + "tags": [ + "pyannote-audio", + "pytorch", + "pyannote", + "pyannote-audio-model", + "audio", + "voice", + "speech", + "speaker", + "speaker-diarization", + "speaker-change-detection", + "speaker-segmentation", + "voice-activity-detection", + "overlapped-speech-detection", + "resegmentation", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-model - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - speaker-segmentation - voice-activity-detection - overlapped-speech-detection - resegmentation license: mit inference: false extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers improve it further. Though this model uses MIT license and will always remain open-source, we will occasionnally email you about premium models and paid services around pyannote.\" extra_gated_fields: Company/university: text Website: text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 \"Powerset\" speaker segmentation This model ingests 10 seconds of mono audio sampled at 16kHz and outputs speaker diarization as a (num_frames, num_classes) matrix where the 7 classes are _non-speech_, _speaker #1_, _speaker #2_, _speaker #3_, _speakers #1 and #2_, _speakers #1 and #3_, and _speakers #2 and #3_. !Example output The various concepts behind this model are described in details in this paper. It has been trained by Séverin Baroudi with pyannote.audio using the combination of the training sets of AISHELL, AliMeeting, AMI, AVA-AVD, DIHARD, Ego4D, MSDWild, REPERE, and VoxConverse. This companion repository by Alexis Plaquet also provides instructions on how to train or finetune such a model on your own data. ## Requirements 1. Install []( with 2. Accept []( user conditions 3. Create access token at []( ## Usage ### Speaker diarization This model cannot be used to perform speaker diarization of full recordings on its own (it only processes 10s chunks). See pyannote/speaker-diarization-3.0 pipeline that uses an additional speaker embedding model to perform full recording speaker diarization. 
### Voice activity detection ### Overlapped speech detection ## Citations", + "model_explanation_gemini": "\"Performs speaker segmentation by classifying 10s audio chunks into 7 classes (non-speech, individual speakers, and overlapping speaker combinations).\"\n\nFeatures: \n- Processes 10-second mono audio clips at 16kHz \n- Outputs a (num_frames, num_classes) matrix \n- Classifies: \n 1. Non-speech \n 2. Speaker #1 \n 3. Speaker #2 \n 4. Speaker #3 \n 5. Speakers #1+#" +} \ No newline at end of file diff --git a/model_data_json/pyannote_segmentation.json b/model_data_json/pyannote_segmentation.json new file mode 100644 index 0000000000000000000000000000000000000000..ace95373c7db812f7c2448134340df3b3bda3154 --- /dev/null +++ b/model_data_json/pyannote_segmentation.json @@ -0,0 +1,23 @@ +{ + "model_id": "pyannote/segmentation", + "downloads": 9567935, + "tags": [ + "pyannote-audio", + "pytorch", + "pyannote", + "pyannote-audio-model", + "audio", + "voice", + "speech", + "speaker", + "speaker-segmentation", + "voice-activity-detection", + "overlapped-speech-detection", + "resegmentation", + "arxiv:2104.04045", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-model - audio - voice - speech - speaker - speaker-segmentation - voice-activity-detection - overlapped-speech-detection - resegmentation license: mit inference: false extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications using the model. If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.\" extra_gated_fields: Company/university: text Website: text I plan to use this model for (task, type of audio data, etc): text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Speaker segmentation Paper | Demo | Blog post !Example ## Usage Relies on pyannote.audio 2.1.1: see installation instructions. 
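The stripped usage sections that follow boil down to either running one of the derived pipelines or reading raw scores; the latter conventionally looks like this sketch (audio path and token placeholder are illustrative):

```python
# Sketch: raw sliding-window segmentation scores.
from pyannote.audio import Inference, Model

model = Model.from_pretrained("pyannote/segmentation", use_auth_token="HF_TOKEN")
inference = Inference(model)

segmentation = inference("audio.wav")
# pyannote.core.SlidingWindowFeature of shape (num_chunks, num_frames, num_speakers)
print(segmentation.data.shape)
```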
### Voice activity detection ### Overlapped speech detection ### Resegmentation ### Raw scores ## Citation ## Reproducible research In order to reproduce the results of the paper \"End-to-end speaker segmentation for overlap-aware resegmentation \", use with the following hyper-parameters: | Voice activity detection | `onset` | `offset` | `min_duration_on` | `min_duration_off` | | ------------------------ | ------- | -------- | ----------------- | ------------------ | | AMI Mix-Headset | 0.684 | 0.577 | 0.181 | 0.037 | | DIHARD3 | 0.767 | 0.377 | 0.136 | 0.067 | | VoxConverse | 0.767 | 0.713 | 0.182 | 0.501 | | Overlapped speech detection | `onset` | `offset` | `min_duration_on` | `min_duration_off` | | --------------------------- | ------- | -------- | ----------------- | ------------------ | | AMI Mix-Headset | 0.448 | 0.362 | 0.116 | 0.187 | | DIHARD3 | 0.430 | 0.320 | 0.091 | 0.144 | | VoxConverse | 0.587 | 0.426 | 0.337 | 0.112 | | Resegmentation of VBx | `onset` | `offset` | `min_duration_on` | `min_duration_off` | | --------------------- | ------- | -------- | ----------------- | ------------------ | | AMI Mix-Headset | 0.542 | 0.527 | 0.044 | 0.705 | | DIHARD3 | 0.592 | 0.489 | 0.163 | 0.182 | | VoxConverse | 0.537 | 0.724 | 0.410 | 0.563 | Expected outputs (and VBx baseline) are also provided in the sub-directories.", "model_explanation_gemini": "\"Performs speaker segmentation, voice activity detection, overlapped speech detection, and resegmentation in audio data.\"\n\nModel Features: \n- Speaker segmentation \n- Voice activity detection \n- Overlapped speech detection \n- Resegmentation capabilities \n\nComparative Explanation: \nThe description suggests switching to pyannoteAI for production use, implying this model may be slower or less optimized compared to pyannoteAI's commercial offering, while still providing core speaker diarization functionalities. Performance metrics are provided for" +} \ No newline at end of file diff --git a/model_data_json/pyannote_speaker-diarization-3.0.json b/model_data_json/pyannote_speaker-diarization-3.0.json new file mode 100644 index 0000000000000000000000000000000000000000..ecca713341a6f64e1a155fe2a3ff0d2bed96f8c4 --- /dev/null +++ b/model_data_json/pyannote_speaker-diarization-3.0.json @@ -0,0 +1,24 @@ +{ + "model_id": "pyannote/speaker-diarization-3.0", + "downloads": 446935, + "tags": [ + "pyannote-audio", + "pyannote", + "pyannote-audio-pipeline", + "audio", + "voice", + "speech", + "speaker", + "speaker-diarization", + "speaker-change-detection", + "voice-activity-detection", + "overlapped-speech-detection", + "automatic-speech-recognition", + "arxiv:2111.14448", + "arxiv:2012.01477", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - voice-activity-detection - overlapped-speech-detection - automatic-speech-recognition license: mit extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers improve it further. Though this pipeline uses MIT license and will always remain open-source, we will occasionally email you about premium pipelines and paid services around pyannote.\" extra_gated_fields: Company/university: text Website: text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Speaker diarization 3.0 This pipeline has been trained by Séverin Baroudi with pyannote.audio using a combination of the training sets of AISHELL, AliMeeting, AMI, AVA-AVD, DIHARD, Ego4D, MSDWild, REPERE, and VoxConverse. 
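End-to-end use of this pipeline, whose input and output conventions are described next, typically reduces to the sketch below (audio path and token placeholder are illustrative):

```python
# Sketch: full-recording speaker diarization with the 3.0 pipeline.
import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.0", use_auth_token="HF_TOKEN"
)
pipeline.to(torch.device("cuda"))  # optional: move inference to GPU

diarization = pipeline("audio.wav")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```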
It ingests mono audio sampled at 16kHz and outputs speaker diarization as an []( instance: * stereo or multi-channel audio files are automatically downmixed to mono by averaging the channels. * audio files sampled at a different rate are resampled to 16kHz automatically upon loading. ## Requirements 1. Install []( with 2. Accept []( user conditions 3. Accept []( user conditions 4. Create access token at []( ## Usage ### Processing on GPU pipelines run on CPU by default. You can send them to GPU with the following lines: Real-time factor is around 2.5% using one Nvidia Tesla V100 SXM2 GPU (for the neural inference part) and one Intel Cascade Lake 6248 CPU (for the clustering part). In other words, it takes approximately 1.5 minutes to process a one hour conversation. ### Processing from memory Pre-loading audio files in memory may result in faster processing: ### Monitoring progress Hooks are available to monitor the progress of the pipeline: ### Controlling the number of speakers In case the number of speakers is known in advance, one can use the option: One can also provide lower and/or upper bounds on the number of speakers using and options: ## Benchmark This pipeline has been benchmarked on a large collection of datasets. Processing is fully automatic: * no manual voice activity detection (as is sometimes the case in the literature) * no manual number of speakers (though it is possible to provide it to the pipeline) * no fine-tuning of the internal models nor tuning of the pipeline hyper-parameters to each dataset ... with the least forgiving diarization error rate (DER) setup (named *\"Full\"* in this paper): * no forgiveness collar * evaluation of overlapped speech | Benchmark | DER% | FA% | Miss% | Conf% | Expected output | File-level evaluation | | ------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------- | --------------------------- | ---------------------------------- | ----------------------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------- | | AISHELL-4 | 12.3 | 3.8 | 4.4 | 4.1 | RTTM | eval | | AliMeeting (*channel 1*) | 24.3 | 4.4 | 10.0 | 9.9 | RTTM | eval | | AMI (*headset mix,* *only_words*) | 19.0 | 3.6 | 9.5 | 5.9 | RTTM | eval | | AMI (*array1, channel 1,* *only_words)* | 22.2 | 3.8 | 11.2 | 7.3 | RTTM | eval | | AVA-AVD | 49.1 | 10.8 | 15.7| 22.5 | RTTM | eval | | DIHARD 3 (*Full*) | 21.7 | 6.2 | 8.1 | 7.3 | RTTM | eval | | MSDWild | 24.6 | 5.8 | 8.0 | 10.7 | RTTM | eval | | REPERE (*phase 2*) | 7.8 | 1.8 | 2.6 | 3.5 | RTTM | eval | | VoxConverse (*v0.3*) | 11.3 | 4.1 | 3.4 | 3.8 | RTTM | eval | ## Citations", + "model_explanation_gemini": "\"Performs speaker diarization by identifying and segmenting different speakers in mono 16kHz audio, with capabilities for voice activity detection, speaker change detection, and overlapped speech detection.\"\n\nModel Features: \n- Processes mono 16kHz audio (auto-downmixes/resamples input) \n- Outputs speaker diarization as RTTM format \n- Supports GPU acceleration (~2.5% real-time factor on V100) \n- Allows manual control of speaker count bounds \n- Automatic" +} \ No newline at end of file diff --git a/model_data_json/pyannote_speaker-diarization-3.1.json 
b/model_data_json/pyannote_speaker-diarization-3.1.json new file mode 100644 index 0000000000000000000000000000000000000000..de4fee3c1d1653d5c1f09c55a07703cbf222b63c --- /dev/null +++ b/model_data_json/pyannote_speaker-diarization-3.1.json @@ -0,0 +1,25 @@ +{ + "model_id": "pyannote/speaker-diarization-3.1", + "downloads": 10614517, + "tags": [ + "pyannote-audio", + "pyannote", + "pyannote-audio-pipeline", + "audio", + "voice", + "speech", + "speaker", + "speaker-diarization", + "speaker-change-detection", + "voice-activity-detection", + "overlapped-speech-detection", + "automatic-speech-recognition", + "arxiv:2111.14448", + "arxiv:2012.01477", + "license:mit", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - voice-activity-detection - overlapped-speech-detection - automatic-speech-recognition license: mit extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers improve it further. Though this pipeline uses MIT license and will always remain open-source, we will occasionally email you about premium pipelines and paid services around pyannote.\" extra_gated_fields: Company/university: text Website: text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Speaker diarization 3.1 This pipeline is the same as []( except it removes the problematic use of . Both speaker segmentation and embedding now run in pure PyTorch. This should ease deployment and possibly speed up inference. It requires pyannote.audio version 3.1 or higher. It ingests mono audio sampled at 16kHz and outputs speaker diarization as an []( instance: - stereo or multi-channel audio files are automatically downmixed to mono by averaging the channels. - audio files sampled at a different rate are resampled to 16kHz automatically upon loading. ## Requirements 1. Install []( with 2. Accept []( user conditions 3. Accept []( user conditions 4. Create access token at []( ## Usage ### Processing on GPU pipelines run on CPU by default. You can send them to GPU with the following lines: ### Processing from memory Pre-loading audio files in memory may result in faster processing: ### Monitoring progress Hooks are available to monitor the progress of the pipeline: ### Controlling the number of speakers In case the number of speakers is known in advance, one can use the option: One can also provide lower and/or upper bounds on the number of speakers using and options: ## Benchmark This pipeline has been benchmarked on a large collection of datasets. Processing is fully automatic: - no manual voice activity detection (as is sometimes the case in the literature) - no manual number of speakers (though it is possible to provide it to the pipeline) - no fine-tuning of the internal models nor tuning of the pipeline hyper-parameters to each dataset ... 
with the least forgiving diarization error rate (DER) setup (named _\"Full\"_ in this paper): - no forgiveness collar - evaluation of overlapped speech | Benchmark | DER% | FA% | Miss% | Conf% | Expected output | File-level evaluation | | ------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------- | --------------------------- | ---------------------------------- | ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | | AISHELL-4 | 12.2 | 3.8 | 4.4 | 4.0 | RTTM | eval | | AliMeeting (_channel 1_) | 24.4 | 4.4 | 10.0 | 10.0 | RTTM | eval | | AMI (_headset mix,_ _only_words_) | 18.8 | 3.6 | 9.5 | 5.7 | RTTM | eval | | AMI (_array1, channel 1,_ _only_words)_ | 22.4 | 3.8 | 11.2 | 7.5 | RTTM | eval | | AVA-AVD | 50.0 | 10.8 | 15.7 | 23.4 | RTTM | eval | | DIHARD 3 (_Full_) | 21.7 | 6.2 | 8.1 | 7.3 | RTTM | eval | | MSDWild | 25.3 | 5.8 | 8.0 | 11.5 | RTTM | eval | | REPERE (_phase 2_) | 7.8 | 1.8 | 2.6 | 3.5 | RTTM | eval | | VoxConverse (_v0.3_) | 11.3 | 4.1 | 3.4 | 3.8 | RTTM | eval | ## Citations", + "model_explanation_gemini": "\"Performs speaker diarization by identifying and segmenting different speakers in 16kHz mono audio, with capabilities for voice activity detection, speaker change detection, and overlapped speech detection.\"\n\nModel Features: \n- Speaker diarization (segmenting speakers in audio) \n- Voice activity detection \n- Speaker change detection \n- Overlapped speech detection \n- Automatic downmixing of multi-channel audio to mono \n- Resampling to 16kHz if needed \n- Pure PyTorch implementation" +} \ No newline at end of file diff --git a/model_data_json/pyannote_speaker-diarization.json b/model_data_json/pyannote_speaker-diarization.json new file mode 100644 index 0000000000000000000000000000000000000000..147613c77a819f7bf9326db75dc803a984c9b4b7 --- /dev/null +++ b/model_data_json/pyannote_speaker-diarization.json @@ -0,0 +1,31 @@ +{ + "model_id": "pyannote/speaker-diarization", + "downloads": 864999, + "tags": [ + "pyannote-audio", + "pyannote", + "pyannote-audio-pipeline", + "audio", + "voice", + "speech", + "speaker", + "speaker-diarization", + "speaker-change-detection", + "voice-activity-detection", + "overlapped-speech-detection", + "automatic-speech-recognition", + "dataset:ami", + "dataset:dihard", + "dataset:voxconverse", + "dataset:aishell", + "dataset:repere", + "dataset:voxceleb", + "arxiv:2012.01477", + "arxiv:2110.07058", + "arxiv:2005.08072", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - voice-activity-detection - overlapped-speech-detection - automatic-speech-recognition datasets: - ami - dihard - voxconverse - aishell - repere - voxceleb license: mit extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications using the model. 
If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.\" extra_gated_fields: Company/university: text Website: text I plan to use this model for (task, type of audio data, etc): text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Speaker diarization Relies on pyannote.audio 2.1.1: see installation instructions. ## TL;DR ## Advanced usage In case the number of speakers is known in advance, one can use the option: One can also provide lower and/or upper bounds on the number of speakers using and options: ## Benchmark ### Real-time factor Real-time factor is around 2.5% using one Nvidia Tesla V100 SXM2 GPU (for the neural inference part) and one Intel Cascade Lake 6248 CPU (for the clustering part). In other words, it takes approximately 1.5 minutes to process a one-hour conversation. ### Accuracy This pipeline is benchmarked on a growing collection of datasets. Processing is fully automatic: * no manual voice activity detection (as is sometimes the case in the literature) * no manual number of speakers (though it is possible to provide it to the pipeline) * no fine-tuning of the internal models nor tuning of the pipeline hyper-parameters to each dataset ... with the least forgiving diarization error rate (DER) setup (named *\"Full\"* in this paper): * no forgiveness collar * evaluation of overlapped speech | Benchmark | DER% | FA% | Miss% | Conf% | Expected output | File-level evaluation | | --- | --- | --- | --- | --- | --- | --- | | AISHELL-4 | 14.09 | 5.17 | 3.27 | 5.65 | RTTM | eval | | Albayzin (*RTVE 2022*) | 25.60 | 5.58 | 6.84 | 13.18 | RTTM | eval | | AliMeeting (*channel 1*) | 27.42 | 4.84 | 14.00 | 8.58 | RTTM | eval | | AMI (*headset mix,* *only_words*) | 18.91 | 4.48 | 9.51 | 4.91 | RTTM | eval | | AMI (*array1, channel 1,* *only_words)* | 27.12 | 4.11 | 17.78 | 5.23 | RTTM | eval | | CALLHOME (*part2*) | 32.37 | 6.30 | 13.72 | 12.35 | RTTM | eval | | DIHARD 3 (*Full*) | 26.94 | 10.50 | 8.41 | 8.03 | RTTM | eval | | Ego4D *v1 (validation)* | 63.99 | 3.91 | 44.42 | 15.67 | RTTM | eval | | REPERE (*phase 2*) | 8.17 | 2.23 | 2.49 | 3.45 | RTTM | eval | | This American Life | 20.82 | 2.03 | 11.89 | 6.90 | RTTM | eval | | VoxConverse (*v0.3*) | 11.24 | 4.42 | 2.88 | 3.94 | RTTM | eval | ## Technical report This report describes the main principles behind version of pyannote.audio speaker diarization pipeline. It also provides recipes explaining how to adapt the pipeline to your own set of annotated data. In particular, those are applied to the above benchmark and consistently lead to significant performance improvement over the above out-of-the-box performance. 
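The option names elided from the advanced-usage paragraph above are, in current pyannote.audio releases, plain keyword arguments; a sketch (audio path and token placeholder are illustrative):

```python
# Sketch: constraining the number of speakers.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization", use_auth_token="HF_TOKEN"
)

# number of speakers known in advance
diarization = pipeline("audio.wav", num_speakers=2)

# or only lower and/or upper bounds
diarization = pipeline("audio.wav", min_speakers=2, max_speakers=5)
```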
## Citations", + "model_explanation_gemini": "\"Performs speaker diarization by identifying and segmenting different speakers in audio, including voice activity detection, speaker change detection, and overlapped speech detection.\"\n\nFeatures: \n- Speaker diarization \n- Voice activity detection \n- Speaker change detection \n- Overlapped speech detection \n- Supports known speaker count constraints \n- Benchmarked on multiple datasets (AMI, DIHARD, VoxConverse, etc.) \n- Real-time processing (2.5% RTF on V100" +} \ No newline at end of file diff --git a/model_data_json/pyannote_voice-activity-detection.json b/model_data_json/pyannote_voice-activity-detection.json new file mode 100644 index 0000000000000000000000000000000000000000..3835ab9e8caa6ce404c15b4808958e4c62ef8ab5 --- /dev/null +++ b/model_data_json/pyannote_voice-activity-detection.json @@ -0,0 +1,22 @@ +{ + "model_id": "pyannote/voice-activity-detection", + "downloads": 8061538, + "tags": [ + "pyannote-audio", + "pyannote", + "pyannote-audio-pipeline", + "audio", + "voice", + "speech", + "speaker", + "voice-activity-detection", + "automatic-speech-recognition", + "dataset:ami", + "dataset:dihard", + "dataset:voxconverse", + "license:mit", + "region:us" + ], + "description": "--- tags: - pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - voice-activity-detection - automatic-speech-recognition datasets: - ami - dihard - voxconverse license: mit extra_gated_prompt: \"The collected information will help acquire a better knowledge of pyannote.audio userbase and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications using the model. If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.\" extra_gated_fields: Company/university: text Website: text I plan to use this model for (task, type of audio data, etc): text --- Using this open-source model in production? Consider switching to pyannoteAI for better and faster options. # 🎹 Voice activity detection Relies on pyannote.audio 2.1: see installation instructions. ## Citation", + "model_explanation_gemini": "Detects voice activity in audio to identify speech segments.\n\nFeatures: \n- Voice activity detection \n- Supports datasets: AMI, DIHARD, VoxConverse \n- MIT license \n- Requires pyannote.audio 2.1 \n\nComparison: \nThe description suggests switching to \"pyannoteAI\" for production use, implying it offers superior performance or efficiency compared to this open-source model." 
+} \ No newline at end of file diff --git a/model_data_json/qanastek_51-languages-classifier.json b/model_data_json/qanastek_51-languages-classifier.json new file mode 100644 index 0000000000000000000000000000000000000000..8fe7f25bd12dc93273aeeed307c12b5cec41f0de --- /dev/null +++ b/model_data_json/qanastek_51-languages-classifier.json @@ -0,0 +1,18 @@ +{ + "model_id": "qanastek/51-languages-classifier", + "downloads": 83055, + "tags": [ + "transformers", + "pytorch", + "Transformers", + "text-classification", + "multi-class-classification", + "dataset:qanastek/MASSIVE", + "arxiv:1911.02116", + "license:cc-by-4.0", + "endpoints_compatible", + "region:us" + ], + "description": "--- tags: - Transformers - text-classification - multi-class-classification languages: - af-ZA - am-ET - ar-SA - az-AZ - bn-BD - cy-GB - da-DK - de-DE - el-GR - en-US - es-ES - fa-IR - fi-FI - fr-FR - he-IL - hi-IN - hu-HU - hy-AM - id-ID - is-IS - it-IT - ja-JP - jv-ID - ka-GE - km-KH - kn-IN - ko-KR - lv-LV - ml-IN - mn-MN - ms-MY - my-MM - nb-NO - nl-NL - pl-PL - pt-PT - ro-RO - ru-RU - sl-SL - sq-AL - sv-SE - sw-KE - ta-IN - te-IN - th-TH - tl-PH - tr-TR - ur-PK - vi-VN - zh-CN - zh-TW multilinguality: - af-ZA - am-ET - ar-SA - az-AZ - bn-BD - cy-GB - da-DK - de-DE - el-GR - en-US - es-ES - fa-IR - fi-FI - fr-FR - he-IL - hi-IN - hu-HU - hy-AM - id-ID - is-IS - it-IT - ja-JP - jv-ID - ka-GE - km-KH - kn-IN - ko-KR - lv-LV - ml-IN - mn-MN - ms-MY - my-MM - nb-NO - nl-NL - pl-PL - pt-PT - ro-RO - ru-RU - sl-SL - sq-AL - sv-SE - sw-KE - ta-IN - te-IN - th-TH - tl-PH - tr-TR - ur-PK - vi-VN - zh-CN - zh-TW datasets: - qanastek/MASSIVE widget: - text: \"wake me up at five am this week\" - text: \"je veux écouter la chanson de jacques brel encore une fois\" - text: \"quiero escuchar la canción de arijit singh una vez más\" - text: \"olly onde é que á um parque por perto onde eu possa correr\" - text: \"פרק הבא בפודקאסט בבקשה\" - text: \"亚马逊股价\" - text: \"найди билет на поезд в санкт-петербург\" license: cc-by-4.0 --- **People Involved** * LABRAK Yanis (1) **Affiliations** 1. LIA, NLP team, Avignon University, Avignon, France. ## Model XLM-Roberta : Paper : Unsupervised Cross-lingual Representation Learning at Scale ## Demo: How to use in HuggingFace Transformers Pipeline Requires transformers: Outputs: ## Training data MASSIVE is a parallel dataset of > 1M utterances across 51 languages with annotations for the Natural Language Understanding tasks of intent prediction and slot annotation. Utterances span 60 intents and include 55 slot types. MASSIVE was created by localizing the SLURP dataset, composed of general Intelligent Voice Assistant single-shot interactions. ### Languages The model is capable of distinguishing the 51 languages listed above. ## Evaluation results Keywords: language identification; multilingual; classification", + "model_explanation_gemini": "Classifies text into one of 51 languages using multilingual XLM-Roberta trained on the MASSIVE dataset. \n\n**Features:** \n- **Task:** Multi-class text classification (language identification) \n- **Languages Supported:** 51 (e.g., English, French, Spanish, Chinese, Arabic, etc.) 
\n- **Model Architecture:** XLM-Roberta (cross-lingual representation learning) \n- **Training Data:** MASSIVE dataset (>1M utterances," +} \ No newline at end of file diff --git a/model_data_json/sentence-transformers_all-MiniLM-L6-v1.json b/model_data_json/sentence-transformers_all-MiniLM-L6-v1.json new file mode 100644 index 0000000000000000000000000000000000000000..ba1332476b0c49c94e484be2c3d9fd50a387f714 --- /dev/null +++ b/model_data_json/sentence-transformers_all-MiniLM-L6-v1.json @@ -0,0 +1,28 @@ +{ + "model_id": "sentence-transformers/all-MiniLM-L6-v1", + "downloads": 72497, + "tags": [ + "sentence-transformers", + "pytorch", + "onnx", + "safetensors", + "openvino", + "bert", + "feature-extraction", + "sentence-similarity", + "transformers", + "en", + "arxiv:1904.06472", + "arxiv:2102.07033", + "arxiv:2104.08727", + "arxiv:1704.05179", + "arxiv:1810.09305", + "license:apache-2.0", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en license: apache-2.0 library_name: sentence-transformers tags: - sentence-transformers - feature-extraction - sentence-similarity - transformers pipeline_tag: sentence-similarity new_version: sentence-transformers/all-MiniLM-L6-v2 --- # all-MiniLM-L6-v1 This is a sentence-transformers model: It maps sentences & paragraphs to a 384 dimensional dense vector space and can be used for tasks like clustering or semantic search. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: ## Usage (HuggingFace Transformers) Without sentence-transformers, you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings. ------ ## Background The project aims to train sentence embedding models on very large sentence-level datasets using a self-supervised contrastive learning objective. We used the pretrained []( model and fine-tuned it on a dataset of 1B sentence pairs. We use a contrastive learning objective: given a sentence from the pair, the model should predict which one of a set of randomly sampled other sentences was actually paired with it in our dataset. We developed this model during the Community week using JAX/Flax for NLP & CV, organized by Hugging Face. We developed this model as part of the project: Train the Best Sentence Embedding Model Ever with 1B Training Pairs. We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Google's Flax, JAX, and Cloud team members about efficient deep learning frameworks. ## Intended uses Our model is intended to be used as a sentence and short paragraph encoder. Given an input text, it outputs a vector which captures the semantic information. The sentence vector may be used for information retrieval, clustering or sentence similarity tasks. By default, input text longer than 128 word pieces is truncated. ## Training procedure ### Pre-training We use the pretrained []( model. Please refer to the model card for more detailed information about the pre-training procedure. ### Fine-tuning We fine-tune the model using a contrastive objective. Formally, we compute the cosine similarity for each possible sentence pair in the batch. We then apply the cross entropy loss by comparing with true pairs (see the code sketch below). #### Hyper parameters We trained our model on a TPU v3-8. 
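A minimal sketch of the in-batch contrastive objective described above (not the authors' training script; the scale factor is an illustrative choice):

```python
# Sketch: in-batch contrastive loss over cosine similarities.
import torch
import torch.nn.functional as F

def contrastive_loss(anchors: torch.Tensor, positives: torch.Tensor,
                     scale: float = 20.0) -> torch.Tensor:
    """anchors/positives: (batch, dim) embeddings of true sentence pairs."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    scores = scale * a @ p.T                # cosine similarity of every pair in the batch
    labels = torch.arange(len(scores), device=scores.device)
    return F.cross_entropy(scores, labels)  # true pairs sit on the diagonal
```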
We train the model for 100k steps using a batch size of 1024 (128 per TPU core). We use a learning-rate warm-up of 500 steps. The sequence length was limited to 128 tokens. We used the AdamW optimizer with a 2e-5 learning rate. The full training script is accessible in this current repository: . #### Training data We use a concatenation of multiple datasets to fine-tune our model. The total number of sentence pairs is above 1 billion. We sampled each dataset with a weighted probability; the configuration is detailed in the file. | Dataset | Paper | Number of training tuples | |--------------------------------------------------------|:----------------------------------------:|:--------------------------:| | Reddit comments (2015-2018) | paper | 726,484,430 | | S2ORC Citation pairs (Abstracts) | paper | 116,288,806 | | WikiAnswers Duplicate question pairs | paper | 77,427,422 | | PAQ (Question, Answer) pairs | paper | 64,371,441 | | S2ORC Citation pairs (Titles) | paper | 52,603,982 | | S2ORC (Title, Abstract) | paper | 41,769,185 | | Stack Exchange (Title, Body) pairs | - | 25,316,456 | | MS MARCO triplets | paper | 9,144,553 | | GOOAQ: Open Question Answering with Diverse Answer Types | paper | 3,012,496 | | Yahoo Answers (Title, Answer) | paper | 1,198,260 | | Code Search | - | 1,151,414 | | COCO Image captions | paper | 828,395 | | SPECTER citation triplets | paper | 684,100 | | Yahoo Answers (Question, Answer) | paper | 681,164 | | Yahoo Answers (Title, Question) | paper | 659,896 | | SearchQA | paper | 582,261 | | Eli5 | paper | 325,475 | | Flickr 30k | paper | 317,695 | | Stack Exchange Duplicate questions (titles) | | 304,525 | | AllNLI (SNLI and MultiNLI) | paper SNLI, paper MultiNLI | 277,230 | | Stack Exchange Duplicate questions (bodies) | | 250,519 | | Stack Exchange Duplicate questions (titles+bodies) | | 250,460 | | Sentence Compression | paper | 180,000 | | Wikihow | paper | 128,542 | | Altlex | paper | 112,696 | | Quora Question Triplets | - | 103,663 | | Simple Wikipedia | paper | 102,225 | | Natural Questions (NQ) | paper | 100,231 | | SQuAD2.0 | paper | 87,599 | | TriviaQA | - | 73,346 | | **Total** | | **1,124,818,467** |", "model_explanation_gemini": "Maps sentences and paragraphs to 384-dimensional vectors for tasks like clustering and semantic search.\n\nFeatures: \n- Sentence/paragraph embedding into 384D vector space \n- Supports clustering, semantic search, and similarity tasks \n- Trained on 1B+ sentence pairs using contrastive learning \n- Inputs truncated at 128 word pieces \n- Fine-tuned from a pretrained model \n\nComparison: \nThis model is smaller (L6) and faster than larger alternatives while maintaining competitive performance for" +} \ No newline at end of file diff --git a/model_data_json/sentence-transformers_sentence-t5-base.json b/model_data_json/sentence-transformers_sentence-t5-base.json new file mode 100644 index 0000000000000000000000000000000000000000..7a82d8a498b9cfc6e9b61543c1396df04423fe5f --- /dev/null +++ b/model_data_json/sentence-transformers_sentence-t5-base.json @@ -0,0 +1,21 @@ +{ + "model_id": "sentence-transformers/sentence-t5-base", + "downloads": 83144, + "tags": [ + "sentence-transformers", + "pytorch", + "rust", + "safetensors", + "t5", + "feature-extraction", + "sentence-similarity", + "en", + "arxiv:2108.08877", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en license: apache-2.0 library_name: sentence-transformers tags: - 
sentence-transformers - feature-extraction - sentence-similarity pipeline_tag: sentence-similarity --- # sentence-transformers/sentence-t5-base This is a sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space. The model works well for sentence similarity tasks, but doesn't perform that well for semantic search tasks. This model was converted from the Tensorflow model st5-base-1 to PyTorch. When using this model, have a look at the publication: Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models. The tfhub model and this PyTorch model can produce slightly different embeddings, however, when run on the same benchmarks, they produce identical results. The model uses only the encoder from a T5-base model. The weights are stored in FP16. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: The model requires sentence-transformers version 2.2.0 or newer. ## Citing & Authors If you find this model helpful, please cite the respective publication: Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models", + "model_explanation_gemini": "Maps sentences and paragraphs to 768-dimensional vectors for sentence similarity tasks but performs poorly for semantic search.\n\nFeatures: \n- Converts text to 768-dimensional dense vectors \n- Optimized for sentence similarity tasks \n- Based on T5-base encoder (weights in FP16) \n- Converted from TensorFlow to PyTorch \n- Requires sentence-transformers ≥2.2.0 \n\nComparison: \nUnlike some other sentence-transformers models, this one excels at sentence similarity but underper" +} \ No newline at end of file diff --git a/model_data_json/sentence-transformers_sentence-t5-xl.json b/model_data_json/sentence-transformers_sentence-t5-xl.json new file mode 100644 index 0000000000000000000000000000000000000000..fe0a70b90c3105ecc1cc3105fb9f9fd3a0ccf502 --- /dev/null +++ b/model_data_json/sentence-transformers_sentence-t5-xl.json @@ -0,0 +1,20 @@ +{ + "model_id": "sentence-transformers/sentence-t5-xl", + "downloads": 76312, + "tags": [ + "sentence-transformers", + "pytorch", + "safetensors", + "t5", + "feature-extraction", + "sentence-similarity", + "en", + "arxiv:2108.08877", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en license: apache-2.0 library_name: sentence-transformers tags: - sentence-transformers - feature-extraction - sentence-similarity pipeline_tag: sentence-similarity --- # sentence-transformers/sentence-t5-xl This is a sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space. The model works well for sentence similarity tasks, but doesn't perform that well for semantic search tasks. This model was converted from the Tensorflow model st5-3b-1 to PyTorch. When using this model, have a look at the publication: Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models. The tfhub model and this PyTorch model can produce slightly different embeddings, however, when run on the same benchmarks, they produce identical results. The model uses only the encoder from a T5-3B model. The weights are stored in FP16. ## Usage (Sentence-Transformers) Using this model becomes easy when you have sentence-transformers installed: Then you can use the model like this: The model requires sentence-transformers version 2.2.0 or newer. 
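The stripped usage snippet conventionally amounts to a few lines with the sentence-transformers package; a sketch (the sentences are illustrative):

```python
# Sketch: encoding and comparing sentences with sentence-t5-xl.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/sentence-t5-xl")

embeddings = model.encode(["How is the weather today?", "What's the weather like?"])
print(embeddings.shape)                            # (2, 768)
print(util.cos_sim(embeddings[0], embeddings[1]))  # cosine sentence similarity
```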
## Citing & Authors If you find this model helpful, please cite the respective publication: Sentence-T5: Scalable sentence encoders from pre-trained text-to-text models", + "model_explanation_gemini": "Maps sentences and paragraphs to 768-dimensional vectors for sentence similarity tasks, but is less effective for semantic search. \n\n**Features:** \n- Converts text to dense 768-dimensional embeddings \n- Optimized for sentence similarity \n- Based on T5-3B encoder (FP16 weights) \n- Converted from TensorFlow (st5-3b-1) to PyTorch \n\n**Comparison:** \nPerforms similarly to its TensorFlow counterpart in benchmarks but may produce slightly different" +} \ No newline at end of file diff --git a/model_data_json/sergeyzh_rubert-tiny-turbo.json b/model_data_json/sergeyzh_rubert-tiny-turbo.json new file mode 100644 index 0000000000000000000000000000000000000000..994de23fc89026b4f436382cc89975c2e27f6945 --- /dev/null +++ b/model_data_json/sergeyzh_rubert-tiny-turbo.json @@ -0,0 +1,30 @@ +{ + "model_id": "sergeyzh/rubert-tiny-turbo", + "downloads": 76818, + "tags": [ + "sentence-transformers", + "safetensors", + "bert", + "feature-extraction", + "russian", + "pretraining", + "embeddings", + "tiny", + "sentence-similarity", + "transformers", + "mteb", + "ru", + "dataset:IlyaGusev/gazeta", + "dataset:zloelias/lenta-ru", + "base_model:cointegrated/rubert-tiny2", + "base_model:finetune:cointegrated/rubert-tiny2", + "license:mit", + "model-index", + "autotrain_compatible", + "text-embeddings-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: - ru pipeline_tag: sentence-similarity tags: - russian - pretraining - embeddings - tiny - feature-extraction - sentence-similarity - sentence-transformers - transformers - mteb datasets: - IlyaGusev/gazeta - zloelias/lenta-ru license: mit base_model: cointegrated/rubert-tiny2 model-index: - name: sergeyzh/rubert-tiny-turbo results: - dataset: config: default name: MTEB AILACasedocs (default) revision: 4106e6bcc72e0698d714ea8b101355e3e238431a split: test type: mteb/AILA_casedocs metrics: - type: main_score value: 7.432999999999999 - type: map_at_1 value: 0.604 - type: map_at_10 value: 3.8989999999999996 - type: map_at_100 value: 7.89 - type: map_at_1000 value: 8.417 - type: map_at_20 value: 5.007000000000001 - type: map_at_3 value: 2.688 - type: map_at_5 value: 3.0380000000000003 - type: mrr_at_1 value: 6.0 - type: mrr_at_10 value: 11.799999999999999 - type: mrr_at_100 value: 14.417998426795965 - type: mrr_at_1000 value: 14.474056627618499 - type: mrr_at_20 value: 13.017532467532467 - type: mrr_at_3 value: 10.333333333333334 - type: mrr_at_5 value: 10.733333333333333 - type: nauc_map_at_1000_diff1 value: -18.649405381116548 - type: nauc_map_at_1000_max value: 53.92467833877199 - type: nauc_map_at_1000_std value: -37.567628121407296 - type: nauc_map_at_100_diff1 value: -19.053926237591206 - type: nauc_map_at_100_max value: 53.442907236002725 - type: nauc_map_at_100_std value: -37.310817568902884 - type: nauc_map_at_10_diff1 value: -13.464050841785403 - type: nauc_map_at_10_max value: 48.093886298979946 - type: nauc_map_at_10_std value: -34.85388157835729 - type: nauc_map_at_1_diff1 value: -13.741863044507388 - type: nauc_map_at_1_max value: 88.80266056441289 - type: nauc_map_at_1_std value: -52.44805080502242 - type: nauc_map_at_20_diff1 value: -14.561491138058782 - type: nauc_map_at_20_max value: 48.97477701904 - type: nauc_map_at_20_std value: -31.218577996781537 - type: nauc_map_at_3_diff1 value: -15.370170931276068 - type: 
nauc_map_at_3_max value: 53.443631887225486 - type: nauc_map_at_3_std value: -40.92344513873499 - type: nauc_map_at_5_diff1 value: -12.899827975508286 - type: nauc_map_at_5_max value: 56.55724779187716 - type: nauc_map_at_5_std value: -38.50107328981899 - type: nauc_mrr_at_1000_diff1 value: -20.480388426956775 - type: nauc_mrr_at_1000_max value: 59.34434186773745 - type: nauc_mrr_at_1000_std value: -38.78219708358511 - type: nauc_mrr_at_100_diff1 value: -20.733217227513638 - type: nauc_mrr_at_100_max value: 59.338571965753026 - type: nauc_mrr_at_100_std value: -38.905241386083524 - type: nauc_mrr_at_10_diff1 value: -23.191503817950903 - type: nauc_mrr_at_10_max value: 59.40585262343663 - type: nauc_mrr_at_10_std value: -39.558082853802894 - type: nauc_mrr_at_1_diff1 value: -18.978624452195685 - type: nauc_mrr_at_1_max value: 88.73088274751811 - type: nauc_mrr_at_1_std value: -52.46400143099903 - type: nauc_mrr_at_20_diff1 value: -20.110327257289537 - type: nauc_mrr_at_20_max value: 57.24590011894607 - type: nauc_mrr_at_20_std value: -36.76057923211494 - type: nauc_mrr_at_3_diff1 value: -20.292924276357084 - type: nauc_mrr_at_3_max value: 62.92624417852826 - type: nauc_mrr_at_3_std value: -42.31284612573441 - type: nauc_mrr_at_5_diff1 value: -22.088780368608298 - type: nauc_mrr_at_5_max value: 61.62928734634482 - type: nauc_mrr_at_5_std value: -38.47155384792127 - type: nauc_ndcg_at_1000_diff1 value: -21.96644342707332 - type: nauc_ndcg_at_1000_max value: 54.04115629470727 - type: nauc_ndcg_at_1000_std value: -38.60954619686922 - type: nauc_ndcg_at_100_diff1 value: -28.508933576201116 - type: nauc_ndcg_at_100_max value: 53.62925134001747 - type: nauc_ndcg_at_100_std value: -41.66742945815351 - type: nauc_ndcg_at_10_diff1 value: -19.22314681419278 - type: nauc_ndcg_at_10_max value: 44.88305374351992 - type: nauc_ndcg_at_10_std value: -32.86086137849654 - type: nauc_ndcg_at_1_diff1 value: -18.978624452195685 - type: nauc_ndcg_at_1_max value: 88.73088274751811 - type: nauc_ndcg_at_1_std value: -52.46400143099903 - type: nauc_ndcg_at_20_diff1 value: -14.037813797353552 - type: nauc_ndcg_at_20_max value: 43.01748289241327 - type: nauc_ndcg_at_20_std value: -23.548077008049674 - type: nauc_ndcg_at_3_diff1 value: -19.9659903984576 - type: nauc_ndcg_at_3_max value: 64.99817864354436 - type: nauc_ndcg_at_3_std value: -45.246163550721796 - type: nauc_ndcg_at_5_diff1 value: -20.389688306447788 - type: nauc_ndcg_at_5_max value: 61.370293646369454 - type: nauc_ndcg_at_5_std value: -39.9134710853091 - type: nauc_precision_at_1000_diff1 value: -26.69952361901621 - type: nauc_precision_at_1000_max value: 46.40932456102013 - type: nauc_precision_at_1000_std value: -37.38094677778857 - type: nauc_precision_at_100_diff1 value: -29.692268260058146 - type: nauc_precision_at_100_max value: 49.265913223173584 - type: nauc_precision_at_100_std value: -41.45888232985447 - type: nauc_precision_at_10_diff1 value: -20.974428245377048 - type: nauc_precision_at_10_max value: 53.924262890679564 - type: nauc_precision_at_10_std value: -35.74456192649867 - type: nauc_precision_at_1_diff1 value: -18.978624452195685 - type: nauc_precision_at_1_max value: 88.73088274751811 - type: nauc_precision_at_1_std value: -52.46400143099903 - type: nauc_precision_at_20_diff1 value: -23.03848763224966 - type: nauc_precision_at_20_max value: 51.19001778609016 - type: nauc_precision_at_20_std value: -33.25265416139501 - type: nauc_precision_at_3_diff1 value: -19.497362250879267 - type: nauc_precision_at_3_max value: 64.71277842907384 - 
type: nauc_precision_at_3_std value: -44.512016412661204 - type: nauc_precision_at_5_diff1 value: -18.918918918918912 - type: nauc_precision_at_5_max value: 64.89456489456494 - type: nauc_precision_at_5_std value: -37.37960880818024 - type: nauc_recall_at_1000_diff1 value: .nan - type: nauc_recall_at_1000_max value: .nan - type: nauc_recall_at_1000_std value: .nan - type: nauc_recall_at_100_diff1 value: -44.51937508102329 - type: nauc_recall_at_100_max value: 25.75429602376942 - type: nauc_recall_at_100_std value: -33.30783195688129 - type: nauc_recall_at_10_diff1 value: -18.776401920240275 - type: nauc_recall_at_10_max value: 23.00791681188562 - type: nauc_recall_at_10_std value: -21.576198296256532 - type: nauc_recall_at_1_diff1 value: -13.741863044507388 - type: nauc_recall_at_1_max value: 88.80266056441289 - type: nauc_recall_at_1_std value: -52.44805080502242 - type: nauc_recall_at_20_diff1 value: -3.8724115673803343 - type: nauc_recall_at_20_max value: 21.50124528790692 - type: nauc_recall_at_20_std value: -1.6719812367243132 - type: nauc_recall_at_3_diff1 value: -20.21079163108882 - type: nauc_recall_at_3_max value: 42.152167178196684 - type: nauc_recall_at_3_std value: -36.258746145318526 - type: nauc_recall_at_5_diff1 value: -22.10269915203519 - type: nauc_recall_at_5_max value: 43.30767031613079 - type: nauc_recall_at_5_std value: -27.398704255640478 - type: ndcg_at_1 value: 6.0 - type: ndcg_at_10 value: 7.432999999999999 - type: ndcg_at_100 value: 26.354 - type: ndcg_at_1000 value: 30.558000000000003 - type: ndcg_at_20 value: 11.143 - type: ndcg_at_3 value: 7.979 - type: ndcg_at_5 value: 6.81 - type: precision_at_1 value: 6.0 - type: precision_at_10 value: 4.2 - type: precision_at_100 value: 3.1199999999999997 - type: precision_at_1000 value: 0.38999999999999996 - type: precision_at_20 value: 4.2 - type: precision_at_3 value: 8.0 - type: precision_at_5 value: 5.6000000000000005 - type: recall_at_1 value: 0.604 - type: recall_at_10 value: 9.678 - type: recall_at_100 value: 78.645 - type: recall_at_1000 value: 100.0 - type: recall_at_20 value: 20.79 - type: recall_at_3 value: 4.261 - type: recall_at_5 value: 5.011 task: type: Retrieval - dataset: config: default name: MTEB AILAStatutes (default) revision: ebfcd844eadd3d667efa3c57fc5c8c87f5c2867e split: test type: mteb/AILA_statutes metrics: - type: main_score value: 13.624 - type: map_at_1 value: 1.7999999999999998 - type: map_at_10 value: 6.41 - type: map_at_100 value: 11.995000000000001 - type: map_at_1000 value: 11.995000000000001 - type: map_at_20 value: 7.33 - type: map_at_3 value: 4.089 - type: map_at_5 value: 5.192 - type: mrr_at_1 value: 8.0 - type: mrr_at_10 value: 20.935714285714287 - type: mrr_at_100 value: 23.02755974294914 - type: mrr_at_1000 value: 23.02755974294914 - type: mrr_at_20 value: 22.1038126476207 - type: mrr_at_3 value: 15.333333333333332 - type: mrr_at_5 value: 19.533333333333335 - type: nauc_map_at_1000_diff1 value: 5.278882422253006 - type: nauc_map_at_1000_max value: 3.7333073133608896 - type: nauc_map_at_1000_std value: -4.5637189871999775 - type: nauc_map_at_100_diff1 value: 5.278882422253006 - type: nauc_map_at_100_max value: 3.7333073133608896 - type: nauc_map_at_100_std value: -4.5637189871999775 - type: nauc_map_at_10_diff1 value: 8.570212263630141 - type: nauc_map_at_10_max value: -6.6489980060039295 - type: nauc_map_at_10_std value: -12.162352126704402 - type: nauc_map_at_1_diff1 value: 7.476969859583216 - type: nauc_map_at_1_max value: -26.629997316876853 - type: nauc_map_at_1_std value: 
-23.469874489461308 - type: nauc_map_at_20_diff1 value: 7.222345063366828 - type: nauc_map_at_20_max value: -2.5103197323267223 - type: nauc_map_at_20_std value: -10.997015623527455 - type: nauc_map_at_3_diff1 value: 14.924734426277178 - type: nauc_map_at_3_max value: -11.92937537932614 - type: nauc_map_at_3_std value: -4.9319666083973255 - type: nauc_map_at_5_diff1 value: 8.080773945621521 - type: nauc_map_at_5_max value: -3.8175754142607836 - type: nauc_map_at_5_std value: -4.541639774033337 - type: nauc_mrr_at_1000_diff1 value: 2.4122089783406646 - type: nauc_mrr_at_1000_max value: -15.876004562207497 - type: nauc_mrr_at_1000_std value: -12.985028057822372 - type: nauc_mrr_at_100_diff1 value: 2.4122089783406646 - type: nauc_mrr_at_100_max value: -15.876004562207497 - type: nauc_mrr_at_100_std value: -12.985028057822372 - type: nauc_mrr_at_10_diff1 value: 0.2857311186354727 - type: nauc_mrr_at_10_max value: -14.63697545190418 - type: nauc_mrr_at_10_std value: -12.056570964159198 - type: nauc_mrr_at_1_diff1 value: 6.868795277703242 - type: nauc_mrr_at_1_max value: -24.845720418567222 - type: nauc_mrr_at_1_std value: -20.686879527770337 - type: nauc_mrr_at_20_diff1 value: 1.8452171261188577 - type: nauc_mrr_at_20_max value: -15.538023663956924 - type: nauc_mrr_at_20_std value: -13.690749771450164 - type: nauc_mrr_at_3_diff1 value: 10.557261573838256 - type: nauc_mrr_at_3_max value: -20.946427791765498 - type: nauc_mrr_at_3_std value: -9.815750025468983 - type: nauc_mrr_at_5_diff1 value: 4.101442020672411 - type: nauc_mrr_at_5_max value: -14.963605604722682 - type: nauc_mrr_at_5_std value: -9.917384084595511 - type: nauc_ndcg_at_1000_diff1 value: 0.04370368246080858 - type: nauc_ndcg_at_1000_max value: -0.818088536466922 - type: nauc_ndcg_at_1000_std value: -4.74569960455296 - type: nauc_ndcg_at_100_diff1 value: 0.04370368246080858 - type: nauc_ndcg_at_100_max value: -0.818088536466922 - type: nauc_ndcg_at_100_std value: -4.74569960455296 - type: nauc_ndcg_at_10_diff1 value: 1.2847289677534977 - type: nauc_ndcg_at_10_max value: -6.3756503900224955 - type: nauc_ndcg_at_10_std value: -12.98730478286347 - type: nauc_ndcg_at_1_diff1 value: 6.868795277703242 - type: nauc_ndcg_at_1_max value: -24.845720418567222 - type: nauc_ndcg_at_1_std value: -20.686879527770337 - type: nauc_ndcg_at_20_diff1 value: 0.777375339231765 - type: nauc_ndcg_at_20_max value: -0.9649148688381876 - type: nauc_ndcg_at_20_std value: -14.374528790697976 - type: nauc_ndcg_at_3_diff1 value: 11.34233767766492 - type: nauc_ndcg_at_3_max value: -13.185097340604685 - type: nauc_ndcg_at_3_std value: -1.42817114044502 - type: nauc_ndcg_at_5_diff1 value: 3.6861855424314394 - type: nauc_ndcg_at_5_max value: -3.8049446945965877 - type: nauc_ndcg_at_5_std value: -3.627047155464453 - type: nauc_precision_at_1000_diff1 value: -23.534146832293555 - type: nauc_precision_at_1000_max value: 7.621521743107654 - type: nauc_precision_at_1000_std value: 31.79231993560317 - type: nauc_precision_at_100_diff1 value: -23.534146832293136 - type: nauc_precision_at_100_max value: 7.6215217431077615 - type: nauc_precision_at_100_std value: 31.792319935603174 - type: nauc_precision_at_10_diff1 value: -9.295902835532825 - type: nauc_precision_at_10_max value: -3.516562838357381 - type: nauc_precision_at_10_std value: -9.542266229384722 - type: nauc_precision_at_1_diff1 value: 6.868795277703242 - type: nauc_precision_at_1_max value: -24.845720418567222 - type: nauc_precision_at_1_std value: -20.686879527770337 - type: nauc_precision_at_20_diff1 value: 
-9.74438544160727 - type: nauc_precision_at_20_max value: 8.895012105242024 - type: nauc_precision_at_20_std value: -10.653950589210957 - type: nauc_precision_at_3_diff1 value: 8.920936116382022 - type: nauc_precision_at_3_max value: -10.246679316888065 - type: nauc_precision_at_3_std value: 5.611638203668553 - type: nauc_precision_at_5_diff1 value: -8.265025821338345 - type: nauc_precision_at_5_max value: 7.359630809801093 - type: nauc_precision_at_5_std value: 7.003625975167535 - type: nauc_recall_at_1000_diff1 value: .nan - type: nauc_recall_at_1000_max value: .nan - type: nauc_recall_at_1000_std value: .nan - type: nauc_recall_at_100_diff1 value: .nan - type: nauc_recall_at_100_max value: .nan - type: nauc_recall_at_100_std value: .nan - type: nauc_recall_at_10_diff1 value: -1.798034642140945 - type: nauc_recall_at_10_max value: 0.6924952930762724 - type: nauc_recall_at_10_std value: -13.706398349868037 - type: nauc_recall_at_1_diff1 value: 7.476969859583216 - type: nauc_recall_at_1_max value: -26.629997316876853 - type: nauc_recall_at_1_std value: -23.469874489461308 - type: nauc_recall_at_20_diff1 value: -2.659819202817919 - type: nauc_recall_at_20_max value: 10.517274540935807 - type: nauc_recall_at_20_std value: -14.235421011543991 - type: nauc_recall_at_3_diff1 value: 15.662853297442803 - type: nauc_recall_at_3_max value: -11.663877606927189 - type: nauc_recall_at_3_std value: -2.341470241427359 - type: nauc_recall_at_5_diff1 value: 2.273326115596832 - type: nauc_recall_at_5_max value: 2.8669632025879537 - type: nauc_recall_at_5_std value: -0.3450165007891684 - type: ndcg_at_1 value: 8.0 - type: ndcg_at_10 value: 13.624 - type: ndcg_at_100 value: 38.109 - type: ndcg_at_1000 value: 38.109 - type: ndcg_at_20 value: 16.907 - type: ndcg_at_3 value: 9.45 - type: ndcg_at_5 value: 10.598 - type: precision_at_1 value: 8.0 - type: precision_at_10 value: 7.3999999999999995 - type: precision_at_100 value: 4.34 - type: precision_at_1000 value: 0.434 - type: precision_at_20 value: 5.5 - type: precision_at_3 value: 10.0 - type: precision_at_5 value: 10.0 - type: recall_at_1 value: 1.7999999999999998 - type: recall_at_10 value: 18.333 - type: recall_at_100 value: 100.0 - type: recall_at_1000 value: 100.0 - type: recall_at_20 value: 26.333000000000002 - type: recall_at_3 value: 7.867 - type: recall_at_5 value: 12.333 task: type: Retrieval - dataset: config: default name: MTEB ARCChallenge (default) revision: c481e0da3dcbbad8bce7721dea9085b74320a0a3 split: test type: RAR-b/ARC-Challenge metrics: - type: main_score value: 3.8449999999999998 - type: map_at_1 value: 1.536 - type: map_at_10 value: 2.902 - type: map_at_100 value: 3.2259999999999995 - type: map_at_1000 value: 3.309 - type: map_at_20 value: 3.061 - type: map_at_3 value: 2.204 - type: map_at_5 value: 2.656 - type: mrr_at_1 value: 1.5358361774744027 - type: mrr_at_10 value: 2.902107373097134 - type: mrr_at_100 value: 3.2259697277173585 - type: mrr_at_1000 value: 3.309141234079007 - type: mrr_at_20 value: 3.0608339226581975 - type: mrr_at_3 value: 2.204209328782707 - type: mrr_at_5 value: 2.6564277588168363 - type: nauc_map_at_1000_diff1 value: 6.6349335671175 - type: nauc_map_at_1000_max value: 10.045752081479547 - type: nauc_map_at_1000_std value: 5.17373675499246 - type: nauc_map_at_100_diff1 value: 6.6240618235225135 - type: nauc_map_at_100_max value: 10.244151375429777 - type: nauc_map_at_100_std value: 5.305639061848512 - type: nauc_map_at_10_diff1 value: 7.5024069352343 - type: nauc_map_at_10_max value: 11.928684625428838 - type: 
nauc_map_at_10_std value: 5.016380398843673 - type: nauc_map_at_1_diff1 value: 17.26912687174127 - type: nauc_map_at_1_max value: 6.265273970269121 - type: nauc_map_at_1_std value: -4.8796731336600825 - type: nauc_map_at_20_diff1 value: 7.120932496690847 - type: nauc_map_at_20_max value: 11.15762860873897 - type: nauc_map_at_20_std value: 5.342837705336892 - type: nauc_map_at_3_diff1 value: 7.138259469017607 - type: nauc_map_at_3_max value: 8.348409228816523 - type: nauc_map_at_3_std value: 6.767314043423357 - type: nauc_map_at_5_diff1 value: 7.239963996009633 - type: nauc_map_at_5_max value: 11.068225118567208 - type: nauc_map_at_5_std value: 5.0851302044955835 - type: nauc_mrr_at_1000_diff1 value: 6.6349335671175 - type: nauc_mrr_at_1000_max value: 10.045752081479547 - type: nauc_mrr_at_1000_std value: 5.17373675499246 - type: nauc_mrr_at_100_diff1 value: 6.6240618235225135 - type: nauc_mrr_at_100_max value: 10.244151375429777 - type: nauc_mrr_at_100_std value: 5.305639061848512 - type: nauc_mrr_at_10_diff1 value: 7.5024069352343 - type: nauc_mrr_at_10_max value: 11.928684625428838 - type: nauc_mrr_at_10_std value: 5.016380398843673 - type: nauc_mrr_at_1_diff1 value: 17.26912687174127 - type: nauc_mrr_at_1_max value: 6.265273970269121 - type: nauc_mrr_at_1_std value: -4.8796731336600825 - type: nauc_mrr_at_20_diff1 value: 7.120932496690847 - type: nauc_mrr_at_20_max value: 11.15762860873897 - type: nauc_mrr_at_20_std value: 5.342837705336892 - type: nauc_mrr_at_3_diff1 value: 7.138259469017607 - type: nauc_mrr_at_3_max value: 8.348409228816523 - type: nauc_mrr_at_3_std value: 6.767314043423357 - type: nauc_mrr_at_5_diff1 value: 7.239963996009633 - type: nauc_mrr_at_5_max value: 11.068225118567208 - type: nauc_mrr_at_5_std value: 5.0851302044955835 - type: nauc_ndcg_at_1000_diff1 value: 3.49547273108029 - type: nauc_ndcg_at_1000_max value: 4.987679792326471 - type: nauc_ndcg_at_1000_std value: 4.792386661474078 - type: nauc_ndcg_at_100_diff1 value: 3.423765430486521 - type: nauc_ndcg_at_100_max value: 7.215346434617728 - type: nauc_ndcg_at_100_std value: 6.1334416812657055 - type: nauc_ndcg_at_10_diff1 value: 6.211453661355799 - type: nauc_ndcg_at_10_max value: 13.686949611790244 - type: nauc_ndcg_at_10_std value: 5.334521959588366 - type: nauc_ndcg_at_1_diff1 value: 17.26912687174127 - type: nauc_ndcg_at_1_max value: 6.265273970269121 - type: nauc_ndcg_at_1_std value: -4.8796731336600825 - type: nauc_ndcg_at_20_diff1 value: 5.269692894653953 - type: nauc_ndcg_at_20_max value: 11.466483119515134 - type: nauc_ndcg_at_20_std value: 6.208531132010362 - type: nauc_ndcg_at_3_diff1 value: 4.841534563021528 - type: nauc_ndcg_at_3_max value: 8.715299190678648 - type: nauc_ndcg_at_3_std value: 8.889648909403514 - type: nauc_ndcg_at_5_diff1 value: 5.5149763431777385 - type: nauc_ndcg_at_5_max value: 12.41579830649011 - type: nauc_ndcg_at_5_std value: 5.8568738487427865 - type: nauc_precision_at_1000_diff1 value: 1.0890041942217588 - type: nauc_precision_at_1000_max value: -1.074889035912781 - type: nauc_precision_at_1000_std value: 3.7386321369399207 - type: nauc_precision_at_100_diff1 value: 0.24898034725209317 - type: nauc_precision_at_100_max value: 2.6625432444853345 - type: nauc_precision_at_100_std value: 6.760865885892171 - type: nauc_precision_at_10_diff1 value: 4.728605530960451 - type: nauc_precision_at_10_max value: 16.098011324014156 - type: nauc_precision_at_10_std value: 5.294918338481019 - type: nauc_precision_at_1_diff1 value: 17.26912687174127 - type: nauc_precision_at_1_max value: 
6.265273970269121 - type: nauc_precision_at_1_std value: -4.8796731336600825 - type: nauc_precision_at_20_diff1 value: 3.1605384012118063 - type: nauc_precision_at_20_max value: 11.228945826678288 - type: nauc_precision_at_20_std value: 7.0587619686895975 - type: nauc_precision_at_3_diff1 value: 0.15384889210192554 - type: nauc_precision_at_3_max value: 9.441612052649862 - type: nauc_precision_at_3_std value: 13.110663421557597 - type: nauc_precision_at_5_diff1 value: 2.9177590765544803 - type: nauc_precision_at_5_max value: 14.583883090410385 - type: nauc_precision_at_5_std value: 6.761154902844139 - type: nauc_recall_at_1000_diff1 value: 1.0890041942217838 - type: nauc_recall_at_1000_max value: -1.0748890359127414 - type: nauc_recall_at_1000_std value: 3.7386321369399447 - type: nauc_recall_at_100_diff1 value: 0.2489803472520955 - type: nauc_recall_at_100_max value: 2.6625432444853385 - type: nauc_recall_at_100_std value: 6.7608658858921835 - type: nauc_recall_at_10_diff1 value: 4.728605530960435 - type: nauc_recall_at_10_max value: 16.09801132401412 - type: nauc_recall_at_10_std value: 5.294918338481006 - type: nauc_recall_at_1_diff1 value: 17.26912687174127 - type: nauc_recall_at_1_max value: 6.265273970269121 - type: nauc_recall_at_1_std value: -4.8796731336600825 - type: nauc_recall_at_20_diff1 value: 3.1605384012117814 - type: nauc_recall_at_20_max value: 11.22894582667827 - type: nauc_recall_at_20_std value: 7.0587619686895655 - type: nauc_recall_at_3_diff1 value: 0.15384889210195152 - type: nauc_recall_at_3_max value: 9.441612052649868 - type: nauc_recall_at_3_std value: 13.110663421557629 - type: nauc_recall_at_5_diff1 value: 2.917759076554466 - type: nauc_recall_at_5_max value: 14.583883090410346 - type: nauc_recall_at_5_std value: 6.761154902844119 - type: ndcg_at_1 value: 1.536 - type: ndcg_at_10 value: 3.8449999999999998 - type: ndcg_at_100 value: 5.772 - type: ndcg_at_1000 value: 8.509 - type: ndcg_at_20 value: 4.426 - type: ndcg_at_3 value: 2.447 - type: ndcg_at_5 value: 3.258 - type: precision_at_1 value: 1.536 - type: precision_at_10 value: 0.6910000000000001 - type: precision_at_100 value: 0.168 - type: precision_at_1000 value: 0.04 - type: precision_at_20 value: 0.461 - type: precision_at_3 value: 1.052 - type: precision_at_5 value: 1.024 - type: recall_at_1 value: 1.536 - type: recall_at_10 value: 6.9110000000000005 - type: recall_at_100 value: 16.808999999999997 - type: recall_at_1000 value: 39.505 - type: recall_at_20 value: 9.215 - type: recall_at_3 value: 3.157 - type: recall_at_5 value: 5.119 task: type: Retrieval - dataset: config: default name: MTEB AlphaNLI (default) revision: 303f40ef3d50918d3dc43577d33f2f7344ad72c1 split: test type: RAR-b/alphanli metrics: - type: main_score value: 14.155000000000001 - type: map_at_1 value: 8.616 - type: map_at_10 value: 12.151 - type: map_at_100 value: 12.713 - type: map_at_1000 value: 12.790000000000001 - type: map_at_20 value: 12.478 - type: map_at_3 value: 10.955 - type: map_at_5 value: 11.68 - type: mrr_at_1 value: 8.616187989556137 - type: mrr_at_10 value: 12.151197728873969 - type: mrr_at_100 value: 12.713435989405935 - type: mrr_at_1000 value: 12.789534083463522 - type: mrr_at_20 value: 12.478389119397455 - type: mrr_at_3 value: 10.955178416013926 - type: mrr_at_5 value: 11.679721496953876 - type: nauc_map_at_1000_diff1 value: 38.986525912703435 - type: nauc_map_at_1000_max value: 12.219692225747707 - type: nauc_map_at_1000_std value: 1.2585343212684903 - type: nauc_map_at_100_diff1 value: 39.02868722054371 - type: 
nauc_map_at_100_max value: 12.248003227250122 - type: nauc_map_at_100_std value: 1.2163208553030314 - type: nauc_map_at_10_diff1 value: 40.110717683039525 - type: nauc_map_at_10_max value: 12.78605835422205 - type: nauc_map_at_10_std value: 0.6481692151906001 - type: nauc_map_at_1_diff1 value: 48.456097345786745 - type: nauc_map_at_1_max value: 14.981869102701411 - type: nauc_map_at_1_std value: -3.0707717911327226 - type: nauc_map_at_20_diff1 value: 39.42161381753684 - type: nauc_map_at_20_max value: 12.341429085851182 - type: nauc_map_at_20_std value: 0.8391480542456798 - type: nauc_map_at_3_diff1 value: 42.64699229741736 - type: nauc_map_at_3_max value: 13.681396294884618 - type: nauc_map_at_3_std value: -1.3518984290812146 - type: nauc_map_at_5_diff1 value: 41.32077190616691 - type: nauc_map_at_5_max value: 13.136429689834436 - type: nauc_map_at_5_std value: 0.32856286589434136 - type: nauc_mrr_at_1000_diff1 value: 38.98652591920884 - type: nauc_mrr_at_1000_max value: 12.219692104355413 - type: nauc_mrr_at_1000_std value: 1.2585339367622461 - type: nauc_mrr_at_100_diff1 value: 39.02868722054371 - type: nauc_mrr_at_100_max value: 12.248003227250122 - type: nauc_mrr_at_100_std value: 1.2163208553030314 - type: nauc_mrr_at_10_diff1 value: 40.110717683039525 - type: nauc_mrr_at_10_max value: 12.78605835422205 - type: nauc_mrr_at_10_std value: 0.6481692151906001 - type: nauc_mrr_at_1_diff1 value: 48.456097345786745 - type: nauc_mrr_at_1_max value: 14.981869102701411 - type: nauc_mrr_at_1_std value: -3.0707717911327226 - type: nauc_mrr_at_20_diff1 value: 39.42161381753684 - type: nauc_mrr_at_20_max value: 12.341429085851182 - type: nauc_mrr_at_20_std value: 0.8391480542456798 - type: nauc_mrr_at_3_diff1 value: 42.64699229741736 - type: nauc_mrr_at_3_max value: 13.681396294884618 - type: nauc_mrr_at_3_std value: -1.3518984290812146 - type: nauc_mrr_at_5_diff1 value: 41.32077190616691 - type: nauc_mrr_at_5_max value: 13.136429689834436 - type: nauc_mrr_at_5_std value: 0.32856286589434136 - type: nauc_ndcg_at_1000_diff1 value: 31.611075970442926 - type: nauc_ndcg_at_1000_max value: 9.936393145930218 - type: nauc_ndcg_at_1000_std value: 6.71067891152211 - type: nauc_ndcg_at_100_diff1 value: 32.58290081795884 - type: nauc_ndcg_at_100_max value: 9.842659588765363 - type: nauc_ndcg_at_100_std value: 5.498554329517975 - type: nauc_ndcg_at_10_diff1 value: 36.75293874754393 - type: nauc_ndcg_at_10_max value: 11.803286140726776 - type: nauc_ndcg_at_10_std value: 2.5976940855692074 - type: nauc_ndcg_at_1_diff1 value: 48.456097345786745 - type: nauc_ndcg_at_1_max value: 14.981869102701411 - type: nauc_ndcg_at_1_std value: -3.0707717911327226 - type: nauc_ndcg_at_20_diff1 value: 34.638144952713866 - type: nauc_ndcg_at_20_max value: 10.449640737261305 - type: nauc_ndcg_at_20_std value: 3.2195824007114675 - type: nauc_ndcg_at_3_diff1 value: 41.24511499401773 - type: nauc_ndcg_at_3_max value: 13.384003644595388 - type: nauc_ndcg_at_3_std value: -0.7628562047692254 - type: nauc_ndcg_at_5_diff1 value: 39.2155849544026 - type: nauc_ndcg_at_5_max value: 12.577199638671265 - type: nauc_ndcg_at_5_std value: 2.0185641778476127 - type: nauc_precision_at_1000_diff1 value: 11.879578040836442 - type: nauc_precision_at_1000_max value: 5.358855936542234 - type: nauc_precision_at_1000_std value: 23.471172109373907 - type: nauc_precision_at_100_diff1 value: 18.24569021314919 - type: nauc_precision_at_100_max value: 4.309548949123852 - type: nauc_precision_at_100_std value: 15.884619703445772 - type: 
nauc_precision_at_10_diff1 value: 29.512994402519226 - type: nauc_precision_at_10_max value: 9.634695132770453 - type: nauc_precision_at_10_std value: 6.795536654948908 - type: nauc_precision_at_1_diff1 value: 48.456097345786745 - type: nauc_precision_at_1_max value: 14.981869102701411 - type: nauc_precision_at_1_std value: -3.0707717911327226 - type: nauc_precision_at_20_diff1 value: 24.18871405534599 - type: nauc_precision_at_20_max value: 6.090279031407053 - type: nauc_precision_at_20_std value: 8.291882200513058 - type: nauc_precision_at_3_diff1 value: 37.926451300682054 - type: nauc_precision_at_3_max value: 12.684618853985219 - type: nauc_precision_at_3_std value: 0.6806740647349011 - type: nauc_precision_at_5_diff1 value: 34.550519136938384 - type: nauc_precision_at_5_max value: 11.344674575354038 - type: nauc_precision_at_5_std value: 5.985578706127787 - type: nauc_recall_at_1000_diff1 value: 11.879578040836519 - type: nauc_recall_at_1000_max value: 5.358855936542304 - type: nauc_recall_at_1000_std value: 23.47117210937398 - type: nauc_recall_at_100_diff1 value: 18.245690213149167 - type: nauc_recall_at_100_max value: 4.3095489491238155 - type: nauc_recall_at_100_std value: 15.88461970344576 - type: nauc_recall_at_10_diff1 value: 29.512994402519215 - type: nauc_recall_at_10_max value: 9.634695132770442 - type: nauc_recall_at_10_std value: 6.795536654948889 - type: nauc_recall_at_1_diff1 value: 48.456097345786745 - type: nauc_recall_at_1_max value: 14.981869102701411 - type: nauc_recall_at_1_std value: -3.0707717911327226 - type: nauc_recall_at_20_diff1 value: 24.188714055346 - type: nauc_recall_at_20_max value: 6.09027903140705 - type: nauc_recall_at_20_std value: 8.291882200513056 - type: nauc_recall_at_3_diff1 value: 37.92645130068206 - type: nauc_recall_at_3_max value: 12.684618853985235 - type: nauc_recall_at_3_std value: 0.6806740647349308 - type: nauc_recall_at_5_diff1 value: 34.55051913693838 - type: nauc_recall_at_5_max value: 11.344674575354015 - type: nauc_recall_at_5_std value: 5.985578706127789 - type: ndcg_at_1 value: 8.616 - type: ndcg_at_10 value: 14.155000000000001 - type: ndcg_at_100 value: 17.102 - type: ndcg_at_1000 value: 19.631 - type: ndcg_at_20 value: 15.344 - type: ndcg_at_3 value: 11.728 - type: ndcg_at_5 value: 13.025999999999998 - type: precision_at_1 value: 8.616 - type: precision_at_10 value: 2.056 - type: precision_at_100 value: 0.349 - type: precision_at_1000 value: 0.055999999999999994 - type: precision_at_20 value: 1.2630000000000001 - type: precision_at_3 value: 4.656 - type: precision_at_5 value: 3.42 - type: recall_at_1 value: 8.616 - type: recall_at_10 value: 20.561 - type: recall_at_100 value: 34.855999999999995 - type: recall_at_1000 value: 55.875 - type: recall_at_20 value: 25.261 - type: recall_at_3 value: 13.969000000000001 - type: recall_at_5 value: 17.102 task: type: Retrieval - dataset: config: default name: MTEB AmazonPolarityClassification (default) revision: e2d317d38cd51312af73b3d32a06d1a08b442046 split: test type: mteb/amazon_polarity metrics: - type: accuracy value: 68.359575 - type: ap value: 63.04430514461716 - type: ap_weighted value: 63.04430514461716 - type: f1 value: 68.12645282836293 - type: f1_weighted value: 68.12645282836293 - type: main_score value: 68.359575 task: type: Classification - dataset: config: default name: MTEB ArguAna (default) revision: c22ab2a51041ffd869aaddef7af8d8215647e41a split: test type: mteb/arguana metrics: - type: main_score value: 32.031 - type: map_at_1 value: 15.363 - type: map_at_10 value: 
25.629999999999995 - type: map_at_100 value: 26.851999999999997 - type: map_at_1000 value: 26.916 - type: map_at_20 value: 26.401999999999997 - type: map_at_3 value: 21.764 - type: map_at_5 value: 23.798 - type: mrr_at_1 value: 15.647226173541965 - type: mrr_at_10 value: 25.74270699270699 - type: mrr_at_100 value: 26.95759156481371 - type: mrr_at_1000 value: 27.02192945787223 - type: mrr_at_20 value: 26.50752832488611 - type: mrr_at_3 value: 21.894262683736372 - type: mrr_at_5 value: 23.889284020862938 - type: nauc_map_at_1000_diff1 value: 9.717094498857836 - type: nauc_map_at_1000_max value: 0.006128824635771366 - type: nauc_map_at_1000_std value: 9.951724867994008 - type: nauc_map_at_100_diff1 value: 9.720746167116648 - type: nauc_map_at_100_max value: 0.03921480687966482 - type: nauc_map_at_100_std value: 10.01422840642898 - type: nauc_map_at_10_diff1 value: 9.629884802439925 - type: nauc_map_at_10_max value: -0.18895622006721804 - type: nauc_map_at_10_std value: 8.801754758016564 - type: nauc_map_at_1_diff1 value: 10.255415606776134 - type: nauc_map_at_1_max value: -2.7429221309654044 - type: nauc_map_at_1_std value: 6.866297123270523 - type: nauc_map_at_20_diff1 value: 9.707948736975794 - type: nauc_map_at_20_max value: 0.01892213753638095 - type: nauc_map_at_20_std value: 9.681790764357237 - type: nauc_map_at_3_diff1 value: 8.344213156710568 - type: nauc_map_at_3_max value: -2.0132121856529483 - type: nauc_map_at_3_std value: 8.554071405515435 - type: nauc_map_at_5_diff1 value: 9.14495583661473 - type: nauc_map_at_5_max value: -1.379873148644914 - type: nauc_map_at_5_std value: 9.044652095982553 - type: nauc_mrr_at_1000_diff1 value: 8.520276824384093 - type: nauc_mrr_at_1000_max value: -0.41053299382643904 - type: nauc_mrr_at_1000_std value: 9.770616411797125 - type: nauc_mrr_at_100_diff1 value: 8.526357726757498 - type: nauc_mrr_at_100_max value: -0.37675957362198204 - type: nauc_mrr_at_100_std value: 9.833172972935825 - type: nauc_mrr_at_10_diff1 value: 8.504469942302443 - type: nauc_mrr_at_10_max value: -0.5555290478828475 - type: nauc_mrr_at_10_std value: 8.67347986151777 - type: nauc_mrr_at_1_diff1 value: 8.924965691375194 - type: nauc_mrr_at_1_max value: -2.472212128016505 - type: nauc_mrr_at_1_std value: 6.727737069169365 - type: nauc_mrr_at_20_diff1 value: 8.527008337552795 - type: nauc_mrr_at_20_max value: -0.39130673567011953 - type: nauc_mrr_at_20_std value: 9.504234612175194 - type: nauc_mrr_at_3_diff1 value: 7.028185998793612 - type: nauc_mrr_at_3_max value: -2.531551924396665 - type: nauc_mrr_at_3_std value: 8.36654956798548 - type: nauc_mrr_at_5_diff1 value: 7.946200662893088 - type: nauc_mrr_at_5_max value: -1.8450232157342275 - type: nauc_mrr_at_5_std value: 8.855536533297968 - type: nauc_ndcg_at_1000_diff1 value: 10.148046270962398 - type: nauc_ndcg_at_1000_max value: 1.696424601847897 - type: nauc_ndcg_at_1000_std value: 13.134595506556405 - type: nauc_ndcg_at_100_diff1 value: 10.478061817612778 - type: nauc_ndcg_at_100_max value: 2.790758084465661 - type: nauc_ndcg_at_100_std value: 14.964733623242607 - type: nauc_ndcg_at_10_diff1 value: 10.372927964606154 - type: nauc_ndcg_at_10_max value: 1.9588405301435734 - type: nauc_ndcg_at_10_std value: 9.558148538160015 - type: nauc_ndcg_at_1_diff1 value: 10.255415606776134 - type: nauc_ndcg_at_1_max value: -2.7429221309654044 - type: nauc_ndcg_at_1_std value: 6.866297123270523 - type: nauc_ndcg_at_20_diff1 value: 10.807055510827903 - type: nauc_ndcg_at_20_max value: 2.873981784514884 - type: nauc_ndcg_at_20_std value: 
12.684265114648849 - type: nauc_ndcg_at_3_diff1 value: 7.99043332908002 - type: nauc_ndcg_at_3_max value: -1.7537467389545258 - type: nauc_ndcg_at_3_std value: 9.282365459725794 - type: nauc_ndcg_at_5_diff1 value: 9.291919447241343 - type: nauc_ndcg_at_5_max value: -0.6986840661830845 - type: nauc_ndcg_at_5_std value: 10.155119795280289 - type: nauc_precision_at_1000_diff1 value: 5.534567864242971 - type: nauc_precision_at_1000_max value: 9.529106078051697 - type: nauc_precision_at_1000_std value: 62.0873447350283 - type: nauc_precision_at_100_diff1 value: 13.636774071684679 - type: nauc_precision_at_100_max value: 17.905397264353912 - type: nauc_precision_at_100_std value: 49.22170039944941 - type: nauc_precision_at_10_diff1 value: 12.676219389202528 - type: nauc_precision_at_10_max value: 8.164707652448252 - type: nauc_precision_at_10_std value: 11.361740427515855 - type: nauc_precision_at_1_diff1 value: 10.255415606776134 - type: nauc_precision_at_1_max value: -2.7429221309654044 - type: nauc_precision_at_1_std value: 6.866297123270523 - type: nauc_precision_at_20_diff1 value: 15.006293628353006 - type: nauc_precision_at_20_max value: 12.931321039045368 - type: nauc_precision_at_20_std value: 23.758750045585586 - type: nauc_precision_at_3_diff1 value: 7.18325478518931 - type: nauc_precision_at_3_max value: -1.1161637595134446 - type: nauc_precision_at_3_std value: 11.09645301286272 - type: nauc_precision_at_5_diff1 value: 9.780765614595015 - type: nauc_precision_at_5_max value: 1.0082157901430149 - type: nauc_precision_at_5_std value: 12.92929121494741 - type: nauc_recall_at_1000_diff1 value: 5.534567864242688 - type: nauc_recall_at_1000_max value: 9.529106078051411 - type: nauc_recall_at_1000_std value: 62.08734473502826 - type: nauc_recall_at_100_diff1 value: 13.63677407168474 - type: nauc_recall_at_100_max value: 17.905397264353898 - type: nauc_recall_at_100_std value: 49.2217003994493 - type: nauc_recall_at_10_diff1 value: 12.676219389202512 - type: nauc_recall_at_10_max value: 8.164707652448225 - type: nauc_recall_at_10_std value: 11.361740427515835 - type: nauc_recall_at_1_diff1 value: 10.255415606776134 - type: nauc_recall_at_1_max value: -2.7429221309654044 - type: nauc_recall_at_1_std value: 6.866297123270523 - type: nauc_recall_at_20_diff1 value: 15.006293628353069 - type: nauc_recall_at_20_max value: 12.931321039045434 - type: nauc_recall_at_20_std value: 23.75875004558557 - type: nauc_recall_at_3_diff1 value: 7.183254785189315 - type: nauc_recall_at_3_max value: -1.1161637595134306 - type: nauc_recall_at_3_std value: 11.096453012862733 - type: nauc_recall_at_5_diff1 value: 9.780765614595012 - type: nauc_recall_at_5_max value: 1.008215790143006 - type: nauc_recall_at_5_std value: 12.929291214947403 - type: ndcg_at_1 value: 15.363 - type: ndcg_at_10 value: 32.031 - type: ndcg_at_100 value: 38.122 - type: ndcg_at_1000 value: 39.864 - type: ndcg_at_20 value: 34.849999999999994 - type: ndcg_at_3 value: 23.965 - type: ndcg_at_5 value: 27.659 - type: precision_at_1 value: 15.363 - type: precision_at_10 value: 5.277 - type: precision_at_100 value: 0.8170000000000001 - type: precision_at_1000 value: 0.095 - type: precision_at_20 value: 3.197 - type: precision_at_3 value: 10.123 - type: precision_at_5 value: 7.881 - type: recall_at_1 value: 15.363 - type: recall_at_10 value: 52.774 - type: recall_at_100 value: 81.65 - type: recall_at_1000 value: 95.448 - type: recall_at_20 value: 63.94 - type: recall_at_3 value: 30.37 - type: recall_at_5 value: 39.403 task: type: Retrieval - dataset: 
config: default name: MTEB ArxivClassification (default) revision: f9bd92144ed76200d6eb3ce73a8bd4eba9ffdc85 split: test type: ccdv/arxiv-classification metrics: - type: accuracy value: 43.611999999999995 - type: f1 value: 40.930383763906484 - type: f1_weighted value: 41.404367816744276 - type: main_score value: 43.611999999999995 task: type: Classification - dataset: config: default name: MTEB ArxivClusteringP2P (default) revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d split: test type: mteb/arxiv-clustering-p2p metrics: - type: main_score value: 24.827354215343842 - type: v_measure value: 24.827354215343842 - type: v_measure_std value: 14.761042346861815 task: type: Clustering - dataset: config: default name: MTEB ArxivClusteringP2P.v2 (default) revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d split: test type: mteb/arxiv-clustering-p2p metrics: - type: main_score value: 29.14326814807588 - type: v_measure value: 29.14326814807588 - type: v_measure_std value: 16.354623518770328 task: type: Clustering - dataset: config: default name: MTEB ArxivClusteringS2S (default) revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 split: test type: mteb/arxiv-clustering-s2s metrics: - type: main_score value: 16.681456170594032 - type: v_measure value: 16.681456170594032 - type: v_measure_std value: 15.806408628434077 task: type: Clustering - dataset: config: default name: MTEB Banking77Classification (default) revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 split: test type: mteb/banking77 metrics: - type: accuracy value: 59.86363636363635 - type: f1 value: 58.3300719763065 - type: f1_weighted value: 58.3300719763065 - type: main_score value: 59.86363636363635 task: type: Classification - dataset: config: default name: MTEB BigPatentClustering (default) revision: 62d5330920bca426ce9d3c76ea914f15fc83e891 split: test type: jinaai/big-patent-clustering metrics: - type: main_score value: 17.208517091148714 - type: v_measure value: 17.208517091148714 - type: v_measure_std value: 0.698644666463382 task: type: Clustering - dataset: config: default name: MTEB BiorxivClusteringP2P (default) revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 split: test type: mteb/biorxiv-clustering-p2p metrics: - type: main_score value: 19.998032819841395 - type: v_measure value: 19.998032819841395 - type: v_measure_std value: 0.7272995954630507 task: type: Clustering - dataset: config: default name: MTEB BiorxivClusteringS2S (default) revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 split: test type: mteb/biorxiv-clustering-s2s metrics: - type: main_score value: 12.672050490076508 - type: v_measure value: 12.672050490076508 - type: v_measure_std value: 0.7252965151579489 task: type: Clustering - dataset: config: default name: MTEB CEDRClassification (default) revision: c0ba03d058e3e1b2f3fd20518875a4563dd12db4 split: test type: ai-forever/cedr-classification metrics: - type: accuracy value: 38.95324123273113 - type: f1 value: 30.695742042129776 - type: lrap value: 64.53134962805646 - type: main_score value: 38.95324123273113 task: type: MultilabelClassification - dataset: config: default name: MTEB CPUSpeedTask (default) revision: '1.0' split: test type: 'CPUSpeedTask' metrics: - type: avg_words_per_sec value: 1171249.8059068616 - type: main_score value: 1171249.8059068616 - type: physical_cores value: 3600 - type: time_mean value: 31.018148149762837 - type: time_std value: 10.887230129351211 - type: total_cores value: 7200 task: type: Speed - dataset: config: default name: MTEB CQADupstackAndroidRetrieval (default) 
revision: f46a197baaae43b4f621051089b82a364682dfeb split: test type: mteb/cqadupstack-android metrics: - type: main_score value: 27.686 - type: map_at_1 value: 17.864 - type: map_at_10 value: 23.842 - type: map_at_100 value: 24.648999999999997 - type: map_at_1000 value: 24.771 - type: map_at_20 value: 24.277 - type: map_at_3 value: 21.938 - type: map_at_5 value: 23.058999999999997 - type: mrr_at_1 value: 21.888412017167383 - type: mrr_at_10 value: 27.934691282330764 - type: mrr_at_100 value: 28.58815942555481 - type: mrr_at_1000 value: 28.669575168001604 - type: mrr_at_20 value: 28.259041893075693 - type: mrr_at_3 value: 25.96566523605151 - type: mrr_at_5 value: 27.145922746781114 - type: nauc_map_at_1000_diff1 value: 38.9362657863528 - type: nauc_map_at_1000_max value: 26.39064664437522 - type: nauc_map_at_1000_std value: -0.3507878980807277 - type: nauc_map_at_100_diff1 value: 38.9305380779697 - type: nauc_map_at_100_max value: 26.37667481671251 - type: nauc_map_at_100_std value: -0.4107785241043359 - type: nauc_map_at_10_diff1 value: 38.90352635552967 - type: nauc_map_at_10_max value: 26.04843561328241 - type: nauc_map_at_10_std value: -1.0213929777227249 - type: nauc_map_at_1_diff1 value: 44.891250111700664 - type: nauc_map_at_1_max value: 27.415379429330695 - type: nauc_map_at_1_std value: -2.083016588225919 - type: nauc_map_at_20_diff1 value: 38.94728598104626 - type: nauc_map_at_20_max value: 26.321985371933916 - type: nauc_map_at_20_std value: -0.6740389120283213 - type: nauc_map_at_3_diff1 value: 40.75408309900131 - type: nauc_map_at_3_max value: 26.81466083992981 - type: nauc_map_at_3_std value: -1.3446416472047542 - type: nauc_map_at_5_diff1 value: 39.55391899732806 - type: nauc_map_at_5_max value: 26.73952942989369 - type: nauc_map_at_5_std value: -0.9241166864360354 - type: nauc_mrr_at_1000_diff1 value: 37.49322259212407 - type: nauc_mrr_at_1000_max value: 26.791861376982645 - type: nauc_mrr_at_1000_std value: -0.12058632966589165 - type: nauc_mrr_at_100_diff1 value: 37.47912707778518 - type: nauc_mrr_at_100_max value: 26.780040228801354 - type: nauc_mrr_at_100_std value: -0.13375233513915044 - type: nauc_mrr_at_10_diff1 value: 37.44982182358103 - type: nauc_mrr_at_10_max value: 26.579194370161574 - type: nauc_mrr_at_10_std value: -0.5519796223426987 - type: nauc_mrr_at_1_diff1 value: 43.78241372037574 - type: nauc_mrr_at_1_max value: 29.62575208874629 - type: nauc_mrr_at_1_std value: -0.7403872780711277 - type: nauc_mrr_at_20_diff1 value: 37.413002156119 - type: nauc_mrr_at_20_max value: 26.71157844066263 - type: nauc_mrr_at_20_std value: -0.3418018168926074 - type: nauc_mrr_at_3_diff1 value: 39.36718212836755 - type: nauc_mrr_at_3_max value: 27.755919798148643 - type: nauc_mrr_at_3_std value: -0.5118015715447669 - type: nauc_mrr_at_5_diff1 value: 38.108343388995614 - type: nauc_mrr_at_5_max value: 27.255156457755536 - type: nauc_mrr_at_5_std value: -0.33152296202161974 - type: nauc_ndcg_at_1000_diff1 value: 35.45874849790142 - type: nauc_ndcg_at_1000_max value: 26.06624958789977 - type: nauc_ndcg_at_1000_std value: 2.8510315350747746 - type: nauc_ndcg_at_100_diff1 value: 35.22563491603818 - type: nauc_ndcg_at_100_max value: 25.482125642505167 - type: nauc_ndcg_at_100_std value: 1.7230614371120136 - type: nauc_ndcg_at_10_diff1 value: 35.442027092978336 - type: nauc_ndcg_at_10_max value: 24.43872310681677 - type: nauc_ndcg_at_10_std value: -0.8836727526012238 - type: nauc_ndcg_at_1_diff1 value: 43.78241372037574 - type: nauc_ndcg_at_1_max value: 29.62575208874629 - type: 
nauc_ndcg_at_1_std value: -0.7403872780711277 - type: nauc_ndcg_at_20_diff1 value: 35.532620958116226 - type: nauc_ndcg_at_20_max value: 24.9995407161472 - type: nauc_ndcg_at_20_std value: 0.09407090543637946 - type: nauc_ndcg_at_3_diff1 value: 38.771875097129474 - type: nauc_ndcg_at_3_max value: 26.88398760762366 - type: nauc_ndcg_at_3_std value: -0.7925347887124169 - type: nauc_ndcg_at_5_diff1 value: 36.83295698854961 - type: nauc_ndcg_at_5_max value: 26.254070953306602 - type: nauc_ndcg_at_5_std value: -0.5384138224839687 - type: nauc_precision_at_1000_diff1 value: 3.830797202509721 - type: nauc_precision_at_1000_max value: 11.845342201460761 - type: nauc_precision_at_1000_std value: 9.148785863457954 - type: nauc_precision_at_100_diff1 value: 13.997075774954821 - type: nauc_precision_at_100_max value: 21.8795221100872 - type: nauc_precision_at_100_std value: 8.373324931296871 - type: nauc_precision_at_10_diff1 value: 22.14226604167402 - type: nauc_precision_at_10_max value: 21.908333662820144 - type: nauc_precision_at_10_std value: 2.023219601124639 - type: nauc_precision_at_1_diff1 value: 43.78241372037574 - type: nauc_precision_at_1_max value: 29.62575208874629 - type: nauc_precision_at_1_std value: -0.7403872780711277 - type: nauc_precision_at_20_diff1 value: 20.193510781013575 - type: nauc_precision_at_20_max value: 21.47063363375231 - type: nauc_precision_at_20_std value: 5.073093391207243 - type: nauc_precision_at_3_diff1 value: 33.320150724486965 - type: nauc_precision_at_3_max value: 28.42063777288856 - type: nauc_precision_at_3_std value: 1.3535730617388522 - type: nauc_precision_at_5_diff1 value: 26.972979755151126 - type: nauc_precision_at_5_max value: 27.35114981308005 - type: nauc_precision_at_5_std value: 1.5457768965552783 - type: nauc_recall_at_1000_diff1 value: 19.86231350512352 - type: nauc_recall_at_1000_max value: 24.527676453832008 - type: nauc_recall_at_1000_std value: 22.21772883429467 - type: nauc_recall_at_100_diff1 value: 23.132801377646004 - type: nauc_recall_at_100_max value: 20.988835029134467 - type: nauc_recall_at_100_std value: 8.793975445583824 - type: nauc_recall_at_10_diff1 value: 25.796766681233457 - type: nauc_recall_at_10_max value: 17.634361086885264 - type: nauc_recall_at_10_std value: -0.4776257668185774 - type: nauc_recall_at_1_diff1 value: 44.891250111700664 - type: nauc_recall_at_1_max value: 27.415379429330695 - type: nauc_recall_at_1_std value: -2.083016588225919 - type: nauc_recall_at_20_diff1 value: 25.714655008602115 - type: nauc_recall_at_20_max value: 19.791963050086874 - type: nauc_recall_at_20_std value: 1.9596491600238453 - type: nauc_recall_at_3_diff1 value: 34.63094367351514 - type: nauc_recall_at_3_max value: 23.49028309758934 - type: nauc_recall_at_3_std value: -0.8832533681499335 - type: nauc_recall_at_5_diff1 value: 30.296413916201175 - type: nauc_recall_at_5_max value: 22.27559868081795 - type: nauc_recall_at_5_std value: 0.7320693658757037 - type: ndcg_at_1 value: 21.887999999999998 - type: ndcg_at_10 value: 27.686 - type: ndcg_at_100 value: 31.363999999999997 - type: ndcg_at_1000 value: 34.605000000000004 - type: ndcg_at_20 value: 28.93 - type: ndcg_at_3 value: 24.576999999999998 - type: ndcg_at_5 value: 26.144000000000002 - type: precision_at_1 value: 21.887999999999998 - type: precision_at_10 value: 5.0360000000000005 - type: precision_at_100 value: 0.828 - type: precision_at_1000 value: 0.135 - type: precision_at_20 value: 2.9690000000000003 - type: precision_at_3 value: 11.445 - type: precision_at_5 value: 8.269 - type: 
recall_at_1 value: 17.864 - type: recall_at_10 value: 34.977999999999994 - type: recall_at_100 value: 51.366 - type: recall_at_1000 value: 74.505 - type: recall_at_20 value: 39.587 - type: recall_at_3 value: 25.856 - type: recall_at_5 value: 30.215999999999998 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackEnglishRetrieval (default) revision: ad9991cb51e31e31e430383c75ffb2885547b5f0 split: test type: mteb/cqadupstack-english metrics: - type: main_score value: 17.534 - type: map_at_1 value: 11.354000000000001 - type: map_at_10 value: 14.847 - type: map_at_100 value: 15.49 - type: map_at_1000 value: 15.588 - type: map_at_20 value: 15.17 - type: map_at_3 value: 13.501 - type: map_at_5 value: 14.221 - type: mrr_at_1 value: 14.26751592356688 - type: mrr_at_10 value: 18.05727428975836 - type: mrr_at_100 value: 18.690847238016758 - type: mrr_at_1000 value: 18.764726106731445 - type: mrr_at_20 value: 18.395670843598797 - type: mrr_at_3 value: 16.64543524416137 - type: mrr_at_5 value: 17.333333333333336 - type: nauc_map_at_1000_diff1 value: 43.301676769305494 - type: nauc_map_at_1000_max value: 16.06805541449501 - type: nauc_map_at_1000_std value: 12.507510564248166 - type: nauc_map_at_100_diff1 value: 43.34383366787733 - type: nauc_map_at_100_max value: 16.049871088358675 - type: nauc_map_at_100_std value: 12.45712935804974 - type: nauc_map_at_10_diff1 value: 43.688675805930785 - type: nauc_map_at_10_max value: 16.41613903348705 - type: nauc_map_at_10_std value: 12.219643122219239 - type: nauc_map_at_1_diff1 value: 50.609096395200005 - type: nauc_map_at_1_max value: 18.78413464500168 - type: nauc_map_at_1_std value: 10.90744028944332 - type: nauc_map_at_20_diff1 value: 43.49084704145287 - type: nauc_map_at_20_max value: 16.182371186268703 - type: nauc_map_at_20_std value: 12.299197289134225 - type: nauc_map_at_3_diff1 value: 45.751823982563266 - type: nauc_map_at_3_max value: 17.192711563068457 - type: nauc_map_at_3_std value: 11.16466159721384 - type: nauc_map_at_5_diff1 value: 44.53444696379338 - type: nauc_map_at_5_max value: 16.559164547974103 - type: nauc_map_at_5_std value: 11.928445405766698 - type: nauc_mrr_at_1000_diff1 value: 42.29550571785051 - type: nauc_mrr_at_1000_max value: 15.642122643175679 - type: nauc_mrr_at_1000_std value: 12.21491820640565 - type: nauc_mrr_at_100_diff1 value: 42.301744065140404 - type: nauc_mrr_at_100_max value: 15.61733477074953 - type: nauc_mrr_at_100_std value: 12.181221737579532 - type: nauc_mrr_at_10_diff1 value: 42.670586100296646 - type: nauc_mrr_at_10_max value: 15.926109333510835 - type: nauc_mrr_at_10_std value: 12.192068681943583 - type: nauc_mrr_at_1_diff1 value: 51.89198697276755 - type: nauc_mrr_at_1_max value: 19.325504911863643 - type: nauc_mrr_at_1_std value: 12.282190963023766 - type: nauc_mrr_at_20_diff1 value: 42.39065015069134 - type: nauc_mrr_at_20_max value: 15.693533741719229 - type: nauc_mrr_at_20_std value: 12.145452140370937 - type: nauc_mrr_at_3_diff1 value: 44.715851634047944 - type: nauc_mrr_at_3_max value: 16.790849616314052 - type: nauc_mrr_at_3_std value: 12.056098541376208 - type: nauc_mrr_at_5_diff1 value: 43.87033674228477 - type: nauc_mrr_at_5_max value: 16.270118452872623 - type: nauc_mrr_at_5_std value: 12.268005300025886 - type: nauc_ndcg_at_1000_diff1 value: 38.01640412131576 - type: nauc_ndcg_at_1000_max value: 14.409491835566401 - type: nauc_ndcg_at_1000_std value: 14.292607075384597 - type: nauc_ndcg_at_100_diff1 value: 38.57310899261012 - type: nauc_ndcg_at_100_max value: 13.847832990597306 - 
type: nauc_ndcg_at_100_std value: 13.318671226615844 - type: nauc_ndcg_at_10_diff1 value: 40.02384031953078 - type: nauc_ndcg_at_10_max value: 15.18313865997875 - type: nauc_ndcg_at_10_std value: 12.662598128357672 - type: nauc_ndcg_at_1_diff1 value: 51.89198697276755 - type: nauc_ndcg_at_1_max value: 19.325504911863643 - type: nauc_ndcg_at_1_std value: 12.282190963023766 - type: nauc_ndcg_at_20_diff1 value: 39.357302335202725 - type: nauc_ndcg_at_20_max value: 14.497857343754966 - type: nauc_ndcg_at_20_std value: 12.630113736826498 - type: nauc_ndcg_at_3_diff1 value: 43.58418967840297 - type: nauc_ndcg_at_3_max value: 16.597491536723943 - type: nauc_ndcg_at_3_std value: 11.650784883274328 - type: nauc_ndcg_at_5_diff1 value: 42.02130435072668 - type: nauc_ndcg_at_5_max value: 15.627518090215247 - type: nauc_ndcg_at_5_std value: 12.533489817270919 - type: nauc_precision_at_1000_diff1 value: 3.679521880714478 - type: nauc_precision_at_1000_max value: 0.7919025640437954 - type: nauc_precision_at_1000_std value: 11.047727940811521 - type: nauc_precision_at_100_diff1 value: 19.4078130462856 - type: nauc_precision_at_100_max value: 4.3715506402771425 - type: nauc_precision_at_100_std value: 16.956899011609643 - type: nauc_precision_at_10_diff1 value: 28.437045098011527 - type: nauc_precision_at_10_max value: 11.734386703789056 - type: nauc_precision_at_10_std value: 15.714063626213687 - type: nauc_precision_at_1_diff1 value: 51.89198697276755 - type: nauc_precision_at_1_max value: 19.325504911863643 - type: nauc_precision_at_1_std value: 12.282190963023766 - type: nauc_precision_at_20_diff1 value: 26.61622384998239 - type: nauc_precision_at_20_max value: 9.031660188586937 - type: nauc_precision_at_20_std value: 16.20337620782593 - type: nauc_precision_at_3_diff1 value: 38.065037328678045 - type: nauc_precision_at_3_max value: 15.242914979757064 - type: nauc_precision_at_3_std value: 13.448074137354654 - type: nauc_precision_at_5_diff1 value: 34.74896073477683 - type: nauc_precision_at_5_max value: 13.347547367557508 - type: nauc_precision_at_5_std value: 15.211527933339694 - type: nauc_recall_at_1000_diff1 value: 22.478800979463685 - type: nauc_recall_at_1000_max value: 11.13145140021939 - type: nauc_recall_at_1000_std value: 20.050008624461874 - type: nauc_recall_at_100_diff1 value: 25.988786568304555 - type: nauc_recall_at_100_max value: 8.089785168176974 - type: nauc_recall_at_100_std value: 14.262619130209112 - type: nauc_recall_at_10_diff1 value: 30.866722162291687 - type: nauc_recall_at_10_max value: 12.14019760016012 - type: nauc_recall_at_10_std value: 12.8097154636935 - type: nauc_recall_at_1_diff1 value: 50.609096395200005 - type: nauc_recall_at_1_max value: 18.78413464500168 - type: nauc_recall_at_1_std value: 10.90744028944332 - type: nauc_recall_at_20_diff1 value: 28.832935090203225 - type: nauc_recall_at_20_max value: 10.309594281852648 - type: nauc_recall_at_20_std value: 12.251157275647977 - type: nauc_recall_at_3_diff1 value: 40.105712098235315 - type: nauc_recall_at_3_max value: 15.165723469178264 - type: nauc_recall_at_3_std value: 10.99744165240917 - type: nauc_recall_at_5_diff1 value: 36.09241435581379 - type: nauc_recall_at_5_max value: 13.032542349570054 - type: nauc_recall_at_5_std value: 12.802627519053681 - type: ndcg_at_1 value: 14.268 - type: ndcg_at_10 value: 17.534 - type: ndcg_at_100 value: 20.78 - type: ndcg_at_1000 value: 23.526 - type: ndcg_at_20 value: 18.567 - type: ndcg_at_3 value: 15.218000000000002 - type: ndcg_at_5 value: 16.164 - type: precision_at_1 
value: 14.268 - type: precision_at_10 value: 3.312 - type: precision_at_100 value: 0.603 - type: precision_at_1000 value: 0.105 - type: precision_at_20 value: 1.9869999999999999 - type: precision_at_3 value: 7.219 - type: precision_at_5 value: 5.1209999999999996 - type: recall_at_1 value: 11.354000000000001 - type: recall_at_10 value: 22.511 - type: recall_at_100 value: 37.24 - type: recall_at_1000 value: 56.718 - type: recall_at_20 value: 26.362999999999996 - type: recall_at_3 value: 15.53 - type: recall_at_5 value: 18.322 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackGamingRetrieval (default) revision: 4885aa143210c98657558c04aaf3dc47cfb54340 split: test type: mteb/cqadupstack-gaming metrics: - type: main_score value: 29.03 - type: map_at_1 value: 19.307 - type: map_at_10 value: 25.453 - type: map_at_100 value: 26.33 - type: map_at_1000 value: 26.419999999999998 - type: map_at_20 value: 25.896 - type: map_at_3 value: 23.572000000000003 - type: map_at_5 value: 24.694 - type: mrr_at_1 value: 22.00626959247649 - type: mrr_at_10 value: 27.87858884410605 - type: mrr_at_100 value: 28.652814969242712 - type: mrr_at_1000 value: 28.725946491824235 - type: mrr_at_20 value: 28.276271334002978 - type: mrr_at_3 value: 25.997910135841156 - type: mrr_at_5 value: 27.11703239289442 - type: nauc_map_at_1000_diff1 value: 43.50604073464055 - type: nauc_map_at_1000_max value: 30.480004310005544 - type: nauc_map_at_1000_std value: 0.18281635239684302 - type: nauc_map_at_100_diff1 value: 43.51057034900177 - type: nauc_map_at_100_max value: 30.463453039114537 - type: nauc_map_at_100_std value: 0.1392213813651391 - type: nauc_map_at_10_diff1 value: 43.680704548271024 - type: nauc_map_at_10_max value: 30.639431323648626 - type: nauc_map_at_10_std value: -0.17722097946115797 - type: nauc_map_at_1_diff1 value: 49.51121570705665 - type: nauc_map_at_1_max value: 31.820851746100594 - type: nauc_map_at_1_std value: -2.635315036488275 - type: nauc_map_at_20_diff1 value: 43.519636427140746 - type: nauc_map_at_20_max value: 30.479309603785193 - type: nauc_map_at_20_std value: -0.04034004401117608 - type: nauc_map_at_3_diff1 value: 44.660054248758726 - type: nauc_map_at_3_max value: 30.35371167828995 - type: nauc_map_at_3_std value: -1.4381463631334364 - type: nauc_map_at_5_diff1 value: 44.14458335553869 - type: nauc_map_at_5_max value: 30.49464687257249 - type: nauc_map_at_5_std value: -0.7069576298198817 - type: nauc_mrr_at_1000_diff1 value: 43.49091070845857 - type: nauc_mrr_at_1000_max value: 30.904217260073207 - type: nauc_mrr_at_1000_std value: 0.6030969099528762 - type: nauc_mrr_at_100_diff1 value: 43.48206732167152 - type: nauc_mrr_at_100_max value: 30.885805566023013 - type: nauc_mrr_at_100_std value: 0.5769328589498474 - type: nauc_mrr_at_10_diff1 value: 43.55457392824764 - type: nauc_mrr_at_10_max value: 31.139789286663294 - type: nauc_mrr_at_10_std value: 0.39137312166360116 - type: nauc_mrr_at_1_diff1 value: 49.7476817055079 - type: nauc_mrr_at_1_max value: 33.35487810786589 - type: nauc_mrr_at_1_std value: -2.335419312527886 - type: nauc_mrr_at_20_diff1 value: 43.48827825669483 - type: nauc_mrr_at_20_max value: 30.983317516254566 - type: nauc_mrr_at_20_std value: 0.4846694988872726 - type: nauc_mrr_at_3_diff1 value: 44.66661877146986 - type: nauc_mrr_at_3_max value: 31.31121111690094 - type: nauc_mrr_at_3_std value: -0.5970753554262374 - type: nauc_mrr_at_5_diff1 value: 44.05287141220467 - type: nauc_mrr_at_5_max value: 31.185044083863524 - type: nauc_mrr_at_5_std value: 
0.03276041839131263 - type: nauc_ndcg_at_1000_diff1 value: 40.64648189672279 - type: nauc_ndcg_at_1000_max value: 29.851206560241867 - type: nauc_ndcg_at_1000_std value: 3.7885804314712423 - type: nauc_ndcg_at_100_diff1 value: 40.54660606744312 - type: nauc_ndcg_at_100_max value: 29.52262097274987 - type: nauc_ndcg_at_100_std value: 3.1313695052884087 - type: nauc_ndcg_at_10_diff1 value: 41.189151331147364 - type: nauc_ndcg_at_10_max value: 30.257730735981376 - type: nauc_ndcg_at_10_std value: 1.483283884208919 - type: nauc_ndcg_at_1_diff1 value: 49.7476817055079 - type: nauc_ndcg_at_1_max value: 33.35487810786589 - type: nauc_ndcg_at_1_std value: -2.335419312527886 - type: nauc_ndcg_at_20_diff1 value: 40.69940555374264 - type: nauc_ndcg_at_20_max value: 29.67596434757782 - type: nauc_ndcg_at_20_std value: 1.8670302698321029 - type: nauc_ndcg_at_3_diff1 value: 43.313981749068034 - type: nauc_ndcg_at_3_max value: 29.92612987963682 - type: nauc_ndcg_at_3_std value: -0.7629159307364975 - type: nauc_ndcg_at_5_diff1 value: 42.25367609444526 - type: nauc_ndcg_at_5_max value: 30.011822025139217 - type: nauc_ndcg_at_5_std value: 0.4228958959339596 - type: nauc_precision_at_1000_diff1 value: 6.294045364733051 - type: nauc_precision_at_1000_max value: 13.003287301353916 - type: nauc_precision_at_1000_std value: 19.672009407091075 - type: nauc_precision_at_100_diff1 value: 18.900847000430282 - type: nauc_precision_at_100_max value: 19.89805341000471 - type: nauc_precision_at_100_std value: 14.097381220216437 - type: nauc_precision_at_10_diff1 value: 32.019287482758315 - type: nauc_precision_at_10_max value: 28.868719930088588 - type: nauc_precision_at_10_std value: 7.067713684120723 - type: nauc_precision_at_1_diff1 value: 49.7476817055079 - type: nauc_precision_at_1_max value: 33.35487810786589 - type: nauc_precision_at_1_std value: -2.335419312527886 - type: nauc_precision_at_20_diff1 value: 27.442952211039866 - type: nauc_precision_at_20_max value: 25.51570310142488 - type: nauc_precision_at_20_std value: 8.001107746535538 - type: nauc_precision_at_3_diff1 value: 38.33881569586195 - type: nauc_precision_at_3_max value: 28.995385801766826 - type: nauc_precision_at_3_std value: 0.46426597601937036 - type: nauc_precision_at_5_diff1 value: 35.93052673151141 - type: nauc_precision_at_5_max value: 28.77086703745561 - type: nauc_precision_at_5_std value: 3.020792681159482 - type: nauc_recall_at_1000_diff1 value: 27.413733064523722 - type: nauc_recall_at_1000_max value: 25.640071347285847 - type: nauc_recall_at_1000_std value: 23.024726525628747 - type: nauc_recall_at_100_diff1 value: 30.238748775488382 - type: nauc_recall_at_100_max value: 24.83445535706549 - type: nauc_recall_at_100_std value: 13.213229148027994 - type: nauc_recall_at_10_diff1 value: 33.660824128432765 - type: nauc_recall_at_10_max value: 28.239711759937826 - type: nauc_recall_at_10_std value: 5.259078451819804 - type: nauc_recall_at_1_diff1 value: 49.51121570705665 - type: nauc_recall_at_1_max value: 31.820851746100594 - type: nauc_recall_at_1_std value: -2.635315036488275 - type: nauc_recall_at_20_diff1 value: 31.77661434800746 - type: nauc_recall_at_20_max value: 25.949306594350592 - type: nauc_recall_at_20_std value: 6.611875576453824 - type: nauc_recall_at_3_diff1 value: 39.16095910728281 - type: nauc_recall_at_3_max value: 27.64955581506583 - type: nauc_recall_at_3_std value: 0.10121363216139175 - type: nauc_recall_at_5_diff1 value: 36.32968291714543 - type: nauc_recall_at_5_max value: 27.325678767283694 - type: 
nauc_recall_at_5_std value: 2.653663972529844 - type: ndcg_at_1 value: 22.006 - type: ndcg_at_10 value: 29.03 - type: ndcg_at_100 value: 33.318999999999996 - type: ndcg_at_1000 value: 35.89 - type: ndcg_at_20 value: 30.503999999999998 - type: ndcg_at_3 value: 25.348 - type: ndcg_at_5 value: 27.267000000000003 - type: precision_at_1 value: 22.006 - type: precision_at_10 value: 4.627 - type: precision_at_100 value: 0.744 - type: precision_at_1000 value: 0.10300000000000001 - type: precision_at_20 value: 2.702 - type: precision_at_3 value: 11.033999999999999 - type: precision_at_5 value: 7.861999999999999 - type: recall_at_1 value: 19.307 - type: recall_at_10 value: 37.624 - type: recall_at_100 value: 56.997 - type: recall_at_1000 value: 76.62299999999999 - type: recall_at_20 value: 43.086 - type: recall_at_3 value: 27.724 - type: recall_at_5 value: 32.421 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackGisRetrieval (default) revision: 5003b3064772da1887988e05400cf3806fe491f2 split: test type: mteb/cqadupstack-gis metrics: - type: main_score value: 14.097000000000001 - type: map_at_1 value: 9.109 - type: map_at_10 value: 12.062000000000001 - type: map_at_100 value: 12.603 - type: map_at_1000 value: 12.690000000000001 - type: map_at_20 value: 12.335 - type: map_at_3 value: 10.882 - type: map_at_5 value: 11.445 - type: mrr_at_1 value: 9.6045197740113 - type: mrr_at_10 value: 13.001390009864586 - type: mrr_at_100 value: 13.541388076434767 - type: mrr_at_1000 value: 13.622995527273426 - type: mrr_at_20 value: 13.261213704134942 - type: mrr_at_3 value: 11.75141242937853 - type: mrr_at_5 value: 12.3728813559322 - type: nauc_map_at_1000_diff1 value: 41.25399941751793 - type: nauc_map_at_1000_max value: 17.60637208770784 - type: nauc_map_at_1000_std value: 3.8997877056955876 - type: nauc_map_at_100_diff1 value: 41.3047772590663 - type: nauc_map_at_100_max value: 17.593792209003684 - type: nauc_map_at_100_std value: 3.8624300256381883 - type: nauc_map_at_10_diff1 value: 41.918994248720736 - type: nauc_map_at_10_max value: 17.523107069845093 - type: nauc_map_at_10_std value: 3.3289332906481333 - type: nauc_map_at_1_diff1 value: 50.853111369434835 - type: nauc_map_at_1_max value: 20.441039981572366 - type: nauc_map_at_1_std value: 2.9730312951046747 - type: nauc_map_at_20_diff1 value: 41.676967823092156 - type: nauc_map_at_20_max value: 17.611142954564 - type: nauc_map_at_20_std value: 3.7507161629892516 - type: nauc_map_at_3_diff1 value: 45.15865999101332 - type: nauc_map_at_3_max value: 17.51828209554345 - type: nauc_map_at_3_std value: 3.125254352308741 - type: nauc_map_at_5_diff1 value: 43.518873099840164 - type: nauc_map_at_5_max value: 18.096843812930256 - type: nauc_map_at_5_std value: 3.501264664850646 - type: nauc_mrr_at_1000_diff1 value: 39.65049616843269 - type: nauc_mrr_at_1000_max value: 18.992312109540187 - type: nauc_mrr_at_1000_std value: 3.8630526743174602 - type: nauc_mrr_at_100_diff1 value: 39.67790321701619 - type: nauc_mrr_at_100_max value: 18.99280796073833 - type: nauc_mrr_at_100_std value: 3.831281556686595 - type: nauc_mrr_at_10_diff1 value: 40.40664164207995 - type: nauc_mrr_at_10_max value: 18.9789911833429 - type: nauc_mrr_at_10_std value: 3.389250639709206 - type: nauc_mrr_at_1_diff1 value: 48.90268334274423 - type: nauc_mrr_at_1_max value: 22.148416208142038 - type: nauc_mrr_at_1_std value: 3.482278486678414 - type: nauc_mrr_at_20_diff1 value: 40.12944011033672 - type: nauc_mrr_at_20_max value: 19.01229852858854 - type: nauc_mrr_at_20_std value: 
3.721020072685762 - type: nauc_mrr_at_3_diff1 value: 43.53442474531623 - type: nauc_mrr_at_3_max value: 18.98665230786941 - type: nauc_mrr_at_3_std value: 3.141188860380207 - type: nauc_mrr_at_5_diff1 value: 41.792381222269306 - type: nauc_mrr_at_5_max value: 19.564109785495027 - type: nauc_mrr_at_5_std value: 3.447599289829289 - type: nauc_ndcg_at_1000_diff1 value: 33.75036088168543 - type: nauc_ndcg_at_1000_max value: 17.552395174719724 - type: nauc_ndcg_at_1000_std value: 6.019653809238646 - type: nauc_ndcg_at_100_diff1 value: 34.46011549407109 - type: nauc_ndcg_at_100_max value: 17.261093331357706 - type: nauc_ndcg_at_100_std value: 5.4268706575162104 - type: nauc_ndcg_at_10_diff1 value: 37.83747527779143 - type: nauc_ndcg_at_10_max value: 17.044974102007092 - type: nauc_ndcg_at_10_std value: 3.5111959818349603 - type: nauc_ndcg_at_1_diff1 value: 48.90268334274423 - type: nauc_ndcg_at_1_max value: 22.148416208142038 - type: nauc_ndcg_at_1_std value: 3.482278486678414 - type: nauc_ndcg_at_20_diff1 value: 37.138695182061525 - type: nauc_ndcg_at_20_max value: 17.22387592023126 - type: nauc_ndcg_at_20_std value: 4.770921048488158 - type: nauc_ndcg_at_3_diff1 value: 43.268967346255074 - type: nauc_ndcg_at_3_max value: 17.20602008989898 - type: nauc_ndcg_at_3_std value: 3.19589477459749 - type: nauc_ndcg_at_5_diff1 value: 40.7884752761726 - type: nauc_ndcg_at_5_max value: 18.121892702668045 - type: nauc_ndcg_at_5_std value: 3.8369089974368573 - type: nauc_precision_at_1000_diff1 value: 7.089909563758634 - type: nauc_precision_at_1000_max value: 19.071511820051107 - type: nauc_precision_at_1000_std value: 8.71710715708378 - type: nauc_precision_at_100_diff1 value: 17.577598014207858 - type: nauc_precision_at_100_max value: 18.757305391811315 - type: nauc_precision_at_100_std value: 8.571496733416154 - type: nauc_precision_at_10_diff1 value: 28.943153297767832 - type: nauc_precision_at_10_max value: 16.38624587520458 - type: nauc_precision_at_10_std value: 3.437574061625469 - type: nauc_precision_at_1_diff1 value: 48.90268334274423 - type: nauc_precision_at_1_max value: 22.148416208142038 - type: nauc_precision_at_1_std value: 3.482278486678414 - type: nauc_precision_at_20_diff1 value: 26.474908278743044 - type: nauc_precision_at_20_max value: 16.47527151110289 - type: nauc_precision_at_20_std value: 7.5305698853598 - type: nauc_precision_at_3_diff1 value: 39.54288018891221 - type: nauc_precision_at_3_max value: 17.284449255178835 - type: nauc_precision_at_3_std value: 2.8714843759024866 - type: nauc_precision_at_5_diff1 value: 34.480901699228006 - type: nauc_precision_at_5_max value: 19.44159427138771 - type: nauc_precision_at_5_std value: 3.9140233563987525 - type: nauc_recall_at_1000_diff1 value: 14.656193188687894 - type: nauc_recall_at_1000_max value: 15.810571367218888 - type: nauc_recall_at_1000_std value: 12.334573972835202 - type: nauc_recall_at_100_diff1 value: 18.594617672285707 - type: nauc_recall_at_100_max value: 15.15863525459292 - type: nauc_recall_at_100_std value: 9.115505114921058 - type: nauc_recall_at_10_diff1 value: 29.13269929764077 - type: nauc_recall_at_10_max value: 15.059218016523301 - type: nauc_recall_at_10_std value: 3.7696923586295137 - type: nauc_recall_at_1_diff1 value: 50.853111369434835 - type: nauc_recall_at_1_max value: 20.441039981572366 - type: nauc_recall_at_1_std value: 2.9730312951046747 - type: nauc_recall_at_20_diff1 value: 27.544653538434776 - type: nauc_recall_at_20_max value: 15.420518066694445 - type: nauc_recall_at_20_std value: 
7.101778539671523 - type: nauc_recall_at_3_diff1 value: 40.00397565193035 - type: nauc_recall_at_3_max value: 14.717415584208013 - type: nauc_recall_at_3_std value: 3.658957442260116 - type: nauc_recall_at_5_diff1 value: 35.35853159550963 - type: nauc_recall_at_5_max value: 17.049909921279315 - type: nauc_recall_at_5_std value: 4.839540342554651 - type: ndcg_at_1 value: 9.605 - type: ndcg_at_10 value: 14.097000000000001 - type: ndcg_at_100 value: 17.098 - type: ndcg_at_1000 value: 19.948 - type: ndcg_at_20 value: 15.043999999999999 - type: ndcg_at_3 value: 11.683 - type: ndcg_at_5 value: 12.656999999999998 - type: precision_at_1 value: 9.605 - type: precision_at_10 value: 2.215 - type: precision_at_100 value: 0.395 - type: precision_at_1000 value: 0.068 - type: precision_at_20 value: 1.322 - type: precision_at_3 value: 4.859 - type: precision_at_5 value: 3.435 - type: recall_at_1 value: 9.109 - type: recall_at_10 value: 19.618 - type: recall_at_100 value: 34.056 - type: recall_at_1000 value: 56.75599999999999 - type: recall_at_20 value: 23.168 - type: recall_at_3 value: 12.982 - type: recall_at_5 value: 15.315000000000001 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackMathematicaRetrieval (default) revision: 90fceea13679c63fe563ded68f3b6f06e50061de split: test type: mteb/cqadupstack-mathematica metrics: - type: main_score value: 8.895 - type: map_at_1 value: 4.444 - type: map_at_10 value: 6.789000000000001 - type: map_at_100 value: 7.362 - type: map_at_1000 value: 7.455 - type: map_at_20 value: 7.112 - type: map_at_3 value: 5.819 - type: map_at_5 value: 6.237 - type: mrr_at_1 value: 5.970149253731343 - type: mrr_at_10 value: 8.807500197425577 - type: mrr_at_100 value: 9.458867441952432 - type: mrr_at_1000 value: 9.550029897135536 - type: mrr_at_20 value: 9.191142267117858 - type: mrr_at_3 value: 7.669983416252076 - type: mrr_at_5 value: 8.229684908789391 - type: nauc_map_at_1000_diff1 value: 14.923575664521396 - type: nauc_map_at_1000_max value: 14.637382629018258 - type: nauc_map_at_1000_std value: 7.583317007693739 - type: nauc_map_at_100_diff1 value: 14.914938787317187 - type: nauc_map_at_100_max value: 14.57831256590049 - type: nauc_map_at_100_std value: 7.481458525605025 - type: nauc_map_at_10_diff1 value: 15.009158630868363 - type: nauc_map_at_10_max value: 14.587168521042992 - type: nauc_map_at_10_std value: 6.30675561821182 - type: nauc_map_at_1_diff1 value: 23.073067396533048 - type: nauc_map_at_1_max value: 22.526518534617583 - type: nauc_map_at_1_std value: 3.2886460233623356 - type: nauc_map_at_20_diff1 value: 14.55856812493529 - type: nauc_map_at_20_max value: 14.445922336763791 - type: nauc_map_at_20_std value: 7.0979435052536815 - type: nauc_map_at_3_diff1 value: 17.401011477759774 - type: nauc_map_at_3_max value: 16.448773676590882 - type: nauc_map_at_3_std value: 4.181405616554917 - type: nauc_map_at_5_diff1 value: 15.690380485853476 - type: nauc_map_at_5_max value: 15.435047584962474 - type: nauc_map_at_5_std value: 5.232971650136294 - type: nauc_mrr_at_1000_diff1 value: 15.064019296100401 - type: nauc_mrr_at_1000_max value: 15.23275181655676 - type: nauc_mrr_at_1000_std value: 6.62512228446261 - type: nauc_mrr_at_100_diff1 value: 15.04422899632206 - type: nauc_mrr_at_100_max value: 15.180132969802102 - type: nauc_mrr_at_100_std value: 6.569986365469756 - type: nauc_mrr_at_10_diff1 value: 15.513288408498664 - type: nauc_mrr_at_10_max value: 15.639652887265692 - type: nauc_mrr_at_10_std value: 6.08058172017529 - type: nauc_mrr_at_1_diff1 value: 
23.174960802057807 - type: nauc_mrr_at_1_max value: 23.10505027161953 - type: nauc_mrr_at_1_std value: 5.000535690775217 - type: nauc_mrr_at_20_diff1 value: 14.944086344466943 - type: nauc_mrr_at_20_max value: 15.058772912777219 - type: nauc_mrr_at_20_std value: 6.406714993528487 - type: nauc_mrr_at_3_diff1 value: 16.945928540219413 - type: nauc_mrr_at_3_max value: 16.999490982460667 - type: nauc_mrr_at_3_std value: 4.2783371592240185 - type: nauc_mrr_at_5_diff1 value: 15.724845028203049 - type: nauc_mrr_at_5_max value: 16.374268642724658 - type: nauc_mrr_at_5_std value: 4.955417882432664 - type: nauc_ndcg_at_1000_diff1 value: 12.64441384439761 - type: nauc_ndcg_at_1000_max value: 12.544144311249642 - type: nauc_ndcg_at_1000_std value: 12.203401112537147 - type: nauc_ndcg_at_100_diff1 value: 12.856101621820079 - type: nauc_ndcg_at_100_max value: 12.15851341921588 - type: nauc_ndcg_at_100_std value: 11.352600283831114 - type: nauc_ndcg_at_10_diff1 value: 12.453755697243285 - type: nauc_ndcg_at_10_max value: 11.750014509834587 - type: nauc_ndcg_at_10_std value: 8.203127809929466 - type: nauc_ndcg_at_1_diff1 value: 23.174960802057807 - type: nauc_ndcg_at_1_max value: 23.10505027161953 - type: nauc_ndcg_at_1_std value: 5.000535690775217 - type: nauc_ndcg_at_20_diff1 value: 11.324071030247564 - type: nauc_ndcg_at_20_max value: 11.094964112045453 - type: nauc_ndcg_at_20_std value: 9.840879835834757 - type: nauc_ndcg_at_3_diff1 value: 15.323525692434862 - type: nauc_ndcg_at_3_max value: 14.559998492898632 - type: nauc_ndcg_at_3_std value: 4.027895180138566 - type: nauc_ndcg_at_5_diff1 value: 13.165086940669635 - type: nauc_ndcg_at_5_max value: 13.32440977723948 - type: nauc_ndcg_at_5_std value: 5.813837007263122 - type: nauc_precision_at_1000_diff1 value: 0.8928955587806005 - type: nauc_precision_at_1000_max value: 4.446218508931589 - type: nauc_precision_at_1000_std value: 5.877977195844953 - type: nauc_precision_at_100_diff1 value: 8.33525852681901 - type: nauc_precision_at_100_max value: 7.830647914480539 - type: nauc_precision_at_100_std value: 14.216797498501176 - type: nauc_precision_at_10_diff1 value: 7.765203936267145 - type: nauc_precision_at_10_max value: 7.141939768201643 - type: nauc_precision_at_10_std value: 9.60008810493683 - type: nauc_precision_at_1_diff1 value: 23.174960802057807 - type: nauc_precision_at_1_max value: 23.10505027161953 - type: nauc_precision_at_1_std value: 5.000535690775217 - type: nauc_precision_at_20_diff1 value: 4.810680914106181 - type: nauc_precision_at_20_max value: 4.6628595108449655 - type: nauc_precision_at_20_std value: 12.601430694735827 - type: nauc_precision_at_3_diff1 value: 13.474943796383625 - type: nauc_precision_at_3_max value: 11.709775106648399 - type: nauc_precision_at_3_std value: 3.207743252795555 - type: nauc_precision_at_5_diff1 value: 9.95810736829039 - type: nauc_precision_at_5_max value: 10.456953224514239 - type: nauc_precision_at_5_std value: 5.623208634930042 - type: nauc_recall_at_1000_diff1 value: 9.834451295472817 - type: nauc_recall_at_1000_max value: 9.848949382055148 - type: nauc_recall_at_1000_std value: 20.975606313150834 - type: nauc_recall_at_100_diff1 value: 10.217335772749356 - type: nauc_recall_at_100_max value: 9.152943313782552 - type: nauc_recall_at_100_std value: 17.31335628449071 - type: nauc_recall_at_10_diff1 value: 7.002474541545711 - type: nauc_recall_at_10_max value: 5.600453872340962 - type: nauc_recall_at_10_std value: 11.697537334063615 - type: nauc_recall_at_1_diff1 value: 23.073067396533048 - type: 
nauc_recall_at_1_max value: 22.526518534617583 - type: nauc_recall_at_1_std value: 3.2886460233623356 - type: nauc_recall_at_20_diff1 value: 5.418370604760854 - type: nauc_recall_at_20_max value: 5.4952006102593085 - type: nauc_recall_at_20_std value: 14.413914588580981 - type: nauc_recall_at_3_diff1 value: 12.321251599365478 - type: nauc_recall_at_3_max value: 10.062822926598114 - type: nauc_recall_at_3_std value: 5.2675756103944735 - type: nauc_recall_at_5_diff1 value: 7.540388296514483 - type: nauc_recall_at_5_max value: 7.803110889019699 - type: nauc_recall_at_5_std value: 8.317325637513246 - type: ndcg_at_1 value: 5.970000000000001 - type: ndcg_at_10 value: 8.895 - type: ndcg_at_100 value: 11.964 - type: ndcg_at_1000 value: 14.860000000000001 - type: ndcg_at_20 value: 10.104000000000001 - type: ndcg_at_3 value: 6.859999999999999 - type: ndcg_at_5 value: 7.573 - type: precision_at_1 value: 5.970000000000001 - type: precision_at_10 value: 1.779 - type: precision_at_100 value: 0.384 - type: precision_at_1000 value: 0.073 - type: precision_at_20 value: 1.2189999999999999 - type: precision_at_3 value: 3.4000000000000004 - type: precision_at_5 value: 2.537 - type: recall_at_1 value: 4.444 - type: recall_at_10 value: 13.751 - type: recall_at_100 value: 27.537 - type: recall_at_1000 value: 49.079 - type: recall_at_20 value: 18.182000000000002 - type: recall_at_3 value: 7.731000000000001 - type: recall_at_5 value: 9.636 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackPhysicsRetrieval (default) revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4 split: test type: mteb/cqadupstack-physics metrics: - type: main_score value: 19.902 - type: map_at_1 value: 12.928999999999998 - type: map_at_10 value: 16.833000000000002 - type: map_at_100 value: 17.615 - type: map_at_1000 value: 17.732 - type: map_at_20 value: 17.207 - type: map_at_3 value: 15.463 - type: map_at_5 value: 16.128999999999998 - type: mrr_at_1 value: 15.976900866217516 - type: mrr_at_10 value: 20.444757627144526 - type: mrr_at_100 value: 21.18213748325402 - type: mrr_at_1000 value: 21.25972081056743 - type: mrr_at_20 value: 20.799603260475223 - type: mrr_at_3 value: 18.928456849534818 - type: mrr_at_5 value: 19.72248957330767 - type: nauc_map_at_1000_diff1 value: 41.27196577011274 - type: nauc_map_at_1000_max value: 30.04254002251132 - type: nauc_map_at_1000_std value: 6.570333369920046 - type: nauc_map_at_100_diff1 value: 41.27551384135304 - type: nauc_map_at_100_max value: 29.99043897557097 - type: nauc_map_at_100_std value: 6.472408363055328 - type: nauc_map_at_10_diff1 value: 41.85444301121017 - type: nauc_map_at_10_max value: 29.81212191843452 - type: nauc_map_at_10_std value: 5.93398567449617 - type: nauc_map_at_1_diff1 value: 46.839384517121886 - type: nauc_map_at_1_max value: 33.10314951759653 - type: nauc_map_at_1_std value: 3.473962823858065 - type: nauc_map_at_20_diff1 value: 41.4328465682072 - type: nauc_map_at_20_max value: 29.97742898678745 - type: nauc_map_at_20_std value: 6.104796006386177 - type: nauc_map_at_3_diff1 value: 43.02691416463743 - type: nauc_map_at_3_max value: 30.42366456898119 - type: nauc_map_at_3_std value: 5.155164523235761 - type: nauc_map_at_5_diff1 value: 42.50855309235288 - type: nauc_map_at_5_max value: 30.268005050849005 - type: nauc_map_at_5_std value: 5.5087675809592955 - type: nauc_mrr_at_1000_diff1 value: 39.918304151052496 - type: nauc_mrr_at_1000_max value: 32.3633242335842 - type: nauc_mrr_at_1000_std value: 9.821534513339788 - type: nauc_mrr_at_100_diff1 value: 
39.88894200397407 - type: nauc_mrr_at_100_max value: 32.35005140436353 - type: nauc_mrr_at_100_std value: 9.798405855994671 - type: nauc_mrr_at_10_diff1 value: 40.398911825307096 - type: nauc_mrr_at_10_max value: 32.431125056382164 - type: nauc_mrr_at_10_std value: 9.607804963814376 - type: nauc_mrr_at_1_diff1 value: 44.710224260402306 - type: nauc_mrr_at_1_max value: 34.810999361965784 - type: nauc_mrr_at_1_std value: 6.666781318158904 - type: nauc_mrr_at_20_diff1 value: 40.00961756059491 - type: nauc_mrr_at_20_max value: 32.37658164628154 - type: nauc_mrr_at_20_std value: 9.668733699272558 - type: nauc_mrr_at_3_diff1 value: 41.57115214419929 - type: nauc_mrr_at_3_max value: 32.68793918495075 - type: nauc_mrr_at_3_std value: 9.040233893300375 - type: nauc_mrr_at_5_diff1 value: 41.06814071330848 - type: nauc_mrr_at_5_max value: 32.8245640568574 - type: nauc_mrr_at_5_std value: 9.58857119627648 - type: nauc_ndcg_at_1000_diff1 value: 36.80739838454769 - type: nauc_ndcg_at_1000_max value: 29.789668331458618 - type: nauc_ndcg_at_1000_std value: 11.39764916900706 - type: nauc_ndcg_at_100_diff1 value: 37.11213770959871 - type: nauc_ndcg_at_100_max value: 29.081591038980903 - type: nauc_ndcg_at_100_std value: 10.108782506088897 - type: nauc_ndcg_at_10_diff1 value: 39.5849935712723 - type: nauc_ndcg_at_10_max value: 28.96898719826389 - type: nauc_ndcg_at_10_std value: 7.961681263212508 - type: nauc_ndcg_at_1_diff1 value: 44.710224260402306 - type: nauc_ndcg_at_1_max value: 34.810999361965784 - type: nauc_ndcg_at_1_std value: 6.666781318158904 - type: nauc_ndcg_at_20_diff1 value: 38.12032626231077 - type: nauc_ndcg_at_20_max value: 29.18302919363044 - type: nauc_ndcg_at_20_std value: 8.263802202822081 - type: nauc_ndcg_at_3_diff1 value: 41.69966283174317 - type: nauc_ndcg_at_3_max value: 30.929246645213066 - type: nauc_ndcg_at_3_std value: 7.216761468782046 - type: nauc_ndcg_at_5_diff1 value: 41.01584530945962 - type: nauc_ndcg_at_5_max value: 30.289879950898214 - type: nauc_ndcg_at_5_std value: 7.4367837578277936 - type: nauc_precision_at_1000_diff1 value: 5.296272992814253 - type: nauc_precision_at_1000_max value: 19.76310705995752 - type: nauc_precision_at_1000_std value: 24.704985621130156 - type: nauc_precision_at_100_diff1 value: 16.46333749868499 - type: nauc_precision_at_100_max value: 26.043739871376527 - type: nauc_precision_at_100_std value: 26.092651162394155 - type: nauc_precision_at_10_diff1 value: 30.365327315976653 - type: nauc_precision_at_10_max value: 28.924585920344946 - type: nauc_precision_at_10_std value: 17.70407674779879 - type: nauc_precision_at_1_diff1 value: 44.710224260402306 - type: nauc_precision_at_1_max value: 34.810999361965784 - type: nauc_precision_at_1_std value: 6.666781318158904 - type: nauc_precision_at_20_diff1 value: 24.315922316558428 - type: nauc_precision_at_20_max value: 28.874260987195967 - type: nauc_precision_at_20_std value: 19.72374746122734 - type: nauc_precision_at_3_diff1 value: 37.37798681409137 - type: nauc_precision_at_3_max value: 32.308460896865824 - type: nauc_precision_at_3_std value: 12.279945415003562 - type: nauc_precision_at_5_diff1 value: 35.30318091103882 - type: nauc_precision_at_5_max value: 31.820548127213062 - type: nauc_precision_at_5_std value: 14.503599559616163 - type: nauc_recall_at_1000_diff1 value: 19.795948815823216 - type: nauc_recall_at_1000_max value: 24.278386660959896 - type: nauc_recall_at_1000_std value: 22.837222421253944 - type: nauc_recall_at_100_diff1 value: 24.472612415292573 - type: nauc_recall_at_100_max 
value: 21.91143710710276 - type: nauc_recall_at_100_std value: 15.053133349737896 - type: nauc_recall_at_10_diff1 value: 33.4020176737161 - type: nauc_recall_at_10_max value: 23.033614175897377 - type: nauc_recall_at_10_std value: 8.767203112156356 - type: nauc_recall_at_1_diff1 value: 46.839384517121886 - type: nauc_recall_at_1_max value: 33.10314951759653 - type: nauc_recall_at_1_std value: 3.473962823858065 - type: nauc_recall_at_20_diff1 value: 28.830072771517113 - type: nauc_recall_at_20_max value: 23.489066180696092 - type: nauc_recall_at_20_std value: 9.12579757868168 - type: nauc_recall_at_3_diff1 value: 39.908834198934215 - type: nauc_recall_at_3_max value: 27.068809545101175 - type: nauc_recall_at_3_std value: 6.530892914334164 - type: nauc_recall_at_5_diff1 value: 37.48709101560424 - type: nauc_recall_at_5_max value: 26.081573648351025 - type: nauc_recall_at_5_std value: 7.183952029055236 - type: ndcg_at_1 value: 15.977 - type: ndcg_at_10 value: 19.902 - type: ndcg_at_100 value: 24.086 - type: ndcg_at_1000 value: 27.01 - type: ndcg_at_20 value: 21.175 - type: ndcg_at_3 value: 17.330000000000002 - type: ndcg_at_5 value: 18.342 - type: precision_at_1 value: 15.977 - type: precision_at_10 value: 3.542 - type: precision_at_100 value: 0.679 - type: precision_at_1000 value: 0.109 - type: precision_at_20 value: 2.161 - type: precision_at_3 value: 8.053 - type: precision_at_5 value: 5.679 - type: recall_at_1 value: 12.928999999999998 - type: recall_at_10 value: 25.916 - type: recall_at_100 value: 44.836 - type: recall_at_1000 value: 65.22200000000001 - type: recall_at_20 value: 30.493 - type: recall_at_3 value: 18.241 - type: recall_at_5 value: 21.078 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackProgrammersRetrieval (default) revision: 6184bc1440d2dbc7612be22b50686b8826d22b32 split: test type: mteb/cqadupstack-programmers metrics: - type: main_score value: 15.862000000000002 - type: map_at_1 value: 9.831 - type: map_at_10 value: 13.256 - type: map_at_100 value: 14.008000000000001 - type: map_at_1000 value: 14.113000000000001 - type: map_at_20 value: 13.636999999999999 - type: map_at_3 value: 11.814 - type: map_at_5 value: 12.583 - type: mrr_at_1 value: 11.757990867579908 - type: mrr_at_10 value: 15.494808654055237 - type: mrr_at_100 value: 16.291820589502283 - type: mrr_at_1000 value: 16.374533932974945 - type: mrr_at_20 value: 15.933671804388336 - type: mrr_at_3 value: 13.83181126331811 - type: mrr_at_5 value: 14.6765601217656 - type: nauc_map_at_1000_diff1 value: 33.93453741920144 - type: nauc_map_at_1000_max value: 15.653730492995432 - type: nauc_map_at_1000_std value: 7.8758696471921175 - type: nauc_map_at_100_diff1 value: 33.93938109119093 - type: nauc_map_at_100_max value: 15.600263725191917 - type: nauc_map_at_100_std value: 7.765619322590685 - type: nauc_map_at_10_diff1 value: 34.54464331832195 - type: nauc_map_at_10_max value: 15.612792960561228 - type: nauc_map_at_10_std value: 6.7557841221613915 - type: nauc_map_at_1_diff1 value: 40.25943612185486 - type: nauc_map_at_1_max value: 17.181254846998176 - type: nauc_map_at_1_std value: 4.311873998223975 - type: nauc_map_at_20_diff1 value: 34.286604224077294 - type: nauc_map_at_20_max value: 15.557596686810724 - type: nauc_map_at_20_std value: 7.278138397108883 - type: nauc_map_at_3_diff1 value: 36.73973255367738 - type: nauc_map_at_3_max value: 16.83994296407283 - type: nauc_map_at_3_std value: 6.223159115827186 - type: nauc_map_at_5_diff1 value: 35.141424690409735 - type: nauc_map_at_5_max value: 
15.992920926050328 - type: nauc_map_at_5_std value: 6.351250600055855 - type: nauc_mrr_at_1000_diff1 value: 34.73310032530598 - type: nauc_mrr_at_1000_max value: 19.015226556944313 - type: nauc_mrr_at_1000_std value: 9.222546150737514 - type: nauc_mrr_at_100_diff1 value: 34.726753216593245 - type: nauc_mrr_at_100_max value: 18.99769748963775 - type: nauc_mrr_at_100_std value: 9.174113672327863 - type: nauc_mrr_at_10_diff1 value: 35.44871459634613 - type: nauc_mrr_at_10_max value: 19.123376102993888 - type: nauc_mrr_at_10_std value: 8.400683156036651 - type: nauc_mrr_at_1_diff1 value: 41.66420742315266 - type: nauc_mrr_at_1_max value: 20.29699577568541 - type: nauc_mrr_at_1_std value: 6.552893551004773 - type: nauc_mrr_at_20_diff1 value: 34.97080168567599 - type: nauc_mrr_at_20_max value: 18.93820346421597 - type: nauc_mrr_at_20_std value: 8.88369463529979 - type: nauc_mrr_at_3_diff1 value: 37.82881961939195 - type: nauc_mrr_at_3_max value: 20.23353217486363 - type: nauc_mrr_at_3_std value: 8.335430576995872 - type: nauc_mrr_at_5_diff1 value: 36.39194951225287 - type: nauc_mrr_at_5_max value: 19.51895403281475 - type: nauc_mrr_at_5_std value: 8.109986680725223 - type: nauc_ndcg_at_1000_diff1 value: 29.082397825054134 - type: nauc_ndcg_at_1000_max value: 16.79542535678252 - type: nauc_ndcg_at_1000_std value: 13.862883511514385 - type: nauc_ndcg_at_100_diff1 value: 29.052598252998568 - type: nauc_ndcg_at_100_max value: 15.498427568714371 - type: nauc_ndcg_at_100_std value: 11.726792940214132 - type: nauc_ndcg_at_10_diff1 value: 32.1345507923688 - type: nauc_ndcg_at_10_max value: 15.522253057572243 - type: nauc_ndcg_at_10_std value: 8.033462171395978 - type: nauc_ndcg_at_1_diff1 value: 41.66420742315266 - type: nauc_ndcg_at_1_max value: 20.29699577568541 - type: nauc_ndcg_at_1_std value: 6.552893551004773 - type: nauc_ndcg_at_20_diff1 value: 30.9118537718024 - type: nauc_ndcg_at_20_max value: 15.015691320922405 - type: nauc_ndcg_at_20_std value: 9.48348066099931 - type: nauc_ndcg_at_3_diff1 value: 36.00136268031041 - type: nauc_ndcg_at_3_max value: 18.106666639494865 - type: nauc_ndcg_at_3_std value: 7.641902435989431 - type: nauc_ndcg_at_5_diff1 value: 33.39201547133596 - type: nauc_ndcg_at_5_max value: 16.476689691452638 - type: nauc_ndcg_at_5_std value: 7.369674781372547 - type: nauc_precision_at_1000_diff1 value: 6.471252357066656 - type: nauc_precision_at_1000_max value: 19.69714506243997 - type: nauc_precision_at_1000_std value: 19.55604767049242 - type: nauc_precision_at_100_diff1 value: 14.901264085785481 - type: nauc_precision_at_100_max value: 18.109459081509822 - type: nauc_precision_at_100_std value: 21.114563137000474 - type: nauc_precision_at_10_diff1 value: 27.5518231119986 - type: nauc_precision_at_10_max value: 15.967381663307059 - type: nauc_precision_at_10_std value: 11.45892974481074 - type: nauc_precision_at_1_diff1 value: 41.66420742315266 - type: nauc_precision_at_1_max value: 20.29699577568541 - type: nauc_precision_at_1_std value: 6.552893551004773 - type: nauc_precision_at_20_diff1 value: 24.871167172495863 - type: nauc_precision_at_20_max value: 16.035625528276007 - type: nauc_precision_at_20_std value: 16.40037479366967 - type: nauc_precision_at_3_diff1 value: 35.34609472177138 - type: nauc_precision_at_3_max value: 20.28057060245756 - type: nauc_precision_at_3_std value: 9.58695451354911 - type: nauc_precision_at_5_diff1 value: 31.12453786882641 - type: nauc_precision_at_5_max value: 17.714809323391766 - type: nauc_precision_at_5_std value: 9.540687572068887 - 
type: nauc_recall_at_1000_diff1 value: 13.176944792680187 - type: nauc_recall_at_1000_max value: 17.215938373520867 - type: nauc_recall_at_1000_std value: 31.763351387419913 - type: nauc_recall_at_100_diff1 value: 15.598307875167269 - type: nauc_recall_at_100_max value: 11.571312022801102 - type: nauc_recall_at_100_std value: 18.72066053860531 - type: nauc_recall_at_10_diff1 value: 25.20073017671981 - type: nauc_recall_at_10_max value: 12.05920538584769 - type: nauc_recall_at_10_std value: 9.127287803525167 - type: nauc_recall_at_1_diff1 value: 40.25943612185486 - type: nauc_recall_at_1_max value: 17.181254846998176 - type: nauc_recall_at_1_std value: 4.311873998223975 - type: nauc_recall_at_20_diff1 value: 21.87476573323018 - type: nauc_recall_at_20_max value: 10.324185189089619 - type: nauc_recall_at_20_std value: 12.342028690096459 - type: nauc_recall_at_3_diff1 value: 32.78814063821437 - type: nauc_recall_at_3_max value: 16.638784171801436 - type: nauc_recall_at_3_std value: 8.529115114779637 - type: nauc_recall_at_5_diff1 value: 28.192900822422317 - type: nauc_recall_at_5_max value: 13.974726351715857 - type: nauc_recall_at_5_std value: 8.09305084632621 - type: ndcg_at_1 value: 11.758000000000001 - type: ndcg_at_10 value: 15.862000000000002 - type: ndcg_at_100 value: 19.949 - type: ndcg_at_1000 value: 22.917 - type: ndcg_at_20 value: 17.249 - type: ndcg_at_3 value: 12.992 - type: ndcg_at_5 value: 14.266000000000002 - type: precision_at_1 value: 11.758000000000001 - type: precision_at_10 value: 2.82 - type: precision_at_100 value: 0.575 - type: precision_at_1000 value: 0.098 - type: precision_at_20 value: 1.7870000000000001 - type: precision_at_3 value: 5.822 - type: precision_at_5 value: 4.315 - type: recall_at_1 value: 9.831 - type: recall_at_10 value: 21.762999999999998 - type: recall_at_100 value: 40.207 - type: recall_at_1000 value: 61.635 - type: recall_at_20 value: 26.826 - type: recall_at_3 value: 13.969999999999999 - type: recall_at_5 value: 17.154 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackRetrieval (default) revision: CQADupstackRetrieval is a combined dataset split: test type: CQADupstackRetrieval metrics: - type: main_score value: 17.016083333333334 - type: ndcg_at_10 value: 17.016083333333334 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackStatsRetrieval (default) revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a split: test type: mteb/cqadupstack-stats metrics: - type: main_score value: 11.457 - type: map_at_1 value: 6.798 - type: map_at_10 value: 9.513 - type: map_at_100 value: 10.11 - type: map_at_1000 value: 10.181999999999999 - type: map_at_20 value: 9.852 - type: map_at_3 value: 8.459999999999999 - type: map_at_5 value: 9.095 - type: mrr_at_1 value: 8.43558282208589 - type: mrr_at_10 value: 11.242818190670953 - type: mrr_at_100 value: 11.841115877888047 - type: mrr_at_1000 value: 11.910635997616325 - type: mrr_at_20 value: 11.596258015622588 - type: mrr_at_3 value: 10.122699386503067 - type: mrr_at_5 value: 10.782208588957056 - type: nauc_map_at_1000_diff1 value: 33.754657655521825 - type: nauc_map_at_1000_max value: 20.457874599194977 - type: nauc_map_at_1000_std value: 4.356173597738065 - type: nauc_map_at_100_diff1 value: 33.75222679569881 - type: nauc_map_at_100_max value: 20.373956157972724 - type: nauc_map_at_100_std value: 4.252302912475765 - type: nauc_map_at_10_diff1 value: 34.77872705587748 - type: nauc_map_at_10_max value: 20.93118729929346 - type: nauc_map_at_10_std value: 3.481910641472398 - type: 
nauc_map_at_1_diff1 value: 42.058523271621276 - type: nauc_map_at_1_max value: 19.398661310678737 - type: nauc_map_at_1_std value: -1.9329828695069966 - type: nauc_map_at_20_diff1 value: 34.32132356844234 - type: nauc_map_at_20_max value: 20.836011847513134 - type: nauc_map_at_20_std value: 3.410902073845993 - type: nauc_map_at_3_diff1 value: 36.8129992491477 - type: nauc_map_at_3_max value: 21.49364083314497 - type: nauc_map_at_3_std value: 2.8543672506917117 - type: nauc_map_at_5_diff1 value: 35.945765614409595 - type: nauc_map_at_5_max value: 21.821959253251073 - type: nauc_map_at_5_std value: 3.1795889661755754 - type: nauc_mrr_at_1000_diff1 value: 33.022280754336535 - type: nauc_mrr_at_1000_max value: 20.31974398955361 - type: nauc_mrr_at_1000_std value: 6.915574901994777 - type: nauc_mrr_at_100_diff1 value: 32.98012701377776 - type: nauc_mrr_at_100_max value: 20.217936050257485 - type: nauc_mrr_at_100_std value: 6.853368541174533 - type: nauc_mrr_at_10_diff1 value: 34.0521482962105 - type: nauc_mrr_at_10_max value: 20.594837283745004 - type: nauc_mrr_at_10_std value: 6.58219400975866 - type: nauc_mrr_at_1_diff1 value: 40.45214208803864 - type: nauc_mrr_at_1_max value: 20.246074459121917 - type: nauc_mrr_at_1_std value: 3.6861996527886007 - type: nauc_mrr_at_20_diff1 value: 33.40956751827326 - type: nauc_mrr_at_20_max value: 20.570275995460932 - type: nauc_mrr_at_20_std value: 6.243011136595918 - type: nauc_mrr_at_3_diff1 value: 36.31911031414795 - type: nauc_mrr_at_3_max value: 21.695701449295836 - type: nauc_mrr_at_3_std value: 6.71267279773233 - type: nauc_mrr_at_5_diff1 value: 35.13580430980389 - type: nauc_mrr_at_5_max value: 21.723293067977693 - type: nauc_mrr_at_5_std value: 6.269186070012771 - type: nauc_ndcg_at_1000_diff1 value: 26.716650512928574 - type: nauc_ndcg_at_1000_max value: 18.323227051095493 - type: nauc_ndcg_at_1000_std value: 10.182374858813544 - type: nauc_ndcg_at_100_diff1 value: 27.023329777242445 - type: nauc_ndcg_at_100_max value: 17.4041094989256 - type: nauc_ndcg_at_100_std value: 8.607201276878204 - type: nauc_ndcg_at_10_diff1 value: 31.921453307307818 - type: nauc_ndcg_at_10_max value: 20.328563944294817 - type: nauc_ndcg_at_10_std value: 5.531328567900397 - type: nauc_ndcg_at_1_diff1 value: 40.45214208803864 - type: nauc_ndcg_at_1_max value: 20.246074459121917 - type: nauc_ndcg_at_1_std value: 3.6861996527886007 - type: nauc_ndcg_at_20_diff1 value: 30.279986443553863 - type: nauc_ndcg_at_20_max value: 20.274259234859194 - type: nauc_ndcg_at_20_std value: 5.0661641286538925 - type: nauc_ndcg_at_3_diff1 value: 35.40139952163887 - type: nauc_ndcg_at_3_max value: 21.8390120280498 - type: nauc_ndcg_at_3_std value: 5.417193004461638 - type: nauc_ndcg_at_5_diff1 value: 34.323991615044044 - type: nauc_ndcg_at_5_max value: 22.44454175298003 - type: nauc_ndcg_at_5_std value: 5.058913656381477 - type: nauc_precision_at_1000_diff1 value: 8.13341460956022 - type: nauc_precision_at_1000_max value: 13.380869610400731 - type: nauc_precision_at_1000_std value: 25.77566088719011 - type: nauc_precision_at_100_diff1 value: 12.028198307574947 - type: nauc_precision_at_100_max value: 9.99491259218647 - type: nauc_precision_at_100_std value: 20.26038939641748 - type: nauc_precision_at_10_diff1 value: 25.497863066445802 - type: nauc_precision_at_10_max value: 19.951934819022966 - type: nauc_precision_at_10_std value: 13.029428588116488 - type: nauc_precision_at_1_diff1 value: 40.45214208803864 - type: nauc_precision_at_1_max value: 20.246074459121917 - type: 
nauc_precision_at_1_std value: 3.6861996527886007 - type: nauc_precision_at_20_diff1 value: 21.270433967723527 - type: nauc_precision_at_20_max value: 20.20704051155486 - type: nauc_precision_at_20_std value: 10.606697205011349 - type: nauc_precision_at_3_diff1 value: 34.304974107764636 - type: nauc_precision_at_3_max value: 24.786027767206704 - type: nauc_precision_at_3_std value: 12.919584289443248 - type: nauc_precision_at_5_diff1 value: 31.235010233089454 - type: nauc_precision_at_5_max value: 25.888178221422027 - type: nauc_precision_at_5_std value: 12.04974180403603 - type: nauc_recall_at_1000_diff1 value: 10.70347303527697 - type: nauc_recall_at_1000_max value: 11.531776655259092 - type: nauc_recall_at_1000_std value: 20.09518174937834 - type: nauc_recall_at_100_diff1 value: 12.277161162587646 - type: nauc_recall_at_100_max value: 9.031651314357903 - type: nauc_recall_at_100_std value: 14.946530478779566 - type: nauc_recall_at_10_diff1 value: 25.751282561301597 - type: nauc_recall_at_10_max value: 18.410538940956624 - type: nauc_recall_at_10_std value: 7.052566618916148 - type: nauc_recall_at_1_diff1 value: 42.058523271621276 - type: nauc_recall_at_1_max value: 19.398661310678737 - type: nauc_recall_at_1_std value: -1.9329828695069966 - type: nauc_recall_at_20_diff1 value: 21.876105916783473 - type: nauc_recall_at_20_max value: 18.14029808306082 - type: nauc_recall_at_20_std value: 5.721370338729993 - type: nauc_recall_at_3_diff1 value: 32.349105117433645 - type: nauc_recall_at_3_max value: 22.475284730157217 - type: nauc_recall_at_3_std value: 6.577737452085277 - type: nauc_recall_at_5_diff1 value: 30.45726437530916 - type: nauc_recall_at_5_max value: 22.993204324458517 - type: nauc_recall_at_5_std value: 6.237822274407502 - type: ndcg_at_1 value: 8.436 - type: ndcg_at_10 value: 11.457 - type: ndcg_at_100 value: 14.618 - type: ndcg_at_1000 value: 16.803 - type: ndcg_at_20 value: 12.67 - type: ndcg_at_3 value: 9.396 - type: ndcg_at_5 value: 10.458 - type: precision_at_1 value: 8.436 - type: precision_at_10 value: 2.025 - type: precision_at_100 value: 0.391 - type: precision_at_1000 value: 0.063 - type: precision_at_20 value: 1.304 - type: precision_at_3 value: 4.192 - type: precision_at_5 value: 3.221 - type: recall_at_1 value: 6.798 - type: recall_at_10 value: 15.878999999999998 - type: recall_at_100 value: 30.768 - type: recall_at_1000 value: 47.451 - type: recall_at_20 value: 20.466 - type: recall_at_3 value: 10.224 - type: recall_at_5 value: 12.881 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackTexRetrieval (default) revision: 46989137a86843e03a6195de44b09deda022eec7 split: test type: mteb/cqadupstack-tex metrics: - type: main_score value: 9.754999999999999 - type: map_at_1 value: 5.489999999999999 - type: map_at_10 value: 7.9350000000000005 - type: map_at_100 value: 8.376999999999999 - type: map_at_1000 value: 8.458 - type: map_at_20 value: 8.14 - type: map_at_3 value: 7.166 - type: map_at_5 value: 7.5840000000000005 - type: mrr_at_1 value: 7.054370268410186 - type: mrr_at_10 value: 9.948655764209787 - type: mrr_at_100 value: 10.44089540191581 - type: mrr_at_1000 value: 10.510808098620316 - type: mrr_at_20 value: 10.18549289814409 - type: mrr_at_3 value: 9.027299839412715 - type: mrr_at_5 value: 9.52626749254416 - type: nauc_map_at_1000_diff1 value: 32.76388527748132 - type: nauc_map_at_1000_max value: 26.76472945437023 - type: nauc_map_at_1000_std value: 5.076773141116664 - type: nauc_map_at_100_diff1 value: 32.84910041131489 - type: nauc_map_at_100_max 
value: 26.776649275369763 - type: nauc_map_at_100_std value: 4.982288267487467 - type: nauc_map_at_10_diff1 value: 33.69288297350157 - type: nauc_map_at_10_max value: 27.030787162656093 - type: nauc_map_at_10_std value: 4.319996549665479 - type: nauc_map_at_1_diff1 value: 45.07110295953283 - type: nauc_map_at_1_max value: 31.183919870403624 - type: nauc_map_at_1_std value: 3.2596636083232524 - type: nauc_map_at_20_diff1 value: 33.18385578478434 - type: nauc_map_at_20_max value: 26.750880392311256 - type: nauc_map_at_20_std value: 4.560028824060983 - type: nauc_map_at_3_diff1 value: 36.134060387060806 - type: nauc_map_at_3_max value: 28.53718072767372 - type: nauc_map_at_3_std value: 3.8039060416364054 - type: nauc_map_at_5_diff1 value: 34.85287692775015 - type: nauc_map_at_5_max value: 27.89364342330856 - type: nauc_map_at_5_std value: 4.119474259507159 - type: nauc_mrr_at_1000_diff1 value: 32.015809492076826 - type: nauc_mrr_at_1000_max value: 27.431639711646994 - type: nauc_mrr_at_1000_std value: 5.95554166485951 - type: nauc_mrr_at_100_diff1 value: 32.07039747646208 - type: nauc_mrr_at_100_max value: 27.452847130237775 - type: nauc_mrr_at_100_std value: 5.905310921828455 - type: nauc_mrr_at_10_diff1 value: 32.93108532798797 - type: nauc_mrr_at_10_max value: 27.768472855609204 - type: nauc_mrr_at_10_std value: 5.580104763303006 - type: nauc_mrr_at_1_diff1 value: 43.888408590108355 - type: nauc_mrr_at_1_max value: 32.903967259484176 - type: nauc_mrr_at_1_std value: 3.514629542175588 - type: nauc_mrr_at_20_diff1 value: 32.408176921975254 - type: nauc_mrr_at_20_max value: 27.470576205679897 - type: nauc_mrr_at_20_std value: 5.716181575723001 - type: nauc_mrr_at_3_diff1 value: 35.354655207362356 - type: nauc_mrr_at_3_max value: 29.14309593167405 - type: nauc_mrr_at_3_std value: 4.63189493416609 - type: nauc_mrr_at_5_diff1 value: 33.970622089384825 - type: nauc_mrr_at_5_max value: 28.6239836688986 - type: nauc_mrr_at_5_std value: 5.122010745650993 - type: nauc_ndcg_at_1000_diff1 value: 25.030181517448163 - type: nauc_ndcg_at_1000_max value: 24.25419053775242 - type: nauc_ndcg_at_1000_std value: 9.178235317241148 - type: nauc_ndcg_at_100_diff1 value: 26.546832760443966 - type: nauc_ndcg_at_100_max value: 24.42201784253177 - type: nauc_ndcg_at_100_std value: 7.9899910907634375 - type: nauc_ndcg_at_10_diff1 value: 29.856179532797423 - type: nauc_ndcg_at_10_max value: 25.424197578846012 - type: nauc_ndcg_at_10_std value: 5.1638300059562035 - type: nauc_ndcg_at_1_diff1 value: 43.888408590108355 - type: nauc_ndcg_at_1_max value: 32.903967259484176 - type: nauc_ndcg_at_1_std value: 3.514629542175588 - type: nauc_ndcg_at_20_diff1 value: 28.387788168718874 - type: nauc_ndcg_at_20_max value: 24.54850515588615 - type: nauc_ndcg_at_20_std value: 5.896669986261477 - type: nauc_ndcg_at_3_diff1 value: 34.072630397644424 - type: nauc_ndcg_at_3_max value: 28.28910465749962 - type: nauc_ndcg_at_3_std value: 4.108392335721374 - type: nauc_ndcg_at_5_diff1 value: 32.01123351290829 - type: nauc_ndcg_at_5_max value: 27.245024254467303 - type: nauc_ndcg_at_5_std value: 4.721870277645733 - type: nauc_precision_at_1000_diff1 value: 10.47217681263907 - type: nauc_precision_at_1000_max value: 20.919793131324727 - type: nauc_precision_at_1000_std value: 14.804007062294563 - type: nauc_precision_at_100_diff1 value: 16.685502515637722 - type: nauc_precision_at_100_max value: 23.37373409901207 - type: nauc_precision_at_100_std value: 13.953311698132442 - type: nauc_precision_at_10_diff1 value: 22.478790016325785 - type: 
nauc_precision_at_10_max value: 23.607477242235102 - type: nauc_precision_at_10_std value: 7.794068171304157 - type: nauc_precision_at_1_diff1 value: 43.888408590108355 - type: nauc_precision_at_1_max value: 32.903967259484176 - type: nauc_precision_at_1_std value: 3.514629542175588 - type: nauc_precision_at_20_diff1 value: 19.959179713421722 - type: nauc_precision_at_20_max value: 21.738126842321893 - type: nauc_precision_at_20_std value: 9.007914166096132 - type: nauc_precision_at_3_diff1 value: 29.984253127282134 - type: nauc_precision_at_3_max value: 28.271022607772796 - type: nauc_precision_at_3_std value: 5.620451575052563 - type: nauc_precision_at_5_diff1 value: 26.198401324939464 - type: nauc_precision_at_5_max value: 26.593956126902786 - type: nauc_precision_at_5_std value: 6.684705108310583 - type: nauc_recall_at_1000_diff1 value: 9.812234445343657 - type: nauc_recall_at_1000_max value: 17.800710147129053 - type: nauc_recall_at_1000_std value: 15.826278320231745 - type: nauc_recall_at_100_diff1 value: 14.586175748060896 - type: nauc_recall_at_100_max value: 18.340956025066333 - type: nauc_recall_at_100_std value: 12.791161727474043 - type: nauc_recall_at_10_diff1 value: 21.286255365948538 - type: nauc_recall_at_10_max value: 20.04866550317387 - type: nauc_recall_at_10_std value: 5.645106302785361 - type: nauc_recall_at_1_diff1 value: 45.07110295953283 - type: nauc_recall_at_1_max value: 31.183919870403624 - type: nauc_recall_at_1_std value: 3.2596636083232524 - type: nauc_recall_at_20_diff1 value: 18.757519729175094 - type: nauc_recall_at_20_max value: 18.59809411356838 - type: nauc_recall_at_20_std value: 7.482712453171494 - type: nauc_recall_at_3_diff1 value: 29.350550830882405 - type: nauc_recall_at_3_max value: 26.26284543188125 - type: nauc_recall_at_3_std value: 4.284032658092434 - type: nauc_recall_at_5_diff1 value: 25.247444183841345 - type: nauc_recall_at_5_max value: 23.639030774195213 - type: nauc_recall_at_5_std value: 5.05748857090612 - type: ndcg_at_1 value: 7.054 - type: ndcg_at_10 value: 9.754999999999999 - type: ndcg_at_100 value: 12.252 - type: ndcg_at_1000 value: 14.658999999999999 - type: ndcg_at_20 value: 10.508000000000001 - type: ndcg_at_3 value: 8.265 - type: ndcg_at_5 value: 8.929 - type: precision_at_1 value: 7.054 - type: precision_at_10 value: 1.807 - type: precision_at_100 value: 0.368 - type: precision_at_1000 value: 0.06899999999999999 - type: precision_at_20 value: 1.1199999999999999 - type: precision_at_3 value: 3.9690000000000003 - type: precision_at_5 value: 2.863 - type: recall_at_1 value: 5.489999999999999 - type: recall_at_10 value: 13.422 - type: recall_at_100 value: 24.962999999999997 - type: recall_at_1000 value: 42.725 - type: recall_at_20 value: 16.259 - type: recall_at_3 value: 9.155000000000001 - type: recall_at_5 value: 10.923 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackUnixRetrieval (default) revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53 split: test type: mteb/cqadupstack-unix metrics: - type: main_score value: 16.884 - type: map_at_1 value: 11.259 - type: map_at_10 value: 14.371999999999998 - type: map_at_100 value: 14.921999999999999 - type: map_at_1000 value: 15.012 - type: map_at_20 value: 14.643 - type: map_at_3 value: 13.196 - type: map_at_5 value: 13.786000000000001 - type: mrr_at_1 value: 13.619402985074627 - type: mrr_at_10 value: 17.155739161336175 - type: mrr_at_100 value: 17.682382182436477 - type: mrr_at_1000 value: 17.762865075369113 - type: mrr_at_20 value: 17.394179616617638 - type: 
mrr_at_3 value: 15.951492537313436 - type: mrr_at_5 value: 16.497201492537318 - type: nauc_map_at_1000_diff1 value: 47.4265740975564 - type: nauc_map_at_1000_max value: 28.882262726128438 - type: nauc_map_at_1000_std value: 8.733456805684261 - type: nauc_map_at_100_diff1 value: 47.47182414534892 - type: nauc_map_at_100_max value: 28.85824710228484 - type: nauc_map_at_100_std value: 8.689373453465027 - type: nauc_map_at_10_diff1 value: 48.02651594284678 - type: nauc_map_at_10_max value: 29.238822235344035 - type: nauc_map_at_10_std value: 8.33007800978345 - type: nauc_map_at_1_diff1 value: 56.39452680423106 - type: nauc_map_at_1_max value: 32.60008414160042 - type: nauc_map_at_1_std value: 6.843961503288069 - type: nauc_map_at_20_diff1 value: 47.63901968476526 - type: nauc_map_at_20_max value: 29.025324617088327 - type: nauc_map_at_20_std value: 8.643210479120588 - type: nauc_map_at_3_diff1 value: 49.40628498975407 - type: nauc_map_at_3_max value: 30.22948877331367 - type: nauc_map_at_3_std value: 7.289154264399903 - type: nauc_map_at_5_diff1 value: 48.664130342694136 - type: nauc_map_at_5_max value: 30.14327671294244 - type: nauc_map_at_5_std value: 7.939333631753251 - type: nauc_mrr_at_1000_diff1 value: 44.58799837398294 - type: nauc_mrr_at_1000_max value: 31.03541915705859 - type: nauc_mrr_at_1000_std value: 10.403824515337941 - type: nauc_mrr_at_100_diff1 value: 44.601824537567715 - type: nauc_mrr_at_100_max value: 31.02756566133194 - type: nauc_mrr_at_100_std value: 10.374041246429492 - type: nauc_mrr_at_10_diff1 value: 45.08809081749144 - type: nauc_mrr_at_10_max value: 31.57615351364963 - type: nauc_mrr_at_10_std value: 10.29441865771061 - type: nauc_mrr_at_1_diff1 value: 53.78193049233505 - type: nauc_mrr_at_1_max value: 35.795787308983364 - type: nauc_mrr_at_1_std value: 9.700924818901061 - type: nauc_mrr_at_20_diff1 value: 44.74335182043816 - type: nauc_mrr_at_20_max value: 31.18129900426782 - type: nauc_mrr_at_20_std value: 10.385325054118825 - type: nauc_mrr_at_3_diff1 value: 46.73779708259278 - type: nauc_mrr_at_3_max value: 32.65075209697959 - type: nauc_mrr_at_3_std value: 9.728066031213869 - type: nauc_mrr_at_5_diff1 value: 45.92982408736637 - type: nauc_mrr_at_5_max value: 32.467526279204826 - type: nauc_mrr_at_5_std value: 9.989919602029717 - type: nauc_ndcg_at_1000_diff1 value: 40.92066479403982 - type: nauc_ndcg_at_1000_max value: 26.324838581358712 - type: nauc_ndcg_at_1000_std value: 11.523782722688093 - type: nauc_ndcg_at_100_diff1 value: 41.69901831802912 - type: nauc_ndcg_at_100_max value: 26.05948550508969 - type: nauc_ndcg_at_100_std value: 10.741879131890466 - type: nauc_ndcg_at_10_diff1 value: 43.984470289795006 - type: nauc_ndcg_at_10_max value: 27.712165270383217 - type: nauc_ndcg_at_10_std value: 9.664252780617716 - type: nauc_ndcg_at_1_diff1 value: 53.78193049233505 - type: nauc_ndcg_at_1_max value: 35.795787308983364 - type: nauc_ndcg_at_1_std value: 9.700924818901061 - type: nauc_ndcg_at_20_diff1 value: 42.87969088645589 - type: nauc_ndcg_at_20_max value: 26.93508319676996 - type: nauc_ndcg_at_20_std value: 10.383528785973736 - type: nauc_ndcg_at_3_diff1 value: 46.50711903290246 - type: nauc_ndcg_at_3_max value: 30.119861670148136 - type: nauc_ndcg_at_3_std value: 8.209698597192652 - type: nauc_ndcg_at_5_diff1 value: 45.5276661506903 - type: nauc_ndcg_at_5_max value: 29.727216155363013 - type: nauc_ndcg_at_5_std value: 8.969137019208551 - type: nauc_precision_at_1000_diff1 value: 13.186344514919291 - type: nauc_precision_at_1000_max value: 
14.081180493706894 - type: nauc_precision_at_1000_std value: 13.331957277782028 - type: nauc_precision_at_100_diff1 value: 25.836947568988094 - type: nauc_precision_at_100_max value: 19.399450264723857 - type: nauc_precision_at_100_std value: 15.996979763079173 - type: nauc_precision_at_10_diff1 value: 31.611911937904136 - type: nauc_precision_at_10_max value: 23.67106809118961 - type: nauc_precision_at_10_std value: 12.494002491494403 - type: nauc_precision_at_1_diff1 value: 53.78193049233505 - type: nauc_precision_at_1_max value: 35.795787308983364 - type: nauc_precision_at_1_std value: 9.700924818901061 - type: nauc_precision_at_20_diff1 value: 28.52666886145722 - type: nauc_precision_at_20_max value: 21.954240311035203 - type: nauc_precision_at_20_std value: 14.844645388086807 - type: nauc_precision_at_3_diff1 value: 38.45498467923997 - type: nauc_precision_at_3_max value: 29.266449529306882 - type: nauc_precision_at_3_std value: 9.049210381929473 - type: nauc_precision_at_5_diff1 value: 36.09730656980118 - type: nauc_precision_at_5_max value: 28.837127135797243 - type: nauc_precision_at_5_std value: 11.158339114522931 - type: nauc_recall_at_1000_diff1 value: 21.260887713456125 - type: nauc_recall_at_1000_max value: 16.113129212962036 - type: nauc_recall_at_1000_std value: 18.480136835190926 - type: nauc_recall_at_100_diff1 value: 27.104482564680143 - type: nauc_recall_at_100_max value: 15.992106261015381 - type: nauc_recall_at_100_std value: 13.84189240491372 - type: nauc_recall_at_10_diff1 value: 35.07971219401454 - type: nauc_recall_at_10_max value: 21.285398091407597 - type: nauc_recall_at_10_std value: 11.2371939944325 - type: nauc_recall_at_1_diff1 value: 56.39452680423106 - type: nauc_recall_at_1_max value: 32.60008414160042 - type: nauc_recall_at_1_std value: 6.843961503288069 - type: nauc_recall_at_20_diff1 value: 32.39512106898805 - type: nauc_recall_at_20_max value: 19.218626368924355 - type: nauc_recall_at_20_std value: 12.883976865810729 - type: nauc_recall_at_3_diff1 value: 42.44181844531972 - type: nauc_recall_at_3_max value: 26.878784537566723 - type: nauc_recall_at_3_std value: 8.021682738108238 - type: nauc_recall_at_5_diff1 value: 39.71281577688504 - type: nauc_recall_at_5_max value: 26.741868241320095 - type: nauc_recall_at_5_std value: 9.776821004059626 - type: ndcg_at_1 value: 13.619 - type: ndcg_at_10 value: 16.884 - type: ndcg_at_100 value: 19.919999999999998 - type: ndcg_at_1000 value: 22.61 - type: ndcg_at_20 value: 17.802 - type: ndcg_at_3 value: 14.601 - type: ndcg_at_5 value: 15.47 - type: precision_at_1 value: 13.619 - type: precision_at_10 value: 2.8080000000000003 - type: precision_at_100 value: 0.485 - type: precision_at_1000 value: 0.08099999999999999 - type: precision_at_20 value: 1.66 - type: precision_at_3 value: 6.468 - type: precision_at_5 value: 4.496 - type: recall_at_1 value: 11.259 - type: recall_at_10 value: 22.148 - type: recall_at_100 value: 36.338 - type: recall_at_1000 value: 56.37 - type: recall_at_20 value: 25.444 - type: recall_at_3 value: 15.601 - type: recall_at_5 value: 17.904999999999998 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackWebmastersRetrieval (default) revision: 160c094312a0e1facb97e55eeddb698c0abe3571 split: test type: mteb/cqadupstack-webmasters metrics: - type: main_score value: 18.986 - type: map_at_1 value: 11.219 - type: map_at_10 value: 15.572 - type: map_at_100 value: 16.496 - type: map_at_1000 value: 16.666 - type: map_at_20 value: 16.073999999999998 - type: map_at_3 value: 14.173 - type: 
map_at_5 value: 14.915000000000001 - type: mrr_at_1 value: 14.82213438735178 - type: mrr_at_10 value: 19.52365267582659 - type: mrr_at_100 value: 20.370290185635753 - type: mrr_at_1000 value: 20.467043542503724 - type: mrr_at_20 value: 20.0766545965337 - type: mrr_at_3 value: 18.21475625823452 - type: mrr_at_5 value: 18.945981554677203 - type: nauc_map_at_1000_diff1 value: 42.231943470301474 - type: nauc_map_at_1000_max value: 26.47159454229298 - type: nauc_map_at_1000_std value: 8.142899408562116 - type: nauc_map_at_100_diff1 value: 42.20734027834296 - type: nauc_map_at_100_max value: 26.482392045352114 - type: nauc_map_at_100_std value: 7.869302970334234 - type: nauc_map_at_10_diff1 value: 43.04836148095647 - type: nauc_map_at_10_max value: 26.854456008820886 - type: nauc_map_at_10_std value: 7.199117428761973 - type: nauc_map_at_1_diff1 value: 52.69584045825562 - type: nauc_map_at_1_max value: 32.26169513753074 - type: nauc_map_at_1_std value: 6.952498233745584 - type: nauc_map_at_20_diff1 value: 42.41625410983439 - type: nauc_map_at_20_max value: 26.907750306130733 - type: nauc_map_at_20_std value: 7.478967739706924 - type: nauc_map_at_3_diff1 value: 44.785788923058384 - type: nauc_map_at_3_max value: 27.412957229850438 - type: nauc_map_at_3_std value: 6.907258583517531 - type: nauc_map_at_5_diff1 value: 43.634053742171005 - type: nauc_map_at_5_max value: 27.311414645244174 - type: nauc_map_at_5_std value: 6.782368796408486 - type: nauc_mrr_at_1000_diff1 value: 40.121034147067355 - type: nauc_mrr_at_1000_max value: 26.418816188019484 - type: nauc_mrr_at_1000_std value: 11.036789931313589 - type: nauc_mrr_at_100_diff1 value: 40.09038771859193 - type: nauc_mrr_at_100_max value: 26.35109915559335 - type: nauc_mrr_at_100_std value: 11.004694419173386 - type: nauc_mrr_at_10_diff1 value: 40.70815905748883 - type: nauc_mrr_at_10_max value: 26.39730116006313 - type: nauc_mrr_at_10_std value: 10.795296410891202 - type: nauc_mrr_at_1_diff1 value: 49.49023740663914 - type: nauc_mrr_at_1_max value: 32.80752877856241 - type: nauc_mrr_at_1_std value: 9.182609293548452 - type: nauc_mrr_at_20_diff1 value: 40.09097766117321 - type: nauc_mrr_at_20_max value: 26.543696500831608 - type: nauc_mrr_at_20_std value: 11.045110550071236 - type: nauc_mrr_at_3_diff1 value: 42.547772290792786 - type: nauc_mrr_at_3_max value: 27.248503683439974 - type: nauc_mrr_at_3_std value: 11.12811144130018 - type: nauc_mrr_at_5_diff1 value: 41.182672458130945 - type: nauc_mrr_at_5_max value: 27.204022967551346 - type: nauc_mrr_at_5_std value: 10.736058227235059 - type: nauc_ndcg_at_1000_diff1 value: 38.283155226012525 - type: nauc_ndcg_at_1000_max value: 23.952454186870728 - type: nauc_ndcg_at_1000_std value: 11.202190633221258 - type: nauc_ndcg_at_100_diff1 value: 37.28326924063582 - type: nauc_ndcg_at_100_max value: 23.059861557232345 - type: nauc_ndcg_at_100_std value: 9.94550524440808 - type: nauc_ndcg_at_10_diff1 value: 39.63812221599438 - type: nauc_ndcg_at_10_max value: 24.35015593369919 - type: nauc_ndcg_at_10_std value: 9.315660164781054 - type: nauc_ndcg_at_1_diff1 value: 49.49023740663914 - type: nauc_ndcg_at_1_max value: 32.80752877856241 - type: nauc_ndcg_at_1_std value: 9.182609293548452 - type: nauc_ndcg_at_20_diff1 value: 37.63726489914318 - type: nauc_ndcg_at_20_max value: 24.728684570593007 - type: nauc_ndcg_at_20_std value: 9.986169134250208 - type: nauc_ndcg_at_3_diff1 value: 41.86142781421585 - type: nauc_ndcg_at_3_max value: 25.373436332199645 - type: nauc_ndcg_at_3_std value: 9.66682128586139 - type: 
nauc_ndcg_at_5_diff1 value: 40.642745287564594 - type: nauc_ndcg_at_5_max value: 25.56873621658099 - type: nauc_ndcg_at_5_std value: 9.25538178041856 - type: nauc_precision_at_1000_diff1 value: 11.480722649998393 - type: nauc_precision_at_1000_max value: 1.8213948061833445 - type: nauc_precision_at_1000_std value: 29.23515602956654 - type: nauc_precision_at_100_diff1 value: 14.18816101118032 - type: nauc_precision_at_100_max value: 2.440318670740079 - type: nauc_precision_at_100_std value: 29.24020499259622 - type: nauc_precision_at_10_diff1 value: 27.712287052106255 - type: nauc_precision_at_10_max value: 16.786789482138776 - type: nauc_precision_at_10_std value: 14.310510991471832 - type: nauc_precision_at_1_diff1 value: 49.49023740663914 - type: nauc_precision_at_1_max value: 32.80752877856241 - type: nauc_precision_at_1_std value: 9.182609293548452 - type: nauc_precision_at_20_diff1 value: 20.46872198920085 - type: nauc_precision_at_20_max value: 14.825240542929851 - type: nauc_precision_at_20_std value: 20.953665146043296 - type: nauc_precision_at_3_diff1 value: 36.03554983971536 - type: nauc_precision_at_3_max value: 21.854122073954194 - type: nauc_precision_at_3_std value: 13.04509621136731 - type: nauc_precision_at_5_diff1 value: 32.79763412951098 - type: nauc_precision_at_5_max value: 21.11796990161242 - type: nauc_precision_at_5_std value: 13.431327120495338 - type: nauc_recall_at_1000_diff1 value: 30.09802696990947 - type: nauc_recall_at_1000_max value: 13.40584644567289 - type: nauc_recall_at_1000_std value: 16.521370765894975 - type: nauc_recall_at_100_diff1 value: 26.309114191114602 - type: nauc_recall_at_100_max value: 13.350873360428366 - type: nauc_recall_at_100_std value: 11.078547445094047 - type: nauc_recall_at_10_diff1 value: 31.32014394352729 - type: nauc_recall_at_10_max value: 18.345182060137695 - type: nauc_recall_at_10_std value: 9.128692650287276 - type: nauc_recall_at_1_diff1 value: 52.69584045825562 - type: nauc_recall_at_1_max value: 32.26169513753074 - type: nauc_recall_at_1_std value: 6.952498233745584 - type: nauc_recall_at_20_diff1 value: 25.40389262415684 - type: nauc_recall_at_20_max value: 19.21175870928344 - type: nauc_recall_at_20_std value: 10.924171074066592 - type: nauc_recall_at_3_diff1 value: 38.07498529415478 - type: nauc_recall_at_3_max value: 21.675031784523334 - type: nauc_recall_at_3_std value: 7.885136540556627 - type: nauc_recall_at_5_diff1 value: 33.03739602855325 - type: nauc_recall_at_5_max value: 20.891017025098222 - type: nauc_recall_at_5_std value: 7.259719761129051 - type: ndcg_at_1 value: 14.822 - type: ndcg_at_10 value: 18.986 - type: ndcg_at_100 value: 22.996 - type: ndcg_at_1000 value: 26.569 - type: ndcg_at_20 value: 20.62 - type: ndcg_at_3 value: 16.778000000000002 - type: ndcg_at_5 value: 17.742 - type: precision_at_1 value: 14.822 - type: precision_at_10 value: 3.755 - type: precision_at_100 value: 0.8540000000000001 - type: precision_at_1000 value: 0.163 - type: precision_at_20 value: 2.4899999999999998 - type: precision_at_3 value: 8.235000000000001 - type: precision_at_5 value: 5.968 - type: recall_at_1 value: 11.219 - type: recall_at_10 value: 24.784 - type: recall_at_100 value: 43.143 - type: recall_at_1000 value: 68.416 - type: recall_at_20 value: 31.266 - type: recall_at_3 value: 17.607999999999997 - type: recall_at_5 value: 20.468 task: type: Retrieval - dataset: config: default name: MTEB CQADupstackWordpressRetrieval (default) revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4 split: test type: 
mteb/cqadupstack-wordpress metrics: - type: main_score value: 14.105 - type: map_at_1 value: 9.766 - type: map_at_10 value: 12.35 - type: map_at_100 value: 12.794 - type: map_at_1000 value: 12.876000000000001 - type: map_at_20 value: 12.548 - type: map_at_3 value: 11.583 - type: map_at_5 value: 11.855 - type: mrr_at_1 value: 10.35120147874307 - type: mrr_at_10 value: 13.323137634597895 - type: mrr_at_100 value: 13.8122389813538 - type: mrr_at_1000 value: 13.891191650266954 - type: mrr_at_20 value: 13.550088548700803 - type: mrr_at_3 value: 12.41528034504005 - type: mrr_at_5 value: 12.74799753542822 - type: nauc_map_at_1000_diff1 value: 30.214009272387493 - type: nauc_map_at_1000_max value: 27.100911874185957 - type: nauc_map_at_1000_std value: 4.556062715371813 - type: nauc_map_at_100_diff1 value: 30.283972909659536 - type: nauc_map_at_100_max value: 27.101751795355376 - type: nauc_map_at_100_std value: 4.530095632746722 - type: nauc_map_at_10_diff1 value: 30.703580851962275 - type: nauc_map_at_10_max value: 27.45889128777842 - type: nauc_map_at_10_std value: 4.056332236709348 - type: nauc_map_at_1_diff1 value: 38.44336021108366 - type: nauc_map_at_1_max value: 31.341289082946698 - type: nauc_map_at_1_std value: 5.249357458733503 - type: nauc_map_at_20_diff1 value: 30.50519884637743 - type: nauc_map_at_20_max value: 27.340643104548395 - type: nauc_map_at_20_std value: 4.165692308941953 - type: nauc_map_at_3_diff1 value: 32.38602261885505 - type: nauc_map_at_3_max value: 28.903602549949543 - type: nauc_map_at_3_std value: 3.5402281277974756 - type: nauc_map_at_5_diff1 value: 32.2685825283353 - type: nauc_map_at_5_max value: 28.485087249150176 - type: nauc_map_at_5_std value: 3.8418506057303445 - type: nauc_mrr_at_1000_diff1 value: 30.308168307291954 - type: nauc_mrr_at_1000_max value: 26.895198553568438 - type: nauc_mrr_at_1000_std value: 6.332711766194871 - type: nauc_mrr_at_100_diff1 value: 30.366219069831494 - type: nauc_mrr_at_100_max value: 26.88024956005868 - type: nauc_mrr_at_100_std value: 6.328345475093812 - type: nauc_mrr_at_10_diff1 value: 30.60181659497291 - type: nauc_mrr_at_10_max value: 27.33947661988829 - type: nauc_mrr_at_10_std value: 5.98212349517898 - type: nauc_mrr_at_1_diff1 value: 38.01665824488639 - type: nauc_mrr_at_1_max value: 31.273295508014538 - type: nauc_mrr_at_1_std value: 7.49596621052432 - type: nauc_mrr_at_20_diff1 value: 30.504642171833616 - type: nauc_mrr_at_20_max value: 27.093254296264142 - type: nauc_mrr_at_20_std value: 6.011940896215445 - type: nauc_mrr_at_3_diff1 value: 32.30298334779263 - type: nauc_mrr_at_3_max value: 28.46795259170204 - type: nauc_mrr_at_3_std value: 5.233276939737523 - type: nauc_mrr_at_5_diff1 value: 32.317520734292316 - type: nauc_mrr_at_5_max value: 28.31645764893187 - type: nauc_mrr_at_5_std value: 5.514394216402804 - type: nauc_ndcg_at_1000_diff1 value: 25.46804692303833 - type: nauc_ndcg_at_1000_max value: 24.577578434016004 - type: nauc_ndcg_at_1000_std value: 8.08099372903191 - type: nauc_ndcg_at_100_diff1 value: 25.7728600426837 - type: nauc_ndcg_at_100_max value: 23.852719795214735 - type: nauc_ndcg_at_100_std value: 7.271020641236757 - type: nauc_ndcg_at_10_diff1 value: 27.787864887098827 - type: nauc_ndcg_at_10_max value: 25.82070997315848 - type: nauc_ndcg_at_10_std value: 4.84958725429997 - type: nauc_ndcg_at_1_diff1 value: 38.01665824488639 - type: nauc_ndcg_at_1_max value: 31.273295508014538 - type: nauc_ndcg_at_1_std value: 7.49596621052432 - type: nauc_ndcg_at_20_diff1 value: 27.23687052702463 - type: 
nauc_ndcg_at_20_max value: 25.3030643349024 - type: nauc_ndcg_at_20_std value: 5.128184329356223 - type: nauc_ndcg_at_3_diff1 value: 30.94323024403614 - type: nauc_ndcg_at_3_max value: 28.112791463025488 - type: nauc_ndcg_at_3_std value: 3.4748257092667845 - type: nauc_ndcg_at_5_diff1 value: 30.979886062267525 - type: nauc_ndcg_at_5_max value: 27.832062407091833 - type: nauc_ndcg_at_5_std value: 4.066523891816962 - type: nauc_precision_at_1000_diff1 value: 13.717212581088436 - type: nauc_precision_at_1000_max value: 14.726337919465527 - type: nauc_precision_at_1000_std value: 19.286677279311952 - type: nauc_precision_at_100_diff1 value: 13.83440364507339 - type: nauc_precision_at_100_max value: 13.983610901499812 - type: nauc_precision_at_100_std value: 17.767107323199852 - type: nauc_precision_at_10_diff1 value: 18.989269379083463 - type: nauc_precision_at_10_max value: 20.291510121396815 - type: nauc_precision_at_10_std value: 8.518048232551553 - type: nauc_precision_at_1_diff1 value: 38.01665824488639 - type: nauc_precision_at_1_max value: 31.273295508014538 - type: nauc_precision_at_1_std value: 7.49596621052432 - type: nauc_precision_at_20_diff1 value: 18.381866045394073 - type: nauc_precision_at_20_max value: 18.90966326296592 - type: nauc_precision_at_20_std value: 9.141677018751377 - type: nauc_precision_at_3_diff1 value: 26.100613624838605 - type: nauc_precision_at_3_max value: 24.76218487581011 - type: nauc_precision_at_3_std value: 2.4322989886641495 - type: nauc_precision_at_5_diff1 value: 26.83172966704407 - type: nauc_precision_at_5_max value: 24.090343452479146 - type: nauc_precision_at_5_std value: 4.535854021501322 - type: nauc_recall_at_1000_diff1 value: 13.245456056842464 - type: nauc_recall_at_1000_max value: 19.61498051994092 - type: nauc_recall_at_1000_std value: 17.188990206491262 - type: nauc_recall_at_100_diff1 value: 14.025440613222711 - type: nauc_recall_at_100_max value: 15.06663046965985 - type: nauc_recall_at_100_std value: 12.610345211569749 - type: nauc_recall_at_10_diff1 value: 21.102550210495654 - type: nauc_recall_at_10_max value: 21.76066577972798 - type: nauc_recall_at_10_std value: 5.1852219341177115 - type: nauc_recall_at_1_diff1 value: 38.44336021108366 - type: nauc_recall_at_1_max value: 31.341289082946698 - type: nauc_recall_at_1_std value: 5.249357458733503 - type: nauc_recall_at_20_diff1 value: 19.281075192679307 - type: nauc_recall_at_20_max value: 20.050580691482935 - type: nauc_recall_at_20_std value: 5.836669306240979 - type: nauc_recall_at_3_diff1 value: 27.334543456325626 - type: nauc_recall_at_3_max value: 26.711101790009558 - type: nauc_recall_at_3_std value: 2.3329176939418037 - type: nauc_recall_at_5_diff1 value: 27.75488164284888 - type: nauc_recall_at_5_max value: 26.285171746330576 - type: nauc_recall_at_5_std value: 3.361376753158064 - type: ndcg_at_1 value: 10.351 - type: ndcg_at_10 value: 14.105 - type: ndcg_at_100 value: 16.765 - type: ndcg_at_1000 value: 19.220000000000002 - type: ndcg_at_20 value: 14.82 - type: ndcg_at_3 value: 12.398000000000001 - type: ndcg_at_5 value: 12.879999999999999 - type: precision_at_1 value: 10.351 - type: precision_at_10 value: 2.144 - type: precision_at_100 value: 0.373 - type: precision_at_1000 value: 0.062 - type: precision_at_20 value: 1.238 - type: precision_at_3 value: 5.114 - type: precision_at_5 value: 3.401 - type: recall_at_1 value: 9.766 - type: recall_at_10 value: 18.595 - type: recall_at_100 value: 31.669999999999998 - type: recall_at_1000 value: 50.659 - type: recall_at_20 value: 
21.248 - type: recall_at_3 value: 13.876 - type: recall_at_5 value: 15.015 task: type: Retrieval - dataset: config: default name: MTEB CUADAffiliateLicenseLicenseeLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 73.73737373737373 - type: ap value: 65.8818399825594 - type: ap_weighted value: 65.8818399825594 - type: f1 value: 72.61993404956918 - type: f1_weighted value: 72.61993404956918 - type: main_score value: 73.73737373737373 task: type: Classification - dataset: config: default name: MTEB CUADAffiliateLicenseLicensorLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 79.54545454545453 - type: ap value: 73.12252964426878 - type: ap_weighted value: 73.12252964426878 - type: f1 value: 79.53488372093022 - type: f1_weighted value: 79.53488372093024 - type: main_score value: 79.54545454545453 task: type: Classification - dataset: config: default name: MTEB CUADAntiAssignmentLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.64846416382251 - type: ap value: 63.215973012261415 - type: ap_weighted value: 63.215973012261415 - type: f1 value: 68.89855743269304 - type: f1_weighted value: 68.89855743269304 - type: main_score value: 70.64846416382251 task: type: Classification - dataset: config: default name: MTEB CUADAuditRightsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 60.44407894736842 - type: ap value: 57.470171721677076 - type: ap_weighted value: 57.470171721677076 - type: f1 value: 57.63732113071247 - type: f1_weighted value: 57.63732113071247 - type: main_score value: 60.44407894736842 task: type: Classification - dataset: config: default name: MTEB CUADCapOnLiabilityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 49.518459069020864 - type: ap value: 49.761431703402096 - type: ap_weighted value: 49.761431703402096 - type: f1 value: 49.48302433823829 - type: f1_weighted value: 49.48302433823827 - type: main_score value: 49.518459069020864 task: type: Classification - dataset: config: default name: MTEB CUADChangeOfControlLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 71.875 - type: ap value: 64.42982456140352 - type: ap_weighted value: 64.42982456140352 - type: f1 value: 70.87723707120934 - type: f1_weighted value: 70.8772370712093 - type: main_score value: 71.875 task: type: Classification - dataset: config: default name: MTEB CUADCompetitiveRestrictionExceptionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 53.181818181818194 - type: ap value: 51.65110565110565 - type: ap_weighted value: 51.65110565110565 - type: f1 value: 47.02513150204559 - type: f1_weighted value: 47.025131502045596 - type: main_score value: 53.181818181818194 task: type: Classification - dataset: config: default name: MTEB CUADCovenantNotToSueLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy 
value: 67.53246753246754 - type: ap value: 60.65974025974026 - type: ap_weighted value: 60.65974025974026 - type: f1 value: 64.03885671586028 - type: f1_weighted value: 64.03885671586026 - type: main_score value: 67.53246753246754 task: type: Classification - dataset: config: default name: MTEB CUADEffectiveDateLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 56.35593220338983 - type: ap value: 53.54749704375246 - type: ap_weighted value: 53.54749704375246 - type: f1 value: 56.26090868196132 - type: f1_weighted value: 56.26090868196131 - type: main_score value: 56.35593220338983 task: type: Classification - dataset: config: default name: MTEB CUADExclusivityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 61.154855643044606 - type: ap value: 56.35333840225783 - type: ap_weighted value: 56.35333840225783 - type: f1 value: 57.26109628910987 - type: f1_weighted value: 57.26109628910987 - type: main_score value: 61.154855643044606 task: type: Classification - dataset: config: default name: MTEB CUADExpirationDateLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 80.82191780821917 - type: ap value: 77.03374913905259 - type: ap_weighted value: 77.03374913905259 - type: f1 value: 80.66062530224343 - type: f1_weighted value: 80.66062530224343 - type: main_score value: 80.82191780821917 task: type: Classification - dataset: config: default name: MTEB CUADGoverningLawLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 92.12328767123289 - type: ap value: 88.44810149857499 - type: ap_weighted value: 88.44810149857499 - type: f1 value: 92.12245616092896 - type: f1_weighted value: 92.12245616092899 - type: main_score value: 92.12328767123289 task: type: Classification - dataset: config: default name: MTEB CUADIPOwnershipAssignmentLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 64.0625 - type: ap value: 59.78260869565217 - type: ap_weighted value: 59.78260869565217 - type: f1 value: 63.33748443337483 - type: f1_weighted value: 63.33748443337485 - type: main_score value: 64.0625 task: type: Classification - dataset: config: default name: MTEB CUADInsuranceLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 80.3883495145631 - type: ap value: 76.65387764650838 - type: ap_weighted value: 76.65387764650838 - type: f1 value: 80.20173184889143 - type: f1_weighted value: 80.20173184889143 - type: main_score value: 80.3883495145631 task: type: Classification - dataset: config: default name: MTEB CUADIrrevocableOrPerpetualLicenseLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 78.21428571428572 - type: ap value: 70.19711163153788 - type: ap_weighted value: 70.19711163153788 - type: f1 value: 77.68807722955938 - type: f1_weighted value: 77.6880772295594 - type: main_score value: 78.21428571428572 task: type: Classification - dataset: config: default name: MTEB 
CUADJointIPOwnershipLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 85.9375 - type: ap value: 79.55607476635514 - type: ap_weighted value: 79.55607476635514 - type: f1 value: 85.89119015866969 - type: f1_weighted value: 85.89119015866969 - type: main_score value: 85.9375 task: type: Classification - dataset: config: default name: MTEB CUADLicenseGrantLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 72.56446991404013 - type: ap value: 65.06701026209069 - type: ap_weighted value: 65.06701026209069 - type: f1 value: 71.72168495320604 - type: f1_weighted value: 71.72168495320604 - type: main_score value: 72.56446991404013 task: type: Classification - dataset: config: default name: MTEB CUADLiquidatedDamagesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 80.45454545454544 - type: ap value: 73.2605583392985 - type: ap_weighted value: 73.2605583392985 - type: f1 value: 80.33713703726801 - type: f1_weighted value: 80.33713703726798 - type: main_score value: 80.45454545454544 task: type: Classification - dataset: config: default name: MTEB CUADMinimumCommitmentLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 75.51813471502591 - type: ap value: 68.84511159342107 - type: ap_weighted value: 68.84511159342107 - type: f1 value: 75.48815213647933 - type: f1_weighted value: 75.48815213647931 - type: main_score value: 75.51813471502591 task: type: Classification - dataset: config: default name: MTEB CUADMostFavoredNationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 73.4375 - type: ap value: 65.80668604651162 - type: ap_weighted value: 65.80668604651162 - type: f1 value: 72.62893081761007 - type: f1_weighted value: 72.62893081761007 - type: main_score value: 73.4375 task: type: Classification - dataset: config: default name: MTEB CUADNoSolicitOfCustomersLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 82.14285714285714 - type: ap value: 73.68421052631578 - type: ap_weighted value: 73.68421052631578 - type: f1 value: 81.55467720685114 - type: f1_weighted value: 81.55467720685111 - type: main_score value: 82.14285714285714 task: type: Classification - dataset: config: default name: MTEB CUADNoSolicitOfEmployeesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 88.02816901408453 - type: ap value: 81.23742454728371 - type: ap_weighted value: 81.23742454728371 - type: f1 value: 87.92698174543636 - type: f1_weighted value: 87.92698174543636 - type: main_score value: 88.02816901408453 task: type: Classification - dataset: config: default name: MTEB CUADNonCompeteLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 53.84615384615385 - type: ap value: 52.05651491365778 - type: ap_weighted value: 52.05651491365778 - type: f1 value: 53.70967410723452 - type: f1_weighted value: 53.70967410723452 
- type: main_score value: 53.84615384615385 task: type: Classification - dataset: config: default name: MTEB CUADNonDisparagementLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 82.0 - type: ap value: 73.75757575757575 - type: ap_weighted value: 73.75757575757575 - type: f1 value: 81.5270935960591 - type: f1_weighted value: 81.5270935960591 - type: main_score value: 82.0 task: type: Classification - dataset: config: default name: MTEB CUADNonTransferableLicenseLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 72.69372693726936 - type: ap value: 68.36025144171039 - type: ap_weighted value: 68.36025144171039 - type: f1 value: 72.20320188509251 - type: f1_weighted value: 72.20320188509251 - type: main_score value: 72.69372693726936 task: type: Classification - dataset: config: default name: MTEB CUADNoticePeriodToTerminateRenewalLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 81.53153153153154 - type: ap value: 73.22254687119553 - type: ap_weighted value: 73.22254687119553 - type: f1 value: 81.003861003861 - type: f1_weighted value: 81.003861003861 - type: main_score value: 81.53153153153154 task: type: Classification - dataset: config: default name: MTEB CUADPostTerminationServicesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 59.52970297029702 - type: ap value: 55.494262149873045 - type: ap_weighted value: 55.494262149873045 - type: f1 value: 58.91289033889372 - type: f1_weighted value: 58.91289033889372 - type: main_score value: 59.52970297029702 task: type: Classification - dataset: config: default name: MTEB CUADPriceRestrictionsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 86.95652173913044 - type: ap value: 80.11272141706925 - type: ap_weighted value: 80.11272141706925 - type: f1 value: 86.85714285714286 - type: f1_weighted value: 86.85714285714286 - type: main_score value: 86.95652173913044 task: type: Classification - dataset: config: default name: MTEB CUADRenewalTermLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 81.86528497409327 - type: ap value: 74.56574832804549 - type: ap_weighted value: 74.56574832804549 - type: f1 value: 81.72348484848484 - type: f1_weighted value: 81.72348484848484 - type: main_score value: 81.86528497409327 task: type: Classification - dataset: config: default name: MTEB CUADRevenueProfitSharingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 78.9405684754522 - type: ap value: 75.88346617170725 - type: ap_weighted value: 75.88346617170725 - type: f1 value: 78.5609048595758 - type: f1_weighted value: 78.5609048595758 - type: main_score value: 78.9405684754522 task: type: Classification - dataset: config: default name: MTEB CUADRofrRofoRofnLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 67.53623188405797 - type: ap value: 
61.059567408520365 - type: ap_weighted value: 61.059567408520365 - type: f1 value: 66.55819428096656 - type: f1_weighted value: 66.55819428096656 - type: main_score value: 67.53623188405797 task: type: Classification - dataset: config: default name: MTEB CUADSourceCodeEscrowLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 79.66101694915253 - type: ap value: 73.06967984934086 - type: ap_weighted value: 73.06967984934086 - type: f1 value: 79.63761863675583 - type: f1_weighted value: 79.63761863675583 - type: main_score value: 79.66101694915253 task: type: Classification - dataset: config: default name: MTEB CUADTerminationForConvenienceLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 82.55813953488372 - type: ap value: 76.9289284938057 - type: ap_weighted value: 76.9289284938057 - type: f1 value: 82.5580452030568 - type: f1_weighted value: 82.55804520305684 - type: main_score value: 82.55813953488372 task: type: Classification - dataset: config: default name: MTEB CUADThirdPartyBeneficiaryLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 86.76470588235293 - type: ap value: 82.30837789661318 - type: ap_weighted value: 82.30837789661318 - type: f1 value: 86.76184295911746 - type: f1_weighted value: 86.76184295911744 - type: main_score value: 86.76470588235293 task: type: Classification - dataset: config: default name: MTEB CUADUncappedLiabilityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 78.91156462585033 - type: ap value: 70.63036269784295 - type: ap_weighted value: 70.63036269784295 - type: f1 value: 78.23054507237377 - type: f1_weighted value: 78.23054507237376 - type: main_score value: 78.91156462585033 task: type: Classification - dataset: config: default name: MTEB CUADUnlimitedAllYouCanEatLicenseLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 75.0 - type: ap value: 67.5 - type: ap_weighted value: 67.5 - type: f1 value: 74.60317460317461 - type: f1_weighted value: 74.60317460317461 - type: main_score value: 75.0 task: type: Classification - dataset: config: default name: MTEB CUADVolumeRestrictionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 68.32298136645963 - type: ap value: 67.47730530339226 - type: ap_weighted value: 67.47730530339226 - type: f1 value: 65.23267138078504 - type: f1_weighted value: 65.23267138078504 - type: main_score value: 68.32298136645963 task: type: Classification - dataset: config: default name: MTEB CUADWarrantyDurationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 77.18749999999999 - type: ap value: 70.84930981595093 - type: ap_weighted value: 70.84930981595093 - type: f1 value: 77.18549481888057 - type: f1_weighted value: 77.18549481888057 - type: main_score value: 77.18749999999999 task: type: Classification - dataset: config: default name: MTEB CanadaTaxCourtOutcomesLegalBenchClassification (default) revision: 
12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 45.90163934426229 - type: f1 value: 41.86755057433674 - type: f1_weighted value: 52.49140373560517 - type: main_score value: 45.90163934426229 task: type: Classification - dataset: config: default name: MTEB ClimateFEVER (default) revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380 split: test type: mteb/climate-fever metrics: - type: main_score value: 5.558 - type: map_at_1 value: 2.099 - type: map_at_10 value: 3.6790000000000003 - type: map_at_100 value: 4.021 - type: map_at_1000 value: 4.083 - type: map_at_20 value: 3.843 - type: map_at_3 value: 3.107 - type: map_at_5 value: 3.398 - type: mrr_at_1 value: 4.364820846905538 - type: mrr_at_10 value: 7.478723954293985 - type: mrr_at_100 value: 8.041420875649584 - type: mrr_at_1000 value: 8.120754871238086 - type: mrr_at_20 value: 7.760020669319687 - type: mrr_at_3 value: 6.438653637350702 - type: mrr_at_5 value: 7.028230184581975 - type: nauc_map_at_1000_diff1 value: 26.989583880363355 - type: nauc_map_at_1000_max value: 19.651932768180743 - type: nauc_map_at_1000_std value: 28.682949493303113 - type: nauc_map_at_100_diff1 value: 27.123176019982058 - type: nauc_map_at_100_max value: 19.598769909181605 - type: nauc_map_at_100_std value: 28.431702256094276 - type: nauc_map_at_10_diff1 value: 28.090105463174243 - type: nauc_map_at_10_max value: 19.316825624764327 - type: nauc_map_at_10_std value: 27.879940536760657 - type: nauc_map_at_1_diff1 value: 38.86635884960338 - type: nauc_map_at_1_max value: 23.66935741341746 - type: nauc_map_at_1_std value: 25.594810836643088 - type: nauc_map_at_20_diff1 value: 27.932097656688153 - type: nauc_map_at_20_max value: 19.705436224378094 - type: nauc_map_at_20_std value: 28.005161889024915 - type: nauc_map_at_3_diff1 value: 31.343508506514787 - type: nauc_map_at_3_max value: 17.617676175693653 - type: nauc_map_at_3_std value: 27.372138781240235 - type: nauc_map_at_5_diff1 value: 29.21950281006726 - type: nauc_map_at_5_max value: 18.039174755804527 - type: nauc_map_at_5_std value: 26.278075304640147 - type: nauc_mrr_at_1000_diff1 value: 21.017635057347793 - type: nauc_mrr_at_1000_max value: 20.84007387790555 - type: nauc_mrr_at_1000_std value: 24.684523933084744 - type: nauc_mrr_at_100_diff1 value: 21.051698171004 - type: nauc_mrr_at_100_max value: 20.79459868740917 - type: nauc_mrr_at_100_std value: 24.62077347403019 - type: nauc_mrr_at_10_diff1 value: 21.926692626233184 - type: nauc_mrr_at_10_max value: 20.868215747512338 - type: nauc_mrr_at_10_std value: 24.10229968572614 - type: nauc_mrr_at_1_diff1 value: 32.12007148649377 - type: nauc_mrr_at_1_max value: 25.428643110489634 - type: nauc_mrr_at_1_std value: 19.946229629460547 - type: nauc_mrr_at_20_diff1 value: 21.617935715645125 - type: nauc_mrr_at_20_max value: 21.046484288936377 - type: nauc_mrr_at_20_std value: 24.297367370651244 - type: nauc_mrr_at_3_diff1 value: 24.094623370861303 - type: nauc_mrr_at_3_max value: 19.713811945549196 - type: nauc_mrr_at_3_std value: 23.568839477173757 - type: nauc_mrr_at_5_diff1 value: 22.3010395396166 - type: nauc_mrr_at_5_max value: 20.569180907488864 - type: nauc_mrr_at_5_std value: 23.15568498862624 - type: nauc_ndcg_at_1000_diff1 value: 17.73440786298746 - type: nauc_ndcg_at_1000_max value: 21.164734898511266 - type: nauc_ndcg_at_1000_std value: 32.20409116224434 - type: nauc_ndcg_at_100_diff1 value: 19.491657641927414 - type: nauc_ndcg_at_100_max value: 19.73425182329514 - type: nauc_ndcg_at_100_std value: 
29.633697891721162 - type: nauc_ndcg_at_10_diff1 value: 23.236666416810397 - type: nauc_ndcg_at_10_max value: 19.859686062177957 - type: nauc_ndcg_at_10_std value: 27.607123060751103 - type: nauc_ndcg_at_1_diff1 value: 32.12007148649377 - type: nauc_ndcg_at_1_max value: 25.428643110489634 - type: nauc_ndcg_at_1_std value: 19.946229629460547 - type: nauc_ndcg_at_20_diff1 value: 22.766492789770794 - type: nauc_ndcg_at_20_max value: 20.68653243447615 - type: nauc_ndcg_at_20_std value: 27.80598558578259 - type: nauc_ndcg_at_3_diff1 value: 26.430176145767764 - type: nauc_ndcg_at_3_max value: 17.178786585572514 - type: nauc_ndcg_at_3_std value: 26.551392559385945 - type: nauc_ndcg_at_5_diff1 value: 24.359838503352492 - type: nauc_ndcg_at_5_max value: 18.139249994062958 - type: nauc_ndcg_at_5_std value: 25.04579441208386 - type: nauc_precision_at_1000_diff1 value: 3.5941753705590855 - type: nauc_precision_at_1000_max value: 23.295418071068074 - type: nauc_precision_at_1000_std value: 37.823737794558035 - type: nauc_precision_at_100_diff1 value: 7.711362755764835 - type: nauc_precision_at_100_max value: 21.000892665907962 - type: nauc_precision_at_100_std value: 35.56596455340648 - type: nauc_precision_at_10_diff1 value: 14.603402002580449 - type: nauc_precision_at_10_max value: 22.112935744796918 - type: nauc_precision_at_10_std value: 30.665912790934176 - type: nauc_precision_at_1_diff1 value: 32.12007148649377 - type: nauc_precision_at_1_max value: 25.428643110489634 - type: nauc_precision_at_1_std value: 19.946229629460547 - type: nauc_precision_at_20_diff1 value: 14.716417574100266 - type: nauc_precision_at_20_max value: 23.926389785704096 - type: nauc_precision_at_20_std value: 30.69168946837732 - type: nauc_precision_at_3_diff1 value: 18.67632522519008 - type: nauc_precision_at_3_max value: 15.461714107477059 - type: nauc_precision_at_3_std value: 24.408621037612654 - type: nauc_precision_at_5_diff1 value: 14.433484685750017 - type: nauc_precision_at_5_max value: 18.682282289432337 - type: nauc_precision_at_5_std value: 24.03615092175192 - type: nauc_recall_at_1000_diff1 value: 7.5569286948470955 - type: nauc_recall_at_1000_max value: 18.988365246129565 - type: nauc_recall_at_1000_std value: 32.73921563811838 - type: nauc_recall_at_100_diff1 value: 12.11778715469688 - type: nauc_recall_at_100_max value: 16.608390547005357 - type: nauc_recall_at_100_std value: 29.88269190630321 - type: nauc_recall_at_10_diff1 value: 20.008263704255814 - type: nauc_recall_at_10_max value: 19.07669508851797 - type: nauc_recall_at_10_std value: 28.95827325426037 - type: nauc_recall_at_1_diff1 value: 38.86635884960338 - type: nauc_recall_at_1_max value: 23.66935741341746 - type: nauc_recall_at_1_std value: 25.594810836643088 - type: nauc_recall_at_20_diff1 value: 19.54693652826011 - type: nauc_recall_at_20_max value: 20.582517703572815 - type: nauc_recall_at_20_std value: 28.52204311008764 - type: nauc_recall_at_3_diff1 value: 25.95757457673112 - type: nauc_recall_at_3_max value: 13.802011828871594 - type: nauc_recall_at_3_std value: 28.160988060479163 - type: nauc_recall_at_5_diff1 value: 21.718874199874673 - type: nauc_recall_at_5_max value: 15.812170162395233 - type: nauc_recall_at_5_std value: 24.970427791223297 - type: ndcg_at_1 value: 4.365 - type: ndcg_at_10 value: 5.558 - type: ndcg_at_100 value: 7.637 - type: ndcg_at_1000 value: 9.700000000000001 - type: ndcg_at_20 value: 6.215 - type: ndcg_at_3 value: 4.314 - type: ndcg_at_5 value: 4.795 - type: precision_at_1 value: 4.365 - type: precision_at_10 
value: 1.6740000000000002 - type: precision_at_100 value: 0.384 - type: precision_at_1000 value: 0.076 - type: precision_at_20 value: 1.111 - type: precision_at_3 value: 3.084 - type: precision_at_5 value: 2.423 - type: recall_at_1 value: 2.099 - type: recall_at_10 value: 7.371999999999999 - type: recall_at_100 value: 14.976999999999999 - type: recall_at_1000 value: 27.328000000000003 - type: recall_at_20 value: 9.288 - type: recall_at_3 value: 4.299 - type: recall_at_5 value: 5.509 task: type: Retrieval - dataset: config: default name: MTEB ContractNLIConfidentialityOfAgreementLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 64.63414634146342 - type: ap value: 59.62772785622593 - type: ap_weighted value: 59.62772785622593 - type: f1 value: 64.58674609084142 - type: f1_weighted value: 64.58674609084142 - type: main_score value: 64.63414634146342 task: type: Classification - dataset: config: default name: MTEB ContractNLIExplicitIdentificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 56.88073394495412 - type: ap value: 21.457096600107935 - type: ap_weighted value: 21.457096600107935 - type: f1 value: 50.91501389288109 - type: f1_weighted value: 61.74750556638211 - type: main_score value: 56.88073394495412 task: type: Classification - dataset: config: default name: MTEB ContractNLIInclusionOfVerballyConveyedInformationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 60.431654676258994 - type: ap value: 55.25139990309542 - type: ap_weighted value: 55.25139990309542 - type: f1 value: 60.4234611999793 - type: f1_weighted value: 60.435751414398844 - type: main_score value: 60.431654676258994 task: type: Classification - dataset: config: default name: MTEB ContractNLILimitedUseLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 73.07692307692307 - type: ap value: 63.954526895988565 - type: ap_weighted value: 63.954526895988565 - type: f1 value: 73.01454916133815 - type: f1_weighted value: 73.10187264315704 - type: main_score value: 73.07692307692307 task: type: Classification - dataset: config: default name: MTEB ContractNLINoLicensingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 82.09876543209876 - type: ap value: 75.19529587058324 - type: ap_weighted value: 75.19529587058324 - type: f1 value: 82.08169647965215 - type: f1_weighted value: 82.0748688986735 - type: main_score value: 82.09876543209876 task: type: Classification - dataset: config: default name: MTEB ContractNLINoticeOnCompelledDisclosureLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 78.87323943661971 - type: ap value: 72.12365099689045 - type: ap_weighted value: 72.12365099689045 - type: f1 value: 78.83545310015897 - type: f1_weighted value: 78.83545310015897 - type: main_score value: 78.87323943661971 task: type: Classification - dataset: config: default name: MTEB ContractNLIPermissibleAcquirementOfSimilarInformationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 
split: test type: nguha/legalbench metrics: - type: accuracy value: 72.47191011235954 - type: ap value: 64.74719101123597 - type: ap_weighted value: 64.74719101123597 - type: f1 value: 71.08377813877931 - type: f1_weighted value: 71.08377813877931 - type: main_score value: 72.47191011235954 task: type: Classification - dataset: config: default name: MTEB ContractNLIPermissibleCopyLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 41.379310344827594 - type: ap value: 19.168356997971607 - type: ap_weighted value: 19.168356997971607 - type: f1 value: 38.75776397515528 - type: f1_weighted value: 46.18547868922682 - type: main_score value: 41.379310344827594 task: type: Classification - dataset: config: default name: MTEB ContractNLIPermissibleDevelopmentOfSimilarInformationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 71.3235294117647 - type: ap value: 65.14279624893436 - type: ap_weighted value: 65.14279624893436 - type: f1 value: 71.3219789132198 - type: f1_weighted value: 71.3219789132198 - type: main_score value: 71.3235294117647 task: type: Classification - dataset: config: default name: MTEB ContractNLIPermissiblePostAgreementPossessionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 39.63963963963964 - type: ap value: 25.290389847351868 - type: ap_weighted value: 25.290389847351868 - type: f1 value: 39.56115400243804 - type: f1_weighted value: 40.64033151396011 - type: main_score value: 39.63963963963964 task: type: Classification - dataset: config: default name: MTEB ContractNLIReturnOfConfidentialInformationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 71.21212121212122 - type: ap value: 63.13978196600149 - type: ap_weighted value: 63.13978196600149 - type: f1 value: 70.88460645460877 - type: f1_weighted value: 70.7910308096052 - type: main_score value: 71.21212121212122 task: type: Classification - dataset: config: default name: MTEB ContractNLISharingWithEmployeesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 73.52941176470588 - type: ap value: 66.24576478752499 - type: ap_weighted value: 66.24576478752499 - type: f1 value: 71.13098607494621 - type: f1_weighted value: 71.42467085328414 - type: main_score value: 73.52941176470588 task: type: Classification - dataset: config: default name: MTEB ContractNLISharingWithThirdPartiesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 68.88888888888889 - type: ap value: 51.569719636083924 - type: ap_weighted value: 51.569719636083924 - type: f1 value: 66.28762541806019 - type: f1_weighted value: 68.26458565589 - type: main_score value: 68.88888888888889 task: type: Classification - dataset: config: default name: MTEB ContractNLISurvivalOfObligationsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 49.044585987261144 - type: ap value: 47.085151843488305 - type: ap_weighted value: 47.085151843488305 - type: f1 value: 
48.28722002635046 - type: f1_weighted value: 47.92846772907698 - type: main_score value: 49.044585987261144 task: type: Classification - dataset: config: default name: MTEB CorporateLobbyingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.40816326530613 - type: ap value: 29.59183673469388 - type: ap_weighted value: 29.59183673469388 - type: f1 value: 41.31736526946107 - type: f1_weighted value: 58.181595991690074 - type: main_score value: 70.40816326530613 task: type: Classification - dataset: config: default name: MTEB CyrillicTurkicLangClassification (default) revision: e42d330f33d65b7b72dfd408883daf1661f06f18 split: test type: tatiana-merz/cyrillic_turkic_langs metrics: - type: accuracy value: 61.19140625 - type: f1 value: 59.377085898563365 - type: f1_weighted value: 59.385881195883925 - type: main_score value: 61.19140625 task: type: Classification - dataset: config: default name: MTEB DBPedia (default) revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 split: dev type: mteb/dbpedia metrics: - type: main_score value: 7.161 - type: map_at_1 value: 0.599 - type: map_at_10 value: 2.243 - type: map_at_100 value: 3.1189999999999998 - type: map_at_1000 value: 3.488 - type: map_at_20 value: 2.522 - type: map_at_3 value: 1.397 - type: map_at_5 value: 1.951 - type: mrr_at_1 value: 8.955223880597014 - type: mrr_at_10 value: 18.287728026533994 - type: mrr_at_100 value: 18.978113584928742 - type: mrr_at_1000 value: 19.053758841865573 - type: mrr_at_20 value: 18.61199952617863 - type: mrr_at_3 value: 14.676616915422885 - type: mrr_at_5 value: 17.06467661691542 - type: nauc_map_at_1000_diff1 value: -2.930033724497058 - type: nauc_map_at_1000_max value: 3.5995430754716904 - type: nauc_map_at_1000_std value: 5.61203479120595 - type: nauc_map_at_100_diff1 value: -5.4531441891668795 - type: nauc_map_at_100_max value: -0.0055832626529105185 - type: nauc_map_at_100_std value: 3.439773391163607 - type: nauc_map_at_10_diff1 value: -14.3319757103363 - type: nauc_map_at_10_max value: -9.021024411612359 - type: nauc_map_at_10_std value: 1.0275253768638628 - type: nauc_map_at_1_diff1 value: 22.607506151253776 - type: nauc_map_at_1_max value: 10.921408762597743 - type: nauc_map_at_1_std value: -2.0177080867009054 - type: nauc_map_at_20_diff1 value: -11.794157692538237 - type: nauc_map_at_20_max value: -6.44484538876576 - type: nauc_map_at_20_std value: 1.039851694368717 - type: nauc_map_at_3_diff1 value: -7.469347804676409 - type: nauc_map_at_3_max value: -5.393936026725367 - type: nauc_map_at_3_std value: 9.280689460783249 - type: nauc_map_at_5_diff1 value: -15.955321054747321 - type: nauc_map_at_5_max value: -9.855092671604572 - type: nauc_map_at_5_std value: 0.06180279408320787 - type: nauc_mrr_at_1000_diff1 value: -2.821396337906413 - type: nauc_mrr_at_1000_max value: 5.972877383405757 - type: nauc_mrr_at_1000_std value: -1.6896049835004336 - type: nauc_mrr_at_100_diff1 value: -2.8632536639982105 - type: nauc_mrr_at_100_max value: 5.973020236396294 - type: nauc_mrr_at_100_std value: -1.809958349128643 - type: nauc_mrr_at_10_diff1 value: -4.515463799529893 - type: nauc_mrr_at_10_max value: 5.030384515417533 - type: nauc_mrr_at_10_std value: -1.547480529694615 - type: nauc_mrr_at_1_diff1 value: 8.719512377821816 - type: nauc_mrr_at_1_max value: 16.272382792823382 - type: nauc_mrr_at_1_std value: -3.187491782487964 - type: nauc_mrr_at_20_diff1 value: -2.908929872190089 - type: nauc_mrr_at_20_max value: 
6.58409584409903 - type: nauc_mrr_at_20_std value: -1.1174417761572792 - type: nauc_mrr_at_3_diff1 value: -1.6595580931793985 - type: nauc_mrr_at_3_max value: 9.640215787928428 - type: nauc_mrr_at_3_std value: 2.889288978742377 - type: nauc_mrr_at_5_diff1 value: -6.89298539225687 - type: nauc_mrr_at_5_max value: 6.578043390443974 - type: nauc_mrr_at_5_std value: -0.6581933130437475 - type: nauc_ndcg_at_1000_diff1 value: 3.75625342513744 - type: nauc_ndcg_at_1000_max value: 6.952585708583143 - type: nauc_ndcg_at_1000_std value: 5.400684775811628 - type: nauc_ndcg_at_100_diff1 value: -2.242186789473446 - type: nauc_ndcg_at_100_max value: 1.7125259047701242 - type: nauc_ndcg_at_100_std value: -0.6824733710981048 - type: nauc_ndcg_at_10_diff1 value: -11.969827974466098 - type: nauc_ndcg_at_10_max value: -4.424965429405649 - type: nauc_ndcg_at_10_std value: 0.03592313276976773 - type: nauc_ndcg_at_1_diff1 value: -4.197220327746547 - type: nauc_ndcg_at_1_max value: 9.247135683163954 - type: nauc_ndcg_at_1_std value: -6.671985136155276 - type: nauc_ndcg_at_20_diff1 value: -8.358422632396593 - type: nauc_ndcg_at_20_max value: -1.0551974757194074 - type: nauc_ndcg_at_20_std value: 2.0508581550409524 - type: nauc_ndcg_at_3_diff1 value: -7.53212458402589 - type: nauc_ndcg_at_3_max value: 3.6347588818172336 - type: nauc_ndcg_at_3_std value: 5.073680163820697 - type: nauc_ndcg_at_5_diff1 value: -17.183713921651613 - type: nauc_ndcg_at_5_max value: -2.598662858319381 - type: nauc_ndcg_at_5_std value: -0.4734708395726036 - type: nauc_precision_at_1000_diff1 value: 22.034829237918075 - type: nauc_precision_at_1000_max value: 29.133045600628414 - type: nauc_precision_at_1000_std value: 22.48207630228867 - type: nauc_precision_at_100_diff1 value: 22.17246050117164 - type: nauc_precision_at_100_max value: 25.497860199414003 - type: nauc_precision_at_100_std value: 14.10941839109608 - type: nauc_precision_at_10_diff1 value: -2.3976462009254527 - type: nauc_precision_at_10_max value: 3.2185747947259737 - type: nauc_precision_at_10_std value: 1.1160090019272848 - type: nauc_precision_at_1_diff1 value: 8.719512377821816 - type: nauc_precision_at_1_max value: 16.272382792823382 - type: nauc_precision_at_1_std value: -3.187491782487964 - type: nauc_precision_at_20_diff1 value: 8.125877087406765 - type: nauc_precision_at_20_max value: 14.004634012058606 - type: nauc_precision_at_20_std value: 6.076987698320296 - type: nauc_precision_at_3_diff1 value: -5.415944490965941 - type: nauc_precision_at_3_max value: 6.0110244505222 - type: nauc_precision_at_3_std value: 6.0205421596952675 - type: nauc_precision_at_5_diff1 value: -19.55829195099795 - type: nauc_precision_at_5_max value: -2.3847548504000993 - type: nauc_precision_at_5_std value: -4.296125770063572 - type: nauc_recall_at_1000_diff1 value: 5.793923275597914 - type: nauc_recall_at_1000_max value: 2.365078190964481 - type: nauc_recall_at_1000_std value: 3.5546888704254744 - type: nauc_recall_at_100_diff1 value: 1.652314810086157 - type: nauc_recall_at_100_max value: 1.2466358966197024 - type: nauc_recall_at_100_std value: -5.516640557428562 - type: nauc_recall_at_10_diff1 value: -18.83385802183443 - type: nauc_recall_at_10_max value: -15.04302952000884 - type: nauc_recall_at_10_std value: -0.9615025531726922 - type: nauc_recall_at_1_diff1 value: 22.607506151253776 - type: nauc_recall_at_1_max value: 10.921408762597743 - type: nauc_recall_at_1_std value: -2.0177080867009054 - type: nauc_recall_at_20_diff1 value: -8.960549697900921 - type: nauc_recall_at_20_max 
value: -6.8364201397227164 - type: nauc_recall_at_20_std value: -1.2091707122721411 - type: nauc_recall_at_3_diff1 value: -17.196135512311084 - type: nauc_recall_at_3_max value: -10.816815002699384 - type: nauc_recall_at_3_std value: 12.535755202753904 - type: nauc_recall_at_5_diff1 value: -23.856486271404066 - type: nauc_recall_at_5_max value: -13.129773406696268 - type: nauc_recall_at_5_std value: -2.885196394596191 - type: ndcg_at_1 value: 6.715999999999999 - type: ndcg_at_10 value: 7.161 - type: ndcg_at_100 value: 9.506 - type: ndcg_at_1000 value: 14.194 - type: ndcg_at_20 value: 6.969 - type: ndcg_at_3 value: 7.285 - type: ndcg_at_5 value: 7.436 - type: precision_at_1 value: 8.955 - type: precision_at_10 value: 6.866 - type: precision_at_100 value: 2.343 - type: precision_at_1000 value: 0.557 - type: precision_at_20 value: 5.0 - type: precision_at_3 value: 9.453 - type: precision_at_5 value: 8.955 - type: recall_at_1 value: 0.599 - type: recall_at_10 value: 5.234 - type: recall_at_100 value: 14.610999999999999 - type: recall_at_1000 value: 31.723000000000003 - type: recall_at_20 value: 6.797000000000001 - type: recall_at_3 value: 2.1239999999999997 - type: recall_at_5 value: 3.836 task: type: Retrieval - dataset: config: default name: MTEB DBPedia (default) revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659 split: test type: mteb/dbpedia metrics: - type: main_score value: 9.612 - type: map_at_1 value: 1.5150000000000001 - type: map_at_10 value: 3.324 - type: map_at_100 value: 4.593 - type: map_at_1000 value: 4.942 - type: map_at_20 value: 3.775 - type: map_at_3 value: 2.349 - type: map_at_5 value: 2.83 - type: mrr_at_1 value: 17.75 - type: mrr_at_10 value: 25.455257936507948 - type: mrr_at_100 value: 26.384386588195795 - type: mrr_at_1000 value: 26.43428730177263 - type: mrr_at_20 value: 26.012663071147983 - type: mrr_at_3 value: 22.916666666666668 - type: mrr_at_5 value: 24.42916666666667 - type: nauc_map_at_1000_diff1 value: 22.13041079857 - type: nauc_map_at_1000_max value: 30.847169046279717 - type: nauc_map_at_1000_std value: 26.662372161640164 - type: nauc_map_at_100_diff1 value: 22.33437365695696 - type: nauc_map_at_100_max value: 30.631982988659413 - type: nauc_map_at_100_std value: 24.343041349757826 - type: nauc_map_at_10_diff1 value: 24.027517719649303 - type: nauc_map_at_10_max value: 25.07712884251914 - type: nauc_map_at_10_std value: 13.947979384184976 - type: nauc_map_at_1_diff1 value: 36.83267850021598 - type: nauc_map_at_1_max value: 19.169430946850284 - type: nauc_map_at_1_std value: 9.884774862276792 - type: nauc_map_at_20_diff1 value: 23.514668795309415 - type: nauc_map_at_20_max value: 27.504950445908978 - type: nauc_map_at_20_std value: 17.094975030047124 - type: nauc_map_at_3_diff1 value: 26.34278610573698 - type: nauc_map_at_3_max value: 20.845843284715972 - type: nauc_map_at_3_std value: 7.67049397964597 - type: nauc_map_at_5_diff1 value: 25.7750795640811 - type: nauc_map_at_5_max value: 22.947480091712098 - type: nauc_map_at_5_std value: 11.721230195408548 - type: nauc_mrr_at_1000_diff1 value: 22.232372488450842 - type: nauc_mrr_at_1000_max value: 27.572890316358283 - type: nauc_mrr_at_1000_std value: 16.214637981707586 - type: nauc_mrr_at_100_diff1 value: 22.236444609236038 - type: nauc_mrr_at_100_max value: 27.58760243571819 - type: nauc_mrr_at_100_std value: 16.244413870712897 - type: nauc_mrr_at_10_diff1 value: 22.225463768969977 - type: nauc_mrr_at_10_max value: 28.085279372515014 - type: nauc_mrr_at_10_std value: 16.63553736106648 - type: 
nauc_mrr_at_1_diff1 value: 29.84035077607877 - type: nauc_mrr_at_1_max value: 29.694489641199347 - type: nauc_mrr_at_1_std value: 13.521637546163495 - type: nauc_mrr_at_20_diff1 value: 22.04153237789325 - type: nauc_mrr_at_20_max value: 27.694203519607907 - type: nauc_mrr_at_20_std value: 16.41753082494305 - type: nauc_mrr_at_3_diff1 value: 23.699732601185406 - type: nauc_mrr_at_3_max value: 28.552272889924087 - type: nauc_mrr_at_3_std value: 15.054097838038286 - type: nauc_mrr_at_5_diff1 value: 23.127326455282443 - type: nauc_mrr_at_5_max value: 28.769272111978832 - type: nauc_mrr_at_5_std value: 16.113310297737975 - type: nauc_ndcg_at_1000_diff1 value: 19.30064409197478 - type: nauc_ndcg_at_1000_max value: 28.102160223624878 - type: nauc_ndcg_at_1000_std value: 30.203518553202162 - type: nauc_ndcg_at_100_diff1 value: 18.61374183566408 - type: nauc_ndcg_at_100_max value: 26.626236693773404 - type: nauc_ndcg_at_100_std value: 25.742758699186076 - type: nauc_ndcg_at_10_diff1 value: 22.519496459830016 - type: nauc_ndcg_at_10_max value: 29.403797316052678 - type: nauc_ndcg_at_10_std value: 20.893386965358616 - type: nauc_ndcg_at_1_diff1 value: 32.866635298438084 - type: nauc_ndcg_at_1_max value: 26.59719751655438 - type: nauc_ndcg_at_1_std value: 11.114394574061539 - type: nauc_ndcg_at_20_diff1 value: 21.157000991633115 - type: nauc_ndcg_at_20_max value: 27.740565719664534 - type: nauc_ndcg_at_20_std value: 21.639809971682443 - type: nauc_ndcg_at_3_diff1 value: 25.11861929994868 - type: nauc_ndcg_at_3_max value: 30.05796948174576 - type: nauc_ndcg_at_3_std value: 15.558218990994382 - type: nauc_ndcg_at_5_diff1 value: 23.56633730677446 - type: nauc_ndcg_at_5_max value: 29.407157319632233 - type: nauc_ndcg_at_5_std value: 18.567271816504054 - type: nauc_precision_at_1000_diff1 value: 15.34548548807785 - type: nauc_precision_at_1000_max value: 10.572226641262324 - type: nauc_precision_at_1000_std value: 29.1034314360236 - type: nauc_precision_at_100_diff1 value: 15.716430228733962 - type: nauc_precision_at_100_max value: 29.095076486854232 - type: nauc_precision_at_100_std value: 38.5066690028862 - type: nauc_precision_at_10_diff1 value: 19.68952528017596 - type: nauc_precision_at_10_max value: 36.890169328577436 - type: nauc_precision_at_10_std value: 30.965796095297055 - type: nauc_precision_at_1_diff1 value: 29.84035077607877 - type: nauc_precision_at_1_max value: 29.694489641199347 - type: nauc_precision_at_1_std value: 13.521637546163495 - type: nauc_precision_at_20_diff1 value: 18.030808015274253 - type: nauc_precision_at_20_max value: 37.61603054850129 - type: nauc_precision_at_20_std value: 34.160861586371816 - type: nauc_precision_at_3_diff1 value: 20.899695298609572 - type: nauc_precision_at_3_max value: 35.736648108449906 - type: nauc_precision_at_3_std value: 21.012939343933635 - type: nauc_precision_at_5_diff1 value: 20.038574686656855 - type: nauc_precision_at_5_max value: 37.244225604024464 - type: nauc_precision_at_5_std value: 27.105877764557317 - type: nauc_recall_at_1000_diff1 value: 7.621037010770166 - type: nauc_recall_at_1000_max value: 14.556069262959875 - type: nauc_recall_at_1000_std value: 24.912834855259458 - type: nauc_recall_at_100_diff1 value: 5.640854515267624 - type: nauc_recall_at_100_max value: 12.319243091931583 - type: nauc_recall_at_100_std value: 18.20593364111766 - type: nauc_recall_at_10_diff1 value: 9.625612977495116 - type: nauc_recall_at_10_max value: 17.05920473206263 - type: nauc_recall_at_10_std value: 10.7221437835498 - type: nauc_recall_at_1_diff1 
value: 36.83267850021598 - type: nauc_recall_at_1_max value: 19.169430946850284 - type: nauc_recall_at_1_std value: 9.884774862276792 - type: nauc_recall_at_20_diff1 value: 8.05059067573258 - type: nauc_recall_at_20_max value: 15.8154139120262 - type: nauc_recall_at_20_std value: 12.679202204644218 - type: nauc_recall_at_3_diff1 value: 16.446191987706968 - type: nauc_recall_at_3_max value: 16.891019665567892 - type: nauc_recall_at_3_std value: 5.902427268316366 - type: nauc_recall_at_5_diff1 value: 16.441740431697145 - type: nauc_recall_at_5_max value: 18.339945932093187 - type: nauc_recall_at_5_std value: 11.244004704766795 - type: ndcg_at_1 value: 13.0 - type: ndcg_at_10 value: 9.612 - type: ndcg_at_100 value: 11.403 - type: ndcg_at_1000 value: 15.142 - type: ndcg_at_20 value: 9.419 - type: ndcg_at_3 value: 10.821 - type: ndcg_at_5 value: 10.462 - type: precision_at_1 value: 17.75 - type: precision_at_10 value: 9.15 - type: precision_at_100 value: 3.0 - type: precision_at_1000 value: 0.716 - type: precision_at_20 value: 6.763 - type: precision_at_3 value: 13.417000000000002 - type: precision_at_5 value: 12.35 - type: recall_at_1 value: 1.5150000000000001 - type: recall_at_10 value: 5.858 - type: recall_at_100 value: 15.643 - type: recall_at_1000 value: 28.51 - type: recall_at_20 value: 8.25 - type: recall_at_3 value: 2.995 - type: recall_at_5 value: 4.117 task: type: Retrieval - dataset: config: default name: MTEB DBpediaClassification (default) revision: 9abd46cf7fc8b4c64290f26993c540b92aa145ac split: test type: fancyzhx/dbpedia_14 metrics: - type: accuracy value: 79.6484375 - type: f1 value: 78.34279956840108 - type: f1_weighted value: 78.35088313144212 - type: main_score value: 79.6484375 task: type: Classification - dataset: config: default name: MTEB DefinitionClassificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 84.51757666417352 - type: ap value: 80.76707736262222 - type: ap_weighted value: 80.76707736262222 - type: f1 value: 84.51702233000746 - type: f1_weighted value: 84.52014045969152 - type: main_score value: 84.51757666417352 task: type: Classification - dataset: config: default name: MTEB Diversity1LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 76.33333333333334 - type: ap value: 23.666666666666668 - type: ap_weighted value: 23.666666666666668 - type: f1 value: 43.28922495274102 - type: f1_weighted value: 66.08821676118463 - type: main_score value: 76.33333333333334 task: type: Classification - dataset: config: default name: MTEB Diversity2LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 74.66666666666669 - type: ap value: 25.333333333333336 - type: ap_weighted value: 25.333333333333336 - type: f1 value: 42.74809160305343 - type: f1_weighted value: 63.83715012722646 - type: main_score value: 74.66666666666669 task: type: Classification - dataset: config: default name: MTEB Diversity3LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 58.666666666666664 - type: ap value: 58.666666666666664 - type: ap_weighted value: 58.666666666666664 - type: f1 value: 36.97478991596639 - type: f1_weighted value: 43.383753501400555 - type: main_score value: 
58.666666666666664 task: type: Classification - dataset: config: default name: MTEB Diversity4LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 53.333333333333336 - type: ap value: 53.333333333333336 - type: ap_weighted value: 53.333333333333336 - type: f1 value: 34.782608695652165 - type: f1_weighted value: 37.10144927536233 - type: main_score value: 53.333333333333336 task: type: Classification - dataset: config: default name: MTEB Diversity5LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 57.333333333333336 - type: ap value: 57.333333333333336 - type: ap_weighted value: 57.333333333333336 - type: f1 value: 36.440677966101696 - type: f1_weighted value: 41.78531073446328 - type: main_score value: 57.333333333333336 task: type: Classification - dataset: config: default name: MTEB Diversity6LegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 55.33333333333334 - type: ap value: 55.335312709510575 - type: ap_weighted value: 55.335312709510575 - type: f1 value: 53.72075888745626 - type: f1_weighted value: 54.239086387916736 - type: main_score value: 55.33333333333334 task: type: Classification - dataset: config: default name: MTEB EmotionClassification (default) revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 split: test type: mteb/emotion metrics: - type: accuracy value: 29.500000000000004 - type: f1 value: 25.366180985174143 - type: f1_weighted value: 31.616367697127934 - type: main_score value: 29.500000000000004 task: type: Classification - dataset: config: default name: MTEB EmotionClassification (default) revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 split: validation type: mteb/emotion metrics: - type: accuracy value: 29.59 - type: f1 value: 25.66115067003055 - type: f1_weighted value: 31.610928656113497 - type: main_score value: 29.59 task: type: Classification - dataset: config: default name: MTEB FaithDial (default) revision: 7a414e80725eac766f2602676dc8b39f80b061e4 split: test type: McGill-NLP/FaithDial metrics: - type: main_score value: 13.203999999999999 - type: map_at_1 value: 4.603 - type: map_at_10 value: 9.689 - type: map_at_100 value: 10.934000000000001 - type: map_at_1000 value: 11.06 - type: map_at_20 value: 10.282 - type: map_at_3 value: 7.46 - type: map_at_5 value: 8.601 - type: mrr_at_1 value: 3.9177277179236047 - type: mrr_at_10 value: 9.372463970896874 - type: mrr_at_100 value: 10.603150618822562 - type: mrr_at_1000 value: 10.7286670506961 - type: mrr_at_20 value: 9.954996988904508 - type: mrr_at_3 value: 7.190662748938949 - type: mrr_at_5 value: 8.24844923277832 - type: nauc_map_at_1000_diff1 value: 5.307634687499811 - type: nauc_map_at_1000_max value: 2.3021513473591937 - type: nauc_map_at_1000_std value: -17.73170584094867 - type: nauc_map_at_100_diff1 value: 5.297350465897308 - type: nauc_map_at_100_max value: 2.346907480087932 - type: nauc_map_at_100_std value: -17.732933045818474 - type: nauc_map_at_10_diff1 value: 6.045977877604437 - type: nauc_map_at_10_max value: 1.8368181824684384 - type: nauc_map_at_10_std value: -19.787304492799954 - type: nauc_map_at_1_diff1 value: 1.3052717698444036 - type: nauc_map_at_1_max value: -4.135496842891768 - type: nauc_map_at_1_std value: -19.25157996189646 - type: nauc_map_at_20_diff1 value: 5.761740069816983 - 
type: nauc_map_at_20_max value: 2.2984777745182807 - type: nauc_map_at_20_std value: -18.75124467493425 - type: nauc_map_at_3_diff1 value: 6.651930299284997 - type: nauc_map_at_3_max value: -0.3272549806355308 - type: nauc_map_at_3_std value: -21.098596102590484 - type: nauc_map_at_5_diff1 value: 6.967992538819455 - type: nauc_map_at_5_max value: 0.5435787268710469 - type: nauc_map_at_5_std value: -20.283953347398604 - type: nauc_mrr_at_1000_diff1 value: 6.740910238395446 - type: nauc_mrr_at_1000_max value: 2.260193924794291 - type: nauc_mrr_at_1000_std value: -16.012044193795997 - type: nauc_mrr_at_100_diff1 value: 6.722495330136685 - type: nauc_mrr_at_100_max value: 2.303043406886841 - type: nauc_mrr_at_100_std value: -16.020952265971687 - type: nauc_mrr_at_10_diff1 value: 7.499027953700563 - type: nauc_mrr_at_10_max value: 1.7369780903909435 - type: nauc_mrr_at_10_std value: -17.773058332780796 - type: nauc_mrr_at_1_diff1 value: 7.479923371906451 - type: nauc_mrr_at_1_max value: -6.618146247607683 - type: nauc_mrr_at_1_std value: -17.69446400002114 - type: nauc_mrr_at_20_diff1 value: 7.167945669605475 - type: nauc_mrr_at_20_max value: 2.272029597435147 - type: nauc_mrr_at_20_std value: -17.15567528957464 - type: nauc_mrr_at_3_diff1 value: 8.689535713040886 - type: nauc_mrr_at_3_max value: -0.503459138449647 - type: nauc_mrr_at_3_std value: -18.50457781869527 - type: nauc_mrr_at_5_diff1 value: 8.688882139587488 - type: nauc_mrr_at_5_max value: 0.6822164815544203 - type: nauc_mrr_at_5_std value: -18.323678647634363 - type: nauc_ndcg_at_1000_diff1 value: 3.895349559751926 - type: nauc_ndcg_at_1000_max value: 4.497321779831305 - type: nauc_ndcg_at_1000_std value: -11.297185296929218 - type: nauc_ndcg_at_100_diff1 value: 2.8704577253134365 - type: nauc_ndcg_at_100_max value: 5.389954929442454 - type: nauc_ndcg_at_100_std value: -10.400630555415756 - type: nauc_ndcg_at_10_diff1 value: 6.092068255087623 - type: nauc_ndcg_at_10_max value: 4.227250873974054 - type: nauc_ndcg_at_10_std value: -19.171869390880573 - type: nauc_ndcg_at_1_diff1 value: 1.3052717698444036 - type: nauc_ndcg_at_1_max value: -4.135496842891768 - type: nauc_ndcg_at_1_std value: -19.25157996189646 - type: nauc_ndcg_at_20_diff1 value: 5.40179215063042 - type: nauc_ndcg_at_20_max value: 5.316262069583032 - type: nauc_ndcg_at_20_std value: -16.253163982932534 - type: nauc_ndcg_at_3_diff1 value: 7.419223521385511 - type: nauc_ndcg_at_3_max value: 0.5830467018062534 - type: nauc_ndcg_at_3_std value: -21.398247993882336 - type: nauc_ndcg_at_5_diff1 value: 7.871015584820952 - type: nauc_ndcg_at_5_max value: 1.911179358773651 - type: nauc_ndcg_at_5_std value: -20.05509945356285 - type: nauc_precision_at_1000_diff1 value: -0.844755882557819 - type: nauc_precision_at_1000_max value: 9.219453102597015 - type: nauc_precision_at_1000_std value: 29.23861313970078 - type: nauc_precision_at_100_diff1 value: -3.7470853890619606 - type: nauc_precision_at_100_max value: 10.533862037156355 - type: nauc_precision_at_100_std value: 8.252086567057157 - type: nauc_precision_at_10_diff1 value: 5.901773888339623 - type: nauc_precision_at_10_max value: 8.111412609207008 - type: nauc_precision_at_10_std value: -18.07076007909741 - type: nauc_precision_at_1_diff1 value: 1.3052717698444036 - type: nauc_precision_at_1_max value: -4.135496842891768 - type: nauc_precision_at_1_std value: -19.25157996189646 - type: nauc_precision_at_20_diff1 value: 4.510193698541817 - type: nauc_precision_at_20_max value: 10.055538647436114 - type: nauc_precision_at_20_std 
value: -11.60139299594993 - type: nauc_precision_at_3_diff1 value: 8.853244226690453 - type: nauc_precision_at_3_max value: 2.3906768293455305 - type: nauc_precision_at_3_std value: -21.96838812494048 - type: nauc_precision_at_5_diff1 value: 9.38307261489558 - type: nauc_precision_at_5_max value: 4.352929382840095 - type: nauc_precision_at_5_std value: -19.535985352739786 - type: nauc_recall_at_1000_diff1 value: -0.8447558825574738 - type: nauc_recall_at_1000_max value: 9.219453102597296 - type: nauc_recall_at_1000_std value: 29.23861313970089 - type: nauc_recall_at_100_diff1 value: -3.747085389061965 - type: nauc_recall_at_100_max value: 10.533862037156396 - type: nauc_recall_at_100_std value: 8.252086567057194 - type: nauc_recall_at_10_diff1 value: 5.901773888339621 - type: nauc_recall_at_10_max value: 8.111412609207008 - type: nauc_recall_at_10_std value: -18.07076007909743 - type: nauc_recall_at_1_diff1 value: 1.3052717698444036 - type: nauc_recall_at_1_max value: -4.135496842891768 - type: nauc_recall_at_1_std value: -19.25157996189646 - type: nauc_recall_at_20_diff1 value: 4.510193698541801 - type: nauc_recall_at_20_max value: 10.055538647436121 - type: nauc_recall_at_20_std value: -11.601392995949936 - type: nauc_recall_at_3_diff1 value: 8.853244226690453 - type: nauc_recall_at_3_max value: 2.390676829345526 - type: nauc_recall_at_3_std value: -21.96838812494048 - type: nauc_recall_at_5_diff1 value: 9.383072614895593 - type: nauc_recall_at_5_max value: 4.352929382840121 - type: nauc_recall_at_5_std value: -19.535985352739782 - type: ndcg_at_1 value: 4.603 - type: ndcg_at_10 value: 13.203999999999999 - type: ndcg_at_100 value: 20.254 - type: ndcg_at_1000 value: 23.923 - type: ndcg_at_20 value: 15.354000000000001 - type: ndcg_at_3 value: 8.469 - type: ndcg_at_5 value: 10.536 - type: precision_at_1 value: 4.603 - type: precision_at_10 value: 2.478 - type: precision_at_100 value: 0.6 - type: precision_at_1000 value: 0.09 - type: precision_at_20 value: 1.6629999999999998 - type: precision_at_3 value: 3.803 - type: precision_at_5 value: 3.2910000000000004 - type: recall_at_1 value: 4.603 - type: recall_at_10 value: 24.779999999999998 - type: recall_at_100 value: 60.039 - type: recall_at_1000 value: 89.667 - type: recall_at_20 value: 33.251999999999995 - type: recall_at_3 value: 11.41 - type: recall_at_5 value: 16.454 task: type: Retrieval - dataset: config: default name: MTEB FeedbackQARetrieval (default) revision: 1ee1cd0 split: test type: lt2c/fqa metrics: - type: main_score value: 19.026 - type: map_at_1 value: 19.026 - type: map_at_10 value: 26.287 - type: map_at_100 value: 27.294 - type: map_at_1000 value: 27.381 - type: map_at_20 value: 26.823999999999998 - type: map_at_3 value: 24.18 - type: map_at_5 value: 25.365 - type: mrr_at_1 value: 19.026104417670684 - type: mrr_at_10 value: 26.287052973799952 - type: mrr_at_100 value: 27.29426430169323 - type: mrr_at_1000 value: 27.380630702740504 - type: mrr_at_20 value: 26.824443943374348 - type: mrr_at_3 value: 24.1800535475234 - type: mrr_at_5 value: 25.364792503346674 - type: nauc_map_at_1000_diff1 value: 40.81899763873748 - type: nauc_map_at_1000_max value: 11.253631614437268 - type: nauc_map_at_1000_std value: 1.5897060898020656 - type: nauc_map_at_100_diff1 value: 40.78701343792848 - type: nauc_map_at_100_max value: 11.27294926630661 - type: nauc_map_at_100_std value: 1.6118772584552687 - type: nauc_map_at_10_diff1 value: 41.075611489073324 - type: nauc_map_at_10_max value: 11.521202364241029 - type: nauc_map_at_10_std value: 
1.2931734299571058 - type: nauc_map_at_1_diff1 value: 48.17546169609799 - type: nauc_map_at_1_max value: 13.494189949598375 - type: nauc_map_at_1_std value: 0.07263746580580938 - type: nauc_map_at_20_diff1 value: 40.841882938863435 - type: nauc_map_at_20_max value: 11.418649006248861 - type: nauc_map_at_20_std value: 1.4175148500460242 - type: nauc_map_at_3_diff1 value: 42.213517992662815 - type: nauc_map_at_3_max value: 12.808728940816176 - type: nauc_map_at_3_std value: 1.0861600000182654 - type: nauc_map_at_5_diff1 value: 41.6309141720988 - type: nauc_map_at_5_max value: 11.996308489388992 - type: nauc_map_at_5_std value: 1.2641645150076395 - type: nauc_mrr_at_1000_diff1 value: 40.81899763873748 - type: nauc_mrr_at_1000_max value: 11.253631614437268 - type: nauc_mrr_at_1000_std value: 1.5897060898020656 - type: nauc_mrr_at_100_diff1 value: 40.78701343792848 - type: nauc_mrr_at_100_max value: 11.27294926630661 - type: nauc_mrr_at_100_std value: 1.6118772584552687 - type: nauc_mrr_at_10_diff1 value: 41.075611489073324 - type: nauc_mrr_at_10_max value: 11.521202364241029 - type: nauc_mrr_at_10_std value: 1.2931734299571058 - type: nauc_mrr_at_1_diff1 value: 48.17546169609799 - type: nauc_mrr_at_1_max value: 13.494189949598375 - type: nauc_mrr_at_1_std value: 0.07263746580580938 - type: nauc_mrr_at_20_diff1 value: 40.841882938863435 - type: nauc_mrr_at_20_max value: 11.418649006248861 - type: nauc_mrr_at_20_std value: 1.4175148500460242 - type: nauc_mrr_at_3_diff1 value: 42.213517992662815 - type: nauc_mrr_at_3_max value: 12.808728940816176 - type: nauc_mrr_at_3_std value: 1.0861600000182654 - type: nauc_mrr_at_5_diff1 value: 41.6309141720988 - type: nauc_mrr_at_5_max value: 11.996308489388992 - type: nauc_mrr_at_5_std value: 1.2641645150076395 - type: nauc_ndcg_at_1000_diff1 value: 37.7525819268389 - type: nauc_ndcg_at_1000_max value: 8.537400436184365 - type: nauc_ndcg_at_1000_std value: 2.9622195950411925 - type: nauc_ndcg_at_100_diff1 value: 36.787603237032975 - type: nauc_ndcg_at_100_max value: 8.608543884213873 - type: nauc_ndcg_at_100_std value: 3.8384319334640695 - type: nauc_ndcg_at_10_diff1 value: 38.17646042200737 - type: nauc_ndcg_at_10_max value: 10.09464701041161 - type: nauc_ndcg_at_10_std value: 1.82746325273071 - type: nauc_ndcg_at_1_diff1 value: 48.17546169609799 - type: nauc_ndcg_at_1_max value: 13.494189949598375 - type: nauc_ndcg_at_1_std value: 0.07263746580580938 - type: nauc_ndcg_at_20_diff1 value: 37.27227964097512 - type: nauc_ndcg_at_20_max value: 9.739171990515723 - type: nauc_ndcg_at_20_std value: 2.3086094833252115 - type: nauc_ndcg_at_3_diff1 value: 40.37281782985726 - type: nauc_ndcg_at_3_max value: 12.624015391541455 - type: nauc_ndcg_at_3_std value: 1.407593942089084 - type: nauc_ndcg_at_5_diff1 value: 39.35750963645447 - type: nauc_ndcg_at_5_max value: 11.236243459280038 - type: nauc_ndcg_at_5_std value: 1.722451235770262 - type: nauc_precision_at_1000_diff1 value: 12.726040453874319 - type: nauc_precision_at_1000_max value: -30.085818447743566 - type: nauc_precision_at_1000_std value: 15.649828948529738 - type: nauc_precision_at_100_diff1 value: 20.374750836627285 - type: nauc_precision_at_100_max value: -4.315521193959148 - type: nauc_precision_at_100_std value: 15.928528368224907 - type: nauc_precision_at_10_diff1 value: 30.394845120941987 - type: nauc_precision_at_10_max value: 5.92964609786744 - type: nauc_precision_at_10_std value: 3.297191207595148 - type: nauc_precision_at_1_diff1 value: 48.17546169609799 - type: nauc_precision_at_1_max value: 
13.494189949598375 - type: nauc_precision_at_1_std value: 0.07263746580580938 - type: nauc_precision_at_20_diff1 value: 26.72269495712158 - type: nauc_precision_at_20_max value: 4.521447508378409 - type: nauc_precision_at_20_std value: 5.180527682236829 - type: nauc_precision_at_3_diff1 value: 35.59077406479908 - type: nauc_precision_at_3_max value: 12.151097771811763 - type: nauc_precision_at_3_std value: 2.24486462426719 - type: nauc_precision_at_5_diff1 value: 33.428016378866076 - type: nauc_precision_at_5_max value: 9.15731660897423 - type: nauc_precision_at_5_std value: 2.9353909916486294 - type: nauc_recall_at_1000_diff1 value: 12.726040453874369 - type: nauc_recall_at_1000_max value: -30.085818447743364 - type: nauc_recall_at_1000_std value: 15.649828948529635 - type: nauc_recall_at_100_diff1 value: 20.374750836627264 - type: nauc_recall_at_100_max value: -4.315521193959231 - type: nauc_recall_at_100_std value: 15.928528368224876 - type: nauc_recall_at_10_diff1 value: 30.394845120942005 - type: nauc_recall_at_10_max value: 5.929646097867471 - type: nauc_recall_at_10_std value: 3.297191207595157 - type: nauc_recall_at_1_diff1 value: 48.17546169609799 - type: nauc_recall_at_1_max value: 13.494189949598375 - type: nauc_recall_at_1_std value: 0.07263746580580938 - type: nauc_recall_at_20_diff1 value: 26.722694957121647 - type: nauc_recall_at_20_max value: 4.521447508378419 - type: nauc_recall_at_20_std value: 5.1805276822368524 - type: nauc_recall_at_3_diff1 value: 35.59077406479911 - type: nauc_recall_at_3_max value: 12.151097771811772 - type: nauc_recall_at_3_std value: 2.2448646242671857 - type: nauc_recall_at_5_diff1 value: 33.42801637886615 - type: nauc_recall_at_5_max value: 9.15731660897428 - type: nauc_recall_at_5_std value: 2.9353909916486782 - type: ndcg_at_1 value: 19.026 - type: ndcg_at_10 value: 30.245 - type: ndcg_at_100 value: 35.716 - type: ndcg_at_1000 value: 38.421 - type: ndcg_at_20 value: 32.242 - type: ndcg_at_3 value: 25.884 - type: ndcg_at_5 value: 28.016999999999996 - type: precision_at_1 value: 19.026 - type: precision_at_10 value: 4.287 - type: precision_at_100 value: 0.697 - type: precision_at_1000 value: 0.092 - type: precision_at_20 value: 2.543 - type: precision_at_3 value: 10.274 - type: precision_at_5 value: 7.199 - type: recall_at_1 value: 19.026 - type: recall_at_10 value: 42.870999999999995 - type: recall_at_100 value: 69.729 - type: recall_at_1000 value: 91.968 - type: recall_at_20 value: 50.853 - type: recall_at_3 value: 30.823 - type: recall_at_5 value: 35.994 task: type: Retrieval - dataset: config: default name: MTEB FinancialPhrasebankClassification (default) revision: 1484d06fe7af23030c7c977b12556108d1f67039 split: train type: takala/financial_phrasebank metrics: - type: accuracy value: 67.97703180212015 - type: f1 value: 57.55594804795911 - type: f1_weighted value: 68.01782223640284 - type: main_score value: 67.97703180212015 task: type: Classification - dataset: config: default name: MTEB FrenkEnClassification (default) revision: 52483dba0ff23291271ee9249839865e3c3e7e50 split: test type: classla/FRENK-hate-en metrics: - type: accuracy value: 55.289004780530206 - type: ap value: 41.78925787378802 - type: ap_weighted value: 41.78925787378802 - type: f1 value: 54.04961911556596 - type: f1_weighted value: 54.99825667370393 - type: main_score value: 55.289004780530206 task: type: Classification - dataset: config: default name: MTEB FunctionOfDecisionSectionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: 
test type: nguha/legalbench metrics: - type: accuracy value: 16.621253405994548 - type: f1 value: 15.693085823082844 - type: f1_weighted value: 15.880480382757908 - type: main_score value: 16.621253405994548 task: type: Classification - dataset: config: default name: MTEB GPUSpeedTask (default) revision: '1.0' split: test type: 'GPUSpeedTask' metrics: - type: avg_words_per_sec value: 7186456.843601672 - type: main_score value: 7186456.843601672 - type: num_gpus value: 300 - type: physical_cores value: 3600 - type: time_mean value: 5.055342401776995 - type: time_std value: 1.0630782067852145 - type: total_cores value: 7200 task: type: Speed - dataset: config: default name: MTEB GeoreviewClassification (default) revision: 3765c0d1de6b7d264bc459433c45e5a75513839c split: test type: ai-forever/georeview-classification metrics: - type: accuracy value: 41.3623046875 - type: f1 value: 39.78804299557415 - type: f1_weighted value: 39.787468620260825 - type: main_score value: 41.3623046875 task: type: Classification - dataset: config: default name: MTEB GeoreviewClusteringP2P (default) revision: 97a313c8fc85b47f13f33e7e9a95c1ad888c7fec split: test type: ai-forever/georeview-clustering-p2p metrics: - type: main_score value: 59.713474431847416 - type: v_measure value: 59.713474431847416 - type: v_measure_std value: 1.1676689250848244 task: type: Clustering - dataset: config: default name: MTEB HeadlineClassification (default) revision: 2fe05ee6b5832cda29f2ef7aaad7b7fe6a3609eb split: test type: ai-forever/headline-classification metrics: - type: accuracy value: 68.9013671875 - type: f1 value: 68.80041842725984 - type: f1_weighted value: 68.80034868754102 - type: main_score value: 68.9013671875 task: type: Classification - dataset: config: default name: MTEB ImdbClassification (default) revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 split: test type: mteb/imdb metrics: - type: accuracy value: 58.35799999999999 - type: ap value: 55.16102855038145 - type: ap_weighted value: 55.16102855038145 - type: f1 value: 57.51452465161078 - type: f1_weighted value: 57.514524651610785 - type: main_score value: 58.35799999999999 task: type: Classification - dataset: config: default name: MTEB InappropriatenessClassification (default) revision: 601651fdc45ef243751676e62dd7a19f491c0285 split: test type: ai-forever/inappropriateness-classification metrics: - type: accuracy value: 59.11132812499999 - type: ap value: 55.4713646939923 - type: ap_weighted value: 55.4713646939923 - type: f1 value: 58.8968409989092 - type: f1_weighted value: 58.8968409989092 - type: main_score value: 59.11132812499999 task: type: Classification - dataset: config: default name: MTEB InsurancePolicyInterpretationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 20.30075187969925 - type: f1 value: 11.25 - type: f1_weighted value: 6.851503759398496 - type: main_score value: 20.30075187969925 task: type: Classification - dataset: config: default name: MTEB InternationalCitizenshipQuestionsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 60.107421875 - type: ap value: 46.4447988877498 - type: ap_weighted value: 46.4447988877498 - type: f1 value: 56.153528268151675 - type: f1_weighted value: 58.210838762771935 - type: main_score value: 60.107421875 task: type: Classification - dataset: config: default name: MTEB 
JCrewBlockerLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 79.62962962962962 - type: ap value: 86.55394524959743 - type: ap_weighted value: 86.55394524959743 - type: f1 value: 61.60310277957336 - type: f1_weighted value: 79.14242620124973 - type: main_score value: 79.62962962962962 task: type: Classification - dataset: config: default name: MTEB KinopoiskClassification (default) revision: 5911f26666ac11af46cb9c6849d0dc80a378af24 split: test type: ai-forever/kinopoisk-sentiment-classification metrics: - type: accuracy value: 50.46666666666666 - type: f1 value: 49.1239356856144 - type: f1_weighted value: 49.123935685614384 - type: main_score value: 50.46666666666666 task: type: Classification - dataset: config: default name: MTEB LearnedHandsBenefitsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 66.66666666666667 - type: ap value: 61.11111111111111 - type: ap_weighted value: 61.11111111111111 - type: f1 value: 66.66666666666667 - type: f1_weighted value: 66.66666666666667 - type: main_score value: 66.66666666666667 task: type: Classification - dataset: config: default name: MTEB LearnedHandsBusinessLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.11494252873564 - type: ap value: 68.24378508420207 - type: ap_weighted value: 68.24378508420207 - type: f1 value: 68.07339449541284 - type: f1_weighted value: 68.07339449541284 - type: main_score value: 70.11494252873564 task: type: Classification - dataset: config: default name: MTEB LearnedHandsConsumerLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 58.143322475570045 - type: ap value: 54.72001493806926 - type: ap_weighted value: 54.72001493806926 - type: f1 value: 58.13788145283024 - type: f1_weighted value: 58.13788145283024 - type: main_score value: 58.143322475570045 task: type: Classification - dataset: config: default name: MTEB LearnedHandsCourtsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 60.41666666666667 - type: ap value: 56.07638888888889 - type: ap_weighted value: 56.07638888888889 - type: f1 value: 59.78835978835979 - type: f1_weighted value: 59.78835978835979 - type: main_score value: 60.41666666666667 task: type: Classification - dataset: config: default name: MTEB LearnedHandsCrimeLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.63953488372093 - type: ap value: 65.3728949478749 - type: ap_weighted value: 65.3728949478749 - type: f1 value: 70.45754079263989 - type: f1_weighted value: 70.45754079263989 - type: main_score value: 70.63953488372093 task: type: Classification - dataset: config: default name: MTEB LearnedHandsDivorceLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 62.66666666666667 - type: ap value: 57.45794392523364 - type: ap_weighted value: 57.45794392523364 - type: f1 value: 60.886571056062586 - type: f1_weighted value: 60.886571056062586 - type: main_score value: 62.66666666666667 
task: type: Classification - dataset: config: default name: MTEB LearnedHandsDomesticViolenceLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 68.39080459770115 - type: ap value: 62.26053639846742 - type: ap_weighted value: 62.26053639846742 - type: f1 value: 68.30601092896174 - type: f1_weighted value: 68.30601092896174 - type: main_score value: 68.39080459770115 task: type: Classification - dataset: config: default name: MTEB LearnedHandsEducationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.64285714285714 - type: ap value: 62.222222222222214 - type: ap_weighted value: 62.222222222222214 - type: f1 value: 66.56129258868984 - type: f1_weighted value: 66.56129258868984 - type: main_score value: 69.64285714285714 task: type: Classification - dataset: config: default name: MTEB LearnedHandsEmploymentLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 63.521126760563384 - type: ap value: 58.7392648574373 - type: ap_weighted value: 58.7392648574373 - type: f1 value: 63.4682967433563 - type: f1_weighted value: 63.4682967433563 - type: main_score value: 63.521126760563384 task: type: Classification - dataset: config: default name: MTEB LearnedHandsEstatesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.78651685393258 - type: ap value: 64.05564472980203 - type: ap_weighted value: 64.05564472980203 - type: f1 value: 70.54855542828051 - type: f1_weighted value: 70.54855542828051 - type: main_score value: 70.78651685393258 task: type: Classification - dataset: config: default name: MTEB LearnedHandsFamilyLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 75.48828125 - type: ap value: 68.42998798076924 - type: ap_weighted value: 68.42998798076924 - type: f1 value: 75.3630731744256 - type: f1_weighted value: 75.3630731744256 - type: main_score value: 75.48828125 task: type: Classification - dataset: config: default name: MTEB LearnedHandsHealthLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 64.60176991150443 - type: ap value: 58.96246566981995 - type: ap_weighted value: 58.96246566981995 - type: f1 value: 63.877567329976834 - type: f1_weighted value: 63.877567329976834 - type: main_score value: 64.60176991150443 task: type: Classification - dataset: config: default name: MTEB LearnedHandsHousingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 48.73046875 - type: ap value: 49.376600701618464 - type: ap_weighted value: 49.376600701618464 - type: f1 value: 46.38903847304493 - type: f1_weighted value: 46.38903847304493 - type: main_score value: 48.73046875 task: type: Classification - dataset: config: default name: MTEB LearnedHandsImmigrationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 83.5820895522388 - type: ap value: 77.43325625394155 - type: ap_weighted value: 
77.43325625394155 - type: f1 value: 83.5674470457079 - type: f1_weighted value: 83.5674470457079 - type: main_score value: 83.5820895522388 task: type: Classification - dataset: config: default name: MTEB LearnedHandsTortsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 63.19444444444444 - type: ap value: 58.41384863123993 - type: ap_weighted value: 58.41384863123993 - type: f1 value: 63.17846287451151 - type: f1_weighted value: 63.17846287451151 - type: main_score value: 63.19444444444444 task: type: Classification - dataset: config: default name: MTEB LearnedHandsTrafficLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.7841726618705 - type: ap value: 62.353917770760766 - type: ap_weighted value: 62.353917770760766 - type: f1 value: 66.90476190476191 - type: f1_weighted value: 66.90476190476191 - type: main_score value: 69.7841726618705 task: type: Classification - dataset: config: default name: MTEB LegalReasoningCausalityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 56.36363636363636 - type: ap value: 64.75724991854024 - type: ap_weighted value: 64.75724991854024 - type: f1 value: 52.85714285714286 - type: f1_weighted value: 51.220779220779214 - type: main_score value: 56.36363636363636 task: type: Classification - dataset: config: default name: MTEB MAUDLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 27.607421875 - type: f1 value: 14.84669450435061 - type: f1_weighted value: 28.881436838109853 - type: main_score value: 27.607421875 task: type: Classification - dataset: config: zh-CN name: MTEB MassiveIntentClassification (zh-CN) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 5.208473436449227 - type: f1 value: 3.062867346742466 - type: f1_weighted value: 3.5821384620305414 - type: main_score value: 5.208473436449227 task: type: Classification - dataset: config: ko name: MTEB MassiveIntentClassification (ko) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.5319435104236723 - type: f1 value: 0.5994050487142139 - type: f1_weighted value: 1.0538452549913138 - type: main_score value: 2.5319435104236723 task: type: Classification - dataset: config: hi name: MTEB MassiveIntentClassification (hi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.558843308675185 - type: f1 value: 1.258311921873436 - type: f1_weighted value: 1.4083594758704836 - type: main_score value: 2.558843308675185 task: type: Classification - dataset: config: kn name: MTEB MassiveIntentClassification (kn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.0645595158036314 - type: f1 value: 1.2240987569096886 - type: f1_weighted value: 1.0817495786784068 - type: main_score value: 2.0645595158036314 task: type: Classification - dataset: config: ka name: MTEB MassiveIntentClassification (ka) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent 
metrics: - type: accuracy value: 2.6395427034297243 - type: f1 value: 0.7660068670322584 - type: f1_weighted value: 0.7729737527960681 - type: main_score value: 2.6395427034297243 task: type: Classification - dataset: config: am name: MTEB MassiveIntentClassification (am) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.276395427034297 - type: f1 value: 0.7755708386766476 - type: f1_weighted value: 0.9189927682322296 - type: main_score value: 2.276395427034297 task: type: Classification - dataset: config: my name: MTEB MassiveIntentClassification (my) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.9576328177538667 - type: f1 value: 1.0681259563998668 - type: f1_weighted value: 1.5818553042962555 - type: main_score value: 3.9576328177538667 task: type: Classification - dataset: config: el name: MTEB MassiveIntentClassification (el) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 9.663752521856086 - type: f1 value: 4.860476294706458 - type: f1_weighted value: 6.8590598543643395 - type: main_score value: 9.663752521856086 task: type: Classification - dataset: config: lv name: MTEB MassiveIntentClassification (lv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 22.32347007397445 - type: f1 value: 20.939653553666744 - type: f1_weighted value: 20.899939110877806 - type: main_score value: 22.32347007397445 task: type: Classification - dataset: config: ml name: MTEB MassiveIntentClassification (ml) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.390719569603228 - type: f1 value: 0.46817075523593493 - type: f1_weighted value: 0.8438228708667787 - type: main_score value: 2.390719569603228 task: type: Classification - dataset: config: mn name: MTEB MassiveIntentClassification (mn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 28.994620040349695 - type: f1 value: 27.571069823401256 - type: f1_weighted value: 27.263930155378503 - type: main_score value: 28.994620040349695 task: type: Classification - dataset: config: ur name: MTEB MassiveIntentClassification (ur) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.4478816408876933 - type: f1 value: 1.497656725806116 - type: f1_weighted value: 1.5398763678691354 - type: main_score value: 2.4478816408876933 task: type: Classification - dataset: config: fa name: MTEB MassiveIntentClassification (fa) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.3355749831876267 - type: f1 value: 0.6816922655284716 - type: f1_weighted value: 1.0887948480367862 - type: main_score value: 3.3355749831876267 task: type: Classification - dataset: config: ro name: MTEB MassiveIntentClassification (ro) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.72494956287828 - type: f1 value: 29.577749786404826 - type: f1_weighted value: 29.551193355600514 - type: main_score value: 31.72494956287828 task: type: Classification - dataset: config: is name: MTEB 
MassiveIntentClassification (is) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 24.845326160053798 - type: f1 value: 22.11363990784136 - type: f1_weighted value: 23.65026728412048 - type: main_score value: 24.845326160053798 task: type: Classification - dataset: config: en name: MTEB MassiveIntentClassification (en) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 50.164761264290526 - type: f1 value: 47.85763581891828 - type: f1_weighted value: 48.98444884040328 - type: main_score value: 50.164761264290526 task: type: Classification - dataset: config: hu name: MTEB MassiveIntentClassification (hu) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 25.524546065904502 - type: f1 value: 23.753046097467873 - type: f1_weighted value: 23.826312126027823 - type: main_score value: 25.524546065904502 task: type: Classification - dataset: config: fr name: MTEB MassiveIntentClassification (fr) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.50638870208473 - type: f1 value: 31.370642915213388 - type: f1_weighted value: 30.505546915456012 - type: main_score value: 31.50638870208473 task: type: Classification - dataset: config: th name: MTEB MassiveIntentClassification (th) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.739071956960323 - type: f1 value: 1.411228354273586 - type: f1_weighted value: 1.216275118762689 - type: main_score value: 3.739071956960323 task: type: Classification - dataset: config: de name: MTEB MassiveIntentClassification (de) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.1049092131809 - type: f1 value: 29.794603179718106 - type: f1_weighted value: 30.137050786689766 - type: main_score value: 32.1049092131809 task: type: Classification - dataset: config: tr name: MTEB MassiveIntentClassification (tr) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 27.562205783456626 - type: f1 value: 25.683266426146687 - type: f1_weighted value: 25.803636686733057 - type: main_score value: 27.562205783456626 task: type: Classification - dataset: config: pt name: MTEB MassiveIntentClassification (pt) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 34.347679892400805 - type: f1 value: 31.465774161046767 - type: f1_weighted value: 31.735356981669327 - type: main_score value: 34.347679892400805 task: type: Classification - dataset: config: sq name: MTEB MassiveIntentClassification (sq) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.38063214525891 - type: f1 value: 29.53168994128031 - type: f1_weighted value: 30.112896935570273 - type: main_score value: 32.38063214525891 task: type: Classification - dataset: config: zh-TW name: MTEB MassiveIntentClassification (zh-TW) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 6.809011432414256 - type: f1 value: 5.205218706422693 - type: f1_weighted value: 
5.178287349465675 - type: main_score value: 6.809011432414256 task: type: Classification - dataset: config: hy name: MTEB MassiveIntentClassification (hy) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.723604572965703 - type: f1 value: 0.6429150866665544 - type: f1_weighted value: 0.9113227866994432 - type: main_score value: 2.723604572965703 task: type: Classification - dataset: config: da name: MTEB MassiveIntentClassification (da) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 33.95427034297243 - type: f1 value: 32.204428726904936 - type: f1_weighted value: 32.47064251083498 - type: main_score value: 33.95427034297243 task: type: Classification - dataset: config: af name: MTEB MassiveIntentClassification (af) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.403496973772697 - type: f1 value: 27.814640020382342 - type: f1_weighted value: 29.552471475522786 - type: main_score value: 30.403496973772697 task: type: Classification - dataset: config: ar name: MTEB MassiveIntentClassification (ar) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.796234028244788 - type: f1 value: 2.4115955159178712 - type: f1_weighted value: 2.9705530799117428 - type: main_score value: 3.796234028244788 task: type: Classification - dataset: config: jv name: MTEB MassiveIntentClassification (jv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 28.533960995292528 - type: f1 value: 26.21221777741412 - type: f1_weighted value: 27.072811075990217 - type: main_score value: 28.533960995292528 task: type: Classification - dataset: config: te name: MTEB MassiveIntentClassification (te) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.2125084061869535 - type: f1 value: 1.0173733514352028 - type: f1_weighted value: 1.316987953476142 - type: main_score value: 2.2125084061869535 task: type: Classification - dataset: config: tl name: MTEB MassiveIntentClassification (tl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.017484868863484 - type: f1 value: 29.32295890060929 - type: f1_weighted value: 29.657369574195414 - type: main_score value: 32.017484868863484 task: type: Classification - dataset: config: sw name: MTEB MassiveIntentClassification (sw) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 27.790854068594484 - type: f1 value: 26.66461334490106 - type: f1_weighted value: 26.3309301465354 - type: main_score value: 27.790854068594484 task: type: Classification - dataset: config: ja name: MTEB MassiveIntentClassification (ja) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 5.611970410221924 - type: f1 value: 3.949675565526302 - type: f1_weighted value: 3.8008532811790516 - type: main_score value: 5.611970410221924 task: type: Classification - dataset: config: ms name: MTEB MassiveIntentClassification (ms) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: 
mteb/amazon_massive_intent metrics: - type: accuracy value: 28.940820443846675 - type: f1 value: 26.913943613442726 - type: f1_weighted value: 27.58112937211184 - type: main_score value: 28.940820443846675 task: type: Classification - dataset: config: nb name: MTEB MassiveIntentClassification (nb) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.29993275050437 - type: f1 value: 30.38953729738546 - type: f1_weighted value: 30.973971090234315 - type: main_score value: 32.29993275050437 task: type: Classification - dataset: config: fi name: MTEB MassiveIntentClassification (fi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.13315400134499 - type: f1 value: 28.151659309577315 - type: f1_weighted value: 28.919992380957805 - type: main_score value: 31.13315400134499 task: type: Classification - dataset: config: id name: MTEB MassiveIntentClassification (id) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 33.56422326832549 - type: f1 value: 32.13999124730796 - type: f1_weighted value: 31.821742347727334 - type: main_score value: 33.56422326832549 task: type: Classification - dataset: config: cy name: MTEB MassiveIntentClassification (cy) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.68123739071957 - type: f1 value: 28.08132049625695 - type: f1_weighted value: 30.136632177167293 - type: main_score value: 31.68123739071957 task: type: Classification - dataset: config: sl name: MTEB MassiveIntentClassification (sl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.388702084734366 - type: f1 value: 30.06510634561652 - type: f1_weighted value: 29.575793355168027 - type: main_score value: 31.388702084734366 task: type: Classification - dataset: config: es name: MTEB MassiveIntentClassification (es) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.032279757901815 - type: f1 value: 30.20555955874916 - type: f1_weighted value: 28.87618616461917 - type: main_score value: 31.032279757901815 task: type: Classification - dataset: config: bn name: MTEB MassiveIntentClassification (bn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.0766644250168125 - type: f1 value: 1.1659097449170488 - type: f1_weighted value: 1.6261385516847686 - type: main_score value: 3.0766644250168125 task: type: Classification - dataset: config: sv name: MTEB MassiveIntentClassification (sv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.22864828513786 - type: f1 value: 29.514038012557155 - type: f1_weighted value: 28.79006788550934 - type: main_score value: 30.22864828513786 task: type: Classification - dataset: config: ru name: MTEB MassiveIntentClassification (ru) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 57.97915265635507 - type: f1 value: 56.5014953445001 - type: f1_weighted value: 56.64147015986123 - type: main_score value: 57.97915265635507 task: type: Classification - dataset: config: az 
name: MTEB MassiveIntentClassification (az) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 23.577673167451245 - type: f1 value: 23.44310534002699 - type: f1_weighted value: 22.73388843513862 - type: main_score value: 23.577673167451245 task: type: Classification - dataset: config: it name: MTEB MassiveIntentClassification (it) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 35.24209818426362 - type: f1 value: 34.17643389765681 - type: f1_weighted value: 31.88705168526876 - type: main_score value: 35.24209818426362 task: type: Classification - dataset: config: pl name: MTEB MassiveIntentClassification (pl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 26.815736381977135 - type: f1 value: 23.59490629738082 - type: f1_weighted value: 24.824019034766742 - type: main_score value: 26.815736381977135 task: type: Classification - dataset: config: vi name: MTEB MassiveIntentClassification (vi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 23.71889710827169 - type: f1 value: 20.9474996841838 - type: f1_weighted value: 21.8696712485011 - type: main_score value: 23.71889710827169 task: type: Classification - dataset: config: ta name: MTEB MassiveIntentClassification (ta) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 1.4996637525218561 - type: f1 value: 0.3621176226135693 - type: f1_weighted value: 0.40253328041710507 - type: main_score value: 1.4996637525218561 task: type: Classification - dataset: config: he name: MTEB MassiveIntentClassification (he) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.2461331540013454 - type: f1 value: 0.590566331230622 - type: f1_weighted value: 0.6162176049666722 - type: main_score value: 2.2461331540013454 task: type: Classification - dataset: config: nl name: MTEB MassiveIntentClassification (nl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.43779421654338 - type: f1 value: 29.65516413448003 - type: f1_weighted value: 30.056107103546008 - type: main_score value: 32.43779421654338 task: type: Classification - dataset: config: km name: MTEB MassiveIntentClassification (km) revision: 4672e20407010da34463acc759c162ca9734bca6 split: test type: mteb/amazon_massive_intent metrics: - type: accuracy value: 5.137861466039005 - type: f1 value: 1.5034651435201778 - type: f1_weighted value: 1.8580225168667703 - type: main_score value: 5.137861466039005 task: type: Classification - dataset: config: zh-CN name: MTEB MassiveIntentClassification (zh-CN) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 5.15002459419577 - type: f1 value: 3.2849878732080238 - type: f1_weighted value: 3.171516129361724 - type: main_score value: 5.15002459419577 task: type: Classification - dataset: config: ko name: MTEB MassiveIntentClassification (ko) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.3610427939006393 - type: f1 value: 0.6344240632132025 - type: 
f1_weighted value: 0.8741011326135733 - type: main_score value: 2.3610427939006393 task: type: Classification - dataset: config: hi name: MTEB MassiveIntentClassification (hi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.4299065420560746 - type: f1 value: 1.1990062972384772 - type: f1_weighted value: 1.2846405130538945 - type: main_score value: 2.4299065420560746 task: type: Classification - dataset: config: kn name: MTEB MassiveIntentClassification (kn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.100344318740777 - type: f1 value: 1.0691096895187684 - type: f1_weighted value: 1.0245515267986838 - type: main_score value: 2.100344318740777 task: type: Classification - dataset: config: ka name: MTEB MassiveIntentClassification (ka) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.144613871126414 - type: f1 value: 0.38751721719666626 - type: f1_weighted value: 0.5494302003085859 - type: main_score value: 2.144613871126414 task: type: Classification - dataset: config: am name: MTEB MassiveIntentClassification (am) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.1347761928184945 - type: f1 value: 0.7186972868374003 - type: f1_weighted value: 0.8692320111678621 - type: main_score value: 2.1347761928184945 task: type: Classification - dataset: config: my name: MTEB MassiveIntentClassification (my) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.9744220363994094 - type: f1 value: 1.320159702083562 - type: f1_weighted value: 1.6615339662178419 - type: main_score value: 3.9744220363994094 task: type: Classification - dataset: config: el name: MTEB MassiveIntentClassification (el) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 8.740777176586326 - type: f1 value: 4.625508580628892 - type: f1_weighted value: 5.910937912610004 - type: main_score value: 8.740777176586326 task: type: Classification - dataset: config: lv name: MTEB MassiveIntentClassification (lv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 22.056074766355138 - type: f1 value: 20.067449871163735 - type: f1_weighted value: 20.679581641637213 - type: main_score value: 22.056074766355138 task: type: Classification - dataset: config: ml name: MTEB MassiveIntentClassification (ml) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.287260206591244 - type: f1 value: 0.5144479181790914 - type: f1_weighted value: 0.7532382956194585 - type: main_score value: 2.287260206591244 task: type: Classification - dataset: config: mn name: MTEB MassiveIntentClassification (mn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 28.514510575504183 - type: f1 value: 27.670683007330656 - type: f1_weighted value: 26.797727875405965 - type: main_score value: 28.514510575504183 task: type: Classification - dataset: config: ur name: MTEB MassiveIntentClassification (ur) revision: 
4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.5528775209050663 - type: f1 value: 1.5528439347982526 - type: f1_weighted value: 1.59863069765228 - type: main_score value: 2.5528775209050663 task: type: Classification - dataset: config: fa name: MTEB MassiveIntentClassification (fa) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.1578947368421053 - type: f1 value: 0.612147286970534 - type: f1_weighted value: 0.9311100758788083 - type: main_score value: 3.1578947368421053 task: type: Classification - dataset: config: ro name: MTEB MassiveIntentClassification (ro) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.472208558780135 - type: f1 value: 28.570236227937524 - type: f1_weighted value: 29.26182782217857 - type: main_score value: 30.472208558780135 task: type: Classification - dataset: config: is name: MTEB MassiveIntentClassification (is) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 24.12690605017216 - type: f1 value: 21.730073248467978 - type: f1_weighted value: 23.3232094260056 - type: main_score value: 24.12690605017216 task: type: Classification - dataset: config: en name: MTEB MassiveIntentClassification (en) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 50.6837186424004 - type: f1 value: 46.24633043195857 - type: f1_weighted value: 49.89222156091109 - type: main_score value: 50.6837186424004 task: type: Classification - dataset: config: hu name: MTEB MassiveIntentClassification (hu) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 24.869650762420065 - type: f1 value: 22.646829281311646 - type: f1_weighted value: 23.75607068147335 - type: main_score value: 24.869650762420065 task: type: Classification - dataset: config: fr name: MTEB MassiveIntentClassification (fr) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.83620265617314 - type: f1 value: 30.12388095110573 - type: f1_weighted value: 29.755084946082466 - type: main_score value: 30.83620265617314 task: type: Classification - dataset: config: th name: MTEB MassiveIntentClassification (th) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.7924249877029017 - type: f1 value: 1.3490081402255192 - type: f1_weighted value: 1.1964792923823864 - type: main_score value: 3.7924249877029017 task: type: Classification - dataset: config: de name: MTEB MassiveIntentClassification (de) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.85095917363502 - type: f1 value: 28.76898470499743 - type: f1_weighted value: 29.742721084026552 - type: main_score value: 30.85095917363502 task: type: Classification - dataset: config: tr name: MTEB MassiveIntentClassification (tr) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 26.22233152975898 - type: f1 value: 24.13532374526957 - type: f1_weighted 
value: 24.801681753477833 - type: main_score value: 26.22233152975898 task: type: Classification - dataset: config: pt name: MTEB MassiveIntentClassification (pt) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 33.85145105755042 - type: f1 value: 30.993852084910046 - type: f1_weighted value: 31.47706557692265 - type: main_score value: 33.85145105755042 task: type: Classification - dataset: config: sq name: MTEB MassiveIntentClassification (sq) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.69699950811608 - type: f1 value: 28.43551777754717 - type: f1_weighted value: 29.35991647173387 - type: main_score value: 31.69699950811608 task: type: Classification - dataset: config: zh-TW name: MTEB MassiveIntentClassification (zh-TW) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 6.296114117068371 - type: f1 value: 4.469538815411268 - type: f1_weighted value: 4.470912934534107 - type: main_score value: 6.296114117068371 task: type: Classification - dataset: config: hy name: MTEB MassiveIntentClassification (hy) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.6660108214461387 - type: f1 value: 0.7095128645283928 - type: f1_weighted value: 0.900359447084975 - type: main_score value: 2.6660108214461387 task: type: Classification - dataset: config: da name: MTEB MassiveIntentClassification (da) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.24790949335957 - type: f1 value: 30.09602016401104 - type: f1_weighted value: 31.27365296679004 - type: main_score value: 32.24790949335957 task: type: Classification - dataset: config: af name: MTEB MassiveIntentClassification (af) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 29.85243482538121 - type: f1 value: 27.02898547703625 - type: f1_weighted value: 29.19825733648402 - type: main_score value: 29.85243482538121 task: type: Classification - dataset: config: ar name: MTEB MassiveIntentClassification (ar) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 3.413674372848008 - type: f1 value: 2.3814730307183596 - type: f1_weighted value: 2.758592436005351 - type: main_score value: 3.413674372848008 task: type: Classification - dataset: config: jv name: MTEB MassiveIntentClassification (jv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 27.59960649286769 - type: f1 value: 25.169829835887036 - type: f1_weighted value: 26.378021821617065 - type: main_score value: 27.59960649286769 task: type: Classification - dataset: config: te name: MTEB MassiveIntentClassification (te) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.0363994097393014 - type: f1 value: 0.7934004289138196 - type: f1_weighted value: 1.1834679007875544 - type: main_score value: 2.0363994097393014 task: type: Classification - dataset: config: tl name: MTEB MassiveIntentClassification (tl) revision: 
4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.43630103295622 - type: f1 value: 28.28710817943075 - type: f1_weighted value: 29.47693147061905 - type: main_score value: 31.43630103295622 task: type: Classification - dataset: config: sw name: MTEB MassiveIntentClassification (sw) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 27.515986227250366 - type: f1 value: 25.65654395144761 - type: f1_weighted value: 26.414094210360055 - type: main_score value: 27.515986227250366 task: type: Classification - dataset: config: ja name: MTEB MassiveIntentClassification (ja) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 5.986227250368913 - type: f1 value: 3.9449730568824433 - type: f1_weighted value: 3.8102259721047833 - type: main_score value: 5.986227250368913 task: type: Classification - dataset: config: ms name: MTEB MassiveIntentClassification (ms) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 28.155435317265127 - type: f1 value: 25.708172487585202 - type: f1_weighted value: 27.024916707588677 - type: main_score value: 28.155435317265127 task: type: Classification - dataset: config: nb name: MTEB MassiveIntentClassification (nb) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 31.485489424495817 - type: f1 value: 29.47639008406045 - type: f1_weighted value: 30.377692398014027 - type: main_score value: 31.485489424495817 task: type: Classification - dataset: config: fi name: MTEB MassiveIntentClassification (fi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.403344810624695 - type: f1 value: 26.82843832763937 - type: f1_weighted value: 28.11110907470959 - type: main_score value: 30.403344810624695 task: type: Classification - dataset: config: id name: MTEB MassiveIntentClassification (id) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 32.70044269552386 - type: f1 value: 30.910774335551594 - type: f1_weighted value: 31.371749140831422 - type: main_score value: 32.70044269552386 task: type: Classification - dataset: config: cy name: MTEB MassiveIntentClassification (cy) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 29.429414658140686 - type: f1 value: 25.594886516936256 - type: f1_weighted value: 28.392261199556877 - type: main_score value: 29.429414658140686 task: type: Classification - dataset: config: sl name: MTEB MassiveIntentClassification (sl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 29.636005902606982 - type: f1 value: 28.287023938527234 - type: f1_weighted value: 27.924913519954554 - type: main_score value: 29.636005902606982 task: type: Classification - dataset: config: es name: MTEB MassiveIntentClassification (es) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.63453025086079 - type: f1 value: 29.5921601385162 - type: 
f1_weighted value: 28.58410607526952 - type: main_score value: 30.63453025086079 task: type: Classification - dataset: config: bn name: MTEB MassiveIntentClassification (bn) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.867683226758485 - type: f1 value: 1.0374630680286294 - type: f1_weighted value: 1.3261691151267023 - type: main_score value: 2.867683226758485 task: type: Classification - dataset: config: sv name: MTEB MassiveIntentClassification (sv) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 29.754058042302017 - type: f1 value: 27.921243093926957 - type: f1_weighted value: 28.600526975101815 - type: main_score value: 29.754058042302017 task: type: Classification - dataset: config: ru name: MTEB MassiveIntentClassification (ru) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 58.06197737333989 - type: f1 value: 53.92404816772661 - type: f1_weighted value: 56.72057857737771 - type: main_score value: 58.06197737333989 task: type: Classification - dataset: config: az name: MTEB MassiveIntentClassification (az) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 22.725036891293655 - type: f1 value: 22.05764593465915 - type: f1_weighted value: 22.36326529771844 - type: main_score value: 22.725036891293655 task: type: Classification - dataset: config: it name: MTEB MassiveIntentClassification (it) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 34.57943925233645 - type: f1 value: 33.54269802516337 - type: f1_weighted value: 31.59380780190696 - type: main_score value: 34.57943925233645 task: type: Classification - dataset: config: pl name: MTEB MassiveIntentClassification (pl) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 26.050172159370387 - type: f1 value: 23.37018289487783 - type: f1_weighted value: 24.52891801190779 - type: main_score value: 26.050172159370387 task: type: Classification - dataset: config: vi name: MTEB MassiveIntentClassification (vi) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 23.10378750614855 - type: f1 value: 19.634766811442688 - type: f1_weighted value: 21.39922163237278 - type: main_score value: 23.10378750614855 task: type: Classification - dataset: config: ta name: MTEB MassiveIntentClassification (ta) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 1.382193802262666 - type: f1 value: 0.2962201919122291 - type: f1_weighted value: 0.36568543738308745 - type: main_score value: 1.382193802262666 task: type: Classification - dataset: config: he name: MTEB MassiveIntentClassification (he) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 2.0560747663551404 - type: f1 value: 0.4742414282381403 - type: f1_weighted value: 0.5861893507001308 - type: main_score value: 2.0560747663551404 task: type: Classification - dataset: config: nl name: MTEB MassiveIntentClassification (nl) revision: 
4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 30.5115592720118 - type: f1 value: 27.61045064110582 - type: f1_weighted value: 28.987990654116114 - type: main_score value: 30.5115592720118 task: type: Classification - dataset: config: km name: MTEB MassiveIntentClassification (km) revision: 4672e20407010da34463acc759c162ca9734bca6 split: validation type: mteb/amazon_massive_intent metrics: - type: accuracy value: 4.377766847024103 - type: f1 value: 1.2676703377671132 - type: f1_weighted value: 1.426174554035529 - type: main_score value: 4.377766847024103 task: type: Classification - dataset: config: zh-CN name: MTEB MassiveScenarioClassification (zh-CN) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 10.601882985877605 - type: f1 value: 6.8689500634035365 - type: f1_weighted value: 8.260029142337519 - type: main_score value: 10.601882985877605 task: type: Classification - dataset: config: ko name: MTEB MassiveScenarioClassification (ko) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 5.62542030934768 - type: f1 value: 1.9399090161521315 - type: f1_weighted value: 1.7790298099358886 - type: main_score value: 5.62542030934768 task: type: Classification - dataset: config: hi name: MTEB MassiveScenarioClassification (hi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.407531943510423 - type: f1 value: 3.622072056826428 - type: f1_weighted value: 3.444172662951229 - type: main_score value: 7.407531943510423 task: type: Classification - dataset: config: kn name: MTEB MassiveScenarioClassification (kn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.602555480833894 - type: f1 value: 3.9001734711485803 - type: f1_weighted value: 3.4912256692008397 - type: main_score value: 7.602555480833894 task: type: Classification - dataset: config: ka name: MTEB MassiveScenarioClassification (ka) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.010759919300605 - type: f1 value: 2.1485666974093878 - type: f1_weighted value: 2.3739456428263477 - type: main_score value: 7.010759919300605 task: type: Classification - dataset: config: am name: MTEB MassiveScenarioClassification (am) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.679892400806995 - type: f1 value: 2.728187383195907 - type: f1_weighted value: 3.0454310752856353 - type: main_score value: 7.679892400806995 task: type: Classification - dataset: config: my name: MTEB MassiveScenarioClassification (my) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 10.729657027572292 - type: f1 value: 4.138439669406968 - type: f1_weighted value: 4.843092536146883 - type: main_score value: 10.729657027572292 task: type: Classification - dataset: config: el name: MTEB MassiveScenarioClassification (el) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 17.952252858103563 - type: f1 value: 12.418135741505608 - type: f1_weighted value: 
15.228054842385186 - type: main_score value: 17.952252858103563 task: type: Classification - dataset: config: lv name: MTEB MassiveScenarioClassification (lv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 29.29388029589779 - type: f1 value: 25.95638727776611 - type: f1_weighted value: 27.82646328315652 - type: main_score value: 29.29388029589779 task: type: Classification - dataset: config: ml name: MTEB MassiveScenarioClassification (ml) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.923335574983189 - type: f1 value: 2.2338102382542795 - type: f1_weighted value: 2.837475945704109 - type: main_score value: 6.923335574983189 task: type: Classification - dataset: config: mn name: MTEB MassiveScenarioClassification (mn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 33.70208473436449 - type: f1 value: 31.451013524608147 - type: f1_weighted value: 33.4571016718763 - type: main_score value: 33.70208473436449 task: type: Classification - dataset: config: ur name: MTEB MassiveScenarioClassification (ur) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.530598520511097 - type: f1 value: 3.993356806346034 - type: f1_weighted value: 4.275297414153249 - type: main_score value: 8.530598520511097 task: type: Classification - dataset: config: fa name: MTEB MassiveScenarioClassification (fa) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.6240753194351045 - type: f1 value: 2.559179690443991 - type: f1_weighted value: 2.8775036329690353 - type: main_score value: 6.6240753194351045 task: type: Classification - dataset: config: ro name: MTEB MassiveScenarioClassification (ro) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.01681237390719 - type: f1 value: 36.15548220887307 - type: f1_weighted value: 38.91143847106075 - type: main_score value: 40.01681237390719 task: type: Classification - dataset: config: is name: MTEB MassiveScenarioClassification (is) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 33.10356422326833 - type: f1 value: 29.87073203020746 - type: f1_weighted value: 32.736926298821786 - type: main_score value: 33.10356422326833 task: type: Classification - dataset: config: en name: MTEB MassiveScenarioClassification (en) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 61.291190316072644 - type: f1 value: 58.09487277036398 - type: f1_weighted value: 60.52223749579593 - type: main_score value: 61.291190316072644 task: type: Classification - dataset: config: hu name: MTEB MassiveScenarioClassification (hu) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 36.40551445864156 - type: f1 value: 32.12815170334265 - type: f1_weighted value: 35.421611675898745 - type: main_score value: 36.40551445864156 task: type: Classification - dataset: config: fr name: MTEB MassiveScenarioClassification (fr) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test 
type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.90181573638198 - type: f1 value: 39.00450485042174 - type: f1_weighted value: 41.74577968212385 - type: main_score value: 42.90181573638198 task: type: Classification - dataset: config: th name: MTEB MassiveScenarioClassification (th) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.261600537995966 - type: f1 value: 3.8946817615361597 - type: f1_weighted value: 3.7437491646031926 - type: main_score value: 8.261600537995966 task: type: Classification - dataset: config: de name: MTEB MassiveScenarioClassification (de) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.07128446536651 - type: f1 value: 38.28996078984755 - type: f1_weighted value: 41.04738811504033 - type: main_score value: 42.07128446536651 task: type: Classification - dataset: config: tr name: MTEB MassiveScenarioClassification (tr) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 34.845326160053794 - type: f1 value: 32.52170618407094 - type: f1_weighted value: 33.35658510579412 - type: main_score value: 34.845326160053794 task: type: Classification - dataset: config: pt name: MTEB MassiveScenarioClassification (pt) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.78681909885676 - type: f1 value: 37.33575502776686 - type: f1_weighted value: 38.66002021299529 - type: main_score value: 40.78681909885676 task: type: Classification - dataset: config: sq name: MTEB MassiveScenarioClassification (sq) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.65635507733692 - type: f1 value: 38.53947437411434 - type: f1_weighted value: 41.52520693995739 - type: main_score value: 42.65635507733692 task: type: Classification - dataset: config: zh-TW name: MTEB MassiveScenarioClassification (zh-TW) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 11.926698049764628 - type: f1 value: 8.724194514820493 - type: f1_weighted value: 10.266244979280504 - type: main_score value: 11.926698049764628 task: type: Classification - dataset: config: hy name: MTEB MassiveScenarioClassification (hy) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.779421654337593 - type: f1 value: 3.47659510611439 - type: f1_weighted value: 4.092370736159162 - type: main_score value: 8.779421654337593 task: type: Classification - dataset: config: da name: MTEB MassiveScenarioClassification (da) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 43.6852723604573 - type: f1 value: 39.338012150585094 - type: f1_weighted value: 43.3756140521009 - type: main_score value: 43.6852723604573 task: type: Classification - dataset: config: af name: MTEB MassiveScenarioClassification (af) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.83725622057835 - type: f1 value: 36.67993326074695 - type: f1_weighted value: 40.73536387442413 - type: main_score value: 40.83725622057835 task: type: 
Classification - dataset: config: ar name: MTEB MassiveScenarioClassification (ar) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 11.859448554135843 - type: f1 value: 6.502577103628851 - type: f1_weighted value: 9.922384035467028 - type: main_score value: 11.859448554135843 task: type: Classification - dataset: config: jv name: MTEB MassiveScenarioClassification (jv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 37.22932078009414 - type: f1 value: 34.37198836784653 - type: f1_weighted value: 36.41682430619207 - type: main_score value: 37.22932078009414 task: type: Classification - dataset: config: te name: MTEB MassiveScenarioClassification (te) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.909885675857431 - type: f1 value: 2.659712889039866 - type: f1_weighted value: 3.315252295282912 - type: main_score value: 6.909885675857431 task: type: Classification - dataset: config: tl name: MTEB MassiveScenarioClassification (tl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.157363819771355 - type: f1 value: 33.871383306341926 - type: f1_weighted value: 37.16844466757229 - type: main_score value: 38.157363819771355 task: type: Classification - dataset: config: sw name: MTEB MassiveScenarioClassification (sw) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.65904505716207 - type: f1 value: 32.95848641686319 - type: f1_weighted value: 33.46347965861419 - type: main_score value: 35.65904505716207 task: type: Classification - dataset: config: ja name: MTEB MassiveScenarioClassification (ja) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 10.601882985877605 - type: f1 value: 8.05499004226519 - type: f1_weighted value: 8.12291817923475 - type: main_score value: 10.601882985877605 task: type: Classification - dataset: config: ms name: MTEB MassiveScenarioClassification (ms) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.97108271687962 - type: f1 value: 34.19920488698337 - type: f1_weighted value: 37.406365439450006 - type: main_score value: 38.97108271687962 task: type: Classification - dataset: config: nb name: MTEB MassiveScenarioClassification (nb) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 39.04505716207128 - type: f1 value: 35.380977049887605 - type: f1_weighted value: 38.79082603370826 - type: main_score value: 39.04505716207128 task: type: Classification - dataset: config: fi name: MTEB MassiveScenarioClassification (fi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.18829858776059 - type: f1 value: 30.972699263943966 - type: f1_weighted value: 34.66929745941575 - type: main_score value: 35.18829858776059 task: type: Classification - dataset: config: id name: MTEB MassiveScenarioClassification (id) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 
39.53934095494284 - type: f1 value: 37.19939485401421 - type: f1_weighted value: 38.163540271879384 - type: main_score value: 39.53934095494284 task: type: Classification - dataset: config: cy name: MTEB MassiveScenarioClassification (cy) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 39.85205110961668 - type: f1 value: 34.567211938088086 - type: f1_weighted value: 38.93137139872493 - type: main_score value: 39.85205110961668 task: type: Classification - dataset: config: sl name: MTEB MassiveScenarioClassification (sl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.978480161398785 - type: f1 value: 33.70493150778863 - type: f1_weighted value: 34.89613180942136 - type: main_score value: 35.978480161398785 task: type: Classification - dataset: config: es name: MTEB MassiveScenarioClassification (es) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 37.12508406186954 - type: f1 value: 34.14887874344704 - type: f1_weighted value: 35.491336292250615 - type: main_score value: 37.12508406186954 task: type: Classification - dataset: config: bn name: MTEB MassiveScenarioClassification (bn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.846671149966376 - type: f1 value: 3.772079613264656 - type: f1_weighted value: 4.569880079881123 - type: main_score value: 8.846671149966376 task: type: Classification - dataset: config: sv name: MTEB MassiveScenarioClassification (sv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 36.11970410221924 - type: f1 value: 33.64741825888341 - type: f1_weighted value: 36.04738800166304 - type: main_score value: 36.11970410221924 task: type: Classification - dataset: config: ru name: MTEB MassiveScenarioClassification (ru) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 62.89509078681911 - type: f1 value: 62.296937620668366 - type: f1_weighted value: 61.50844245234364 - type: main_score value: 62.89509078681911 task: type: Classification - dataset: config: az name: MTEB MassiveScenarioClassification (az) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 30.31607262945528 - type: f1 value: 27.373913596444382 - type: f1_weighted value: 29.154743431705356 - type: main_score value: 30.31607262945528 task: type: Classification - dataset: config: it name: MTEB MassiveScenarioClassification (it) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.68997982515131 - type: f1 value: 39.34921574451304 - type: f1_weighted value: 41.39971354124732 - type: main_score value: 42.68997982515131 task: type: Classification - dataset: config: pl name: MTEB MassiveScenarioClassification (pl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 31.62071284465367 - type: f1 value: 27.53427875798914 - type: f1_weighted value: 30.442690748521006 - type: main_score value: 31.62071284465367 task: type: Classification - dataset: config: vi name: MTEB 
MassiveScenarioClassification (vi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 31.889710827168795 - type: f1 value: 29.1527074423781 - type: f1_weighted value: 29.84128781391531 - type: main_score value: 31.889710827168795 task: type: Classification - dataset: config: ta name: MTEB MassiveScenarioClassification (ta) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.007397444519166 - type: f1 value: 1.763256752893296 - type: f1_weighted value: 2.3996756522652913 - type: main_score value: 7.007397444519166 task: type: Classification - dataset: config: he name: MTEB MassiveScenarioClassification (he) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.612642905178212 - type: f1 value: 2.0115132382174585 - type: f1_weighted value: 2.8178938596974503 - type: main_score value: 7.612642905178212 task: type: Classification - dataset: config: nl name: MTEB MassiveScenarioClassification (nl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.93813046402152 - type: f1 value: 35.475977992563635 - type: f1_weighted value: 40.249098836834044 - type: main_score value: 40.93813046402152 task: type: Classification - dataset: config: km name: MTEB MassiveScenarioClassification (km) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: test type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.510423671822462 - type: f1 value: 2.77822187113745 - type: f1_weighted value: 3.488782507211019 - type: main_score value: 8.510423671822462 task: type: Classification - dataset: config: zh-CN name: MTEB MassiveScenarioClassification (zh-CN) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 10.560747663551401 - type: f1 value: 7.321692095226571 - type: f1_weighted value: 8.136926309421098 - type: main_score value: 10.560747663551401 task: type: Classification - dataset: config: ko name: MTEB MassiveScenarioClassification (ko) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 5.622233152975899 - type: f1 value: 1.7454943918873769 - type: f1_weighted value: 1.5544580080510706 - type: main_score value: 5.622233152975899 task: type: Classification - dataset: config: hi name: MTEB MassiveScenarioClassification (hi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.50614854894245 - type: f1 value: 3.671558894965337 - type: f1_weighted value: 3.6075123924941224 - type: main_score value: 7.50614854894245 task: type: Classification - dataset: config: kn name: MTEB MassiveScenarioClassification (kn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.047220855878013 - type: f1 value: 4.199596683728984 - type: f1_weighted value: 3.705979981207572 - type: main_score value: 8.047220855878013 task: type: Classification - dataset: config: ka name: MTEB MassiveScenarioClassification (ka) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.591244466305953 - 
type: f1 value: 1.9804826267181144 - type: f1_weighted value: 2.1652032753558714 - type: main_score value: 6.591244466305953 task: type: Classification - dataset: config: am name: MTEB MassiveScenarioClassification (am) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.511067388096411 - type: f1 value: 2.641163180255864 - type: f1_weighted value: 3.03599461945174 - type: main_score value: 7.511067388096411 task: type: Classification - dataset: config: my name: MTEB MassiveScenarioClassification (my) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 11.234628627643877 - type: f1 value: 4.53829675095688 - type: f1_weighted value: 5.119828126415879 - type: main_score value: 11.234628627643877 task: type: Classification - dataset: config: el name: MTEB MassiveScenarioClassification (el) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 16.438760452533202 - type: f1 value: 12.026293516540374 - type: f1_weighted value: 13.40697491103347 - type: main_score value: 16.438760452533202 task: type: Classification - dataset: config: lv name: MTEB MassiveScenarioClassification (lv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 28.470241023118547 - type: f1 value: 26.06308403577423 - type: f1_weighted value: 26.913188635640108 - type: main_score value: 28.470241023118547 task: type: Classification - dataset: config: ml name: MTEB MassiveScenarioClassification (ml) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.34874569601574 - type: f1 value: 2.163368202700301 - type: f1_weighted value: 2.9794749471502735 - type: main_score value: 7.34874569601574 task: type: Classification - dataset: config: mn name: MTEB MassiveScenarioClassification (mn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 33.482538121003444 - type: f1 value: 31.74224548475336 - type: f1_weighted value: 32.974792871093996 - type: main_score value: 33.482538121003444 task: type: Classification - dataset: config: ur name: MTEB MassiveScenarioClassification (ur) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.735858337432365 - type: f1 value: 4.387957216974412 - type: f1_weighted value: 4.487011850573568 - type: main_score value: 8.735858337432365 task: type: Classification - dataset: config: fa name: MTEB MassiveScenarioClassification (fa) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.8027545499262185 - type: f1 value: 2.724940339247371 - type: f1_weighted value: 2.9191909608862248 - type: main_score value: 6.8027545499262185 task: type: Classification - dataset: config: ro name: MTEB MassiveScenarioClassification (ro) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 39.77865223807182 - type: f1 value: 36.713842977439086 - type: f1_weighted value: 38.411147363742614 - type: main_score value: 39.77865223807182 task: type: Classification - dataset: 
config: is name: MTEB MassiveScenarioClassification (is) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 32.611903590752576 - type: f1 value: 30.478777350564933 - type: f1_weighted value: 32.33376716992967 - type: main_score value: 32.611903590752576 task: type: Classification - dataset: config: en name: MTEB MassiveScenarioClassification (en) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 60.81652729955731 - type: f1 value: 57.85686645797947 - type: f1_weighted value: 59.96336225413508 - type: main_score value: 60.81652729955731 task: type: Classification - dataset: config: hu name: MTEB MassiveScenarioClassification (hu) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.041810132808656 - type: f1 value: 32.32895536298411 - type: f1_weighted value: 34.08983039599136 - type: main_score value: 35.041810132808656 task: type: Classification - dataset: config: fr name: MTEB MassiveScenarioClassification (fr) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.4151500245942 - type: f1 value: 39.716877977971514 - type: f1_weighted value: 40.98904556640093 - type: main_score value: 42.4151500245942 task: type: Classification - dataset: config: th name: MTEB MassiveScenarioClassification (th) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.253812100344318 - type: f1 value: 4.2941598559113645 - type: f1_weighted value: 3.7137986151126743 - type: main_score value: 8.253812100344318 task: type: Classification - dataset: config: de name: MTEB MassiveScenarioClassification (de) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.65912444663059 - type: f1 value: 37.90162745459205 - type: f1_weighted value: 39.942707376839756 - type: main_score value: 40.65912444663059 task: type: Classification - dataset: config: tr name: MTEB MassiveScenarioClassification (tr) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 33.85145105755042 - type: f1 value: 32.41363211826809 - type: f1_weighted value: 32.696811929693745 - type: main_score value: 33.85145105755042 task: type: Classification - dataset: config: pt name: MTEB MassiveScenarioClassification (pt) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 40.22626660108214 - type: f1 value: 37.84448697275546 - type: f1_weighted value: 37.82059370217246 - type: main_score value: 40.22626660108214 task: type: Classification - dataset: config: sq name: MTEB MassiveScenarioClassification (sq) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 42.06591244466306 - type: f1 value: 38.76214747335659 - type: f1_weighted value: 40.65484003509404 - type: main_score value: 42.06591244466306 task: type: Classification - dataset: config: zh-TW name: MTEB MassiveScenarioClassification (zh-TW) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario 
metrics: - type: accuracy value: 11.682242990654206 - type: f1 value: 8.850699907144218 - type: f1_weighted value: 9.655517346069553 - type: main_score value: 11.682242990654206 task: type: Classification - dataset: config: hy name: MTEB MassiveScenarioClassification (hy) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.52926709296606 - type: f1 value: 3.4189589714301167 - type: f1_weighted value: 3.894511154092698 - type: main_score value: 8.52926709296606 task: type: Classification - dataset: config: da name: MTEB MassiveScenarioClassification (da) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 41.14117068371864 - type: f1 value: 38.08063702754415 - type: f1_weighted value: 40.65305294882936 - type: main_score value: 41.14117068371864 task: type: Classification - dataset: config: af name: MTEB MassiveScenarioClassification (af) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 39.3654697491392 - type: f1 value: 36.43369907401146 - type: f1_weighted value: 39.09920883835431 - type: main_score value: 39.3654697491392 task: type: Classification - dataset: config: ar name: MTEB MassiveScenarioClassification (ar) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 11.362518445646828 - type: f1 value: 6.2728348209099565 - type: f1_weighted value: 8.903159425462325 - type: main_score value: 11.362518445646828 task: type: Classification - dataset: config: jv name: MTEB MassiveScenarioClassification (jv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 36.246925725528776 - type: f1 value: 34.242775177193415 - type: f1_weighted value: 34.90531238831363 - type: main_score value: 36.246925725528776 task: type: Classification - dataset: config: te name: MTEB MassiveScenarioClassification (te) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 6.861780619773734 - type: f1 value: 2.7017710457799873 - type: f1_weighted value: 3.1681349264113137 - type: main_score value: 6.861780619773734 task: type: Classification - dataset: config: tl name: MTEB MassiveScenarioClassification (tl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.17019183472701 - type: f1 value: 34.777811838185485 - type: f1_weighted value: 36.90042555420213 - type: main_score value: 38.17019183472701 task: type: Classification - dataset: config: sw name: MTEB MassiveScenarioClassification (sw) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.32710280373832 - type: f1 value: 33.32826385073952 - type: f1_weighted value: 33.388725291289916 - type: main_score value: 35.32710280373832 task: type: Classification - dataset: config: ja name: MTEB MassiveScenarioClassification (ja) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 11.20511559272012 - type: f1 value: 8.976181412932425 - type: f1_weighted value: 8.576498601594645 - type: main_score value: 
11.20511559272012 task: type: Classification - dataset: config: ms name: MTEB MassiveScenarioClassification (ms) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.85391047712739 - type: f1 value: 34.90571468739814 - type: f1_weighted value: 36.82763280572209 - type: main_score value: 38.85391047712739 task: type: Classification - dataset: config: nb name: MTEB MassiveScenarioClassification (nb) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.052139695031975 - type: f1 value: 35.272001887507564 - type: f1_weighted value: 37.42041278303434 - type: main_score value: 38.052139695031975 task: type: Classification - dataset: config: fi name: MTEB MassiveScenarioClassification (fi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 34.500737825873095 - type: f1 value: 30.68780970737908 - type: f1_weighted value: 33.716051134823 - type: main_score value: 34.500737825873095 task: type: Classification - dataset: config: id name: MTEB MassiveScenarioClassification (id) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 39.596655189375305 - type: f1 value: 37.72092200675893 - type: f1_weighted value: 37.89234511492137 - type: main_score value: 39.596655189375305 task: type: Classification - dataset: config: cy name: MTEB MassiveScenarioClassification (cy) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.93261190359076 - type: f1 value: 34.67593293977394 - type: f1_weighted value: 37.58144266593478 - type: main_score value: 38.93261190359076 task: type: Classification - dataset: config: sl name: MTEB MassiveScenarioClassification (sl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.336940482046245 - type: f1 value: 34.06391073492543 - type: f1_weighted value: 34.19964460077873 - type: main_score value: 35.336940482046245 task: type: Classification - dataset: config: es name: MTEB MassiveScenarioClassification (es) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 36.28135759960649 - type: f1 value: 33.98213113943637 - type: f1_weighted value: 34.432683108706726 - type: main_score value: 36.28135759960649 task: type: Classification - dataset: config: bn name: MTEB MassiveScenarioClassification (bn) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.789965568125922 - type: f1 value: 3.615951273986677 - type: f1_weighted value: 4.543124755655086 - type: main_score value: 8.789965568125922 task: type: Classification - dataset: config: sv name: MTEB MassiveScenarioClassification (sv) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 35.78947368421053 - type: f1 value: 33.641144471139874 - type: f1_weighted value: 35.35509200878473 - type: main_score value: 35.78947368421053 task: type: Classification - dataset: config: ru name: MTEB MassiveScenarioClassification (ru) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: 
validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 64.14658140678799 - type: f1 value: 63.45318114952019 - type: f1_weighted value: 62.837233214870004 - type: main_score value: 64.14658140678799 task: type: Classification - dataset: config: az name: MTEB MassiveScenarioClassification (az) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 29.616330545991143 - type: f1 value: 27.89304924236733 - type: f1_weighted value: 28.557344732597763 - type: main_score value: 29.616330545991143 task: type: Classification - dataset: config: it name: MTEB MassiveScenarioClassification (it) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 41.1952779144122 - type: f1 value: 38.70295863724121 - type: f1_weighted value: 39.8087264213271 - type: main_score value: 41.1952779144122 task: type: Classification - dataset: config: pl name: MTEB MassiveScenarioClassification (pl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 30.15248401377275 - type: f1 value: 27.24749237955316 - type: f1_weighted value: 29.24459561389263 - type: main_score value: 30.15248401377275 task: type: Classification - dataset: config: vi name: MTEB MassiveScenarioClassification (vi) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 31.942941465814062 - type: f1 value: 29.238187005403976 - type: f1_weighted value: 29.360530025850295 - type: main_score value: 31.942941465814062 task: type: Classification - dataset: config: ta name: MTEB MassiveScenarioClassification (ta) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.211018199704869 - type: f1 value: 1.858123064629565 - type: f1_weighted value: 2.531232017204237 - type: main_score value: 7.211018199704869 task: type: Classification - dataset: config: he name: MTEB MassiveScenarioClassification (he) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 7.948844072798819 - type: f1 value: 2.1010859887190896 - type: f1_weighted value: 3.0480176454133283 - type: main_score value: 7.948844072798819 task: type: Classification - dataset: config: nl name: MTEB MassiveScenarioClassification (nl) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 38.92277422528283 - type: f1 value: 35.488036321576146 - type: f1_weighted value: 38.18536556200914 - type: main_score value: 38.92277422528283 task: type: Classification - dataset: config: km name: MTEB MassiveScenarioClassification (km) revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 split: validation type: mteb/amazon_massive_scenario metrics: - type: accuracy value: 8.150516478111165 - type: f1 value: 2.72691932389948 - type: f1_weighted value: 3.3948665965609117 - type: main_score value: 8.150516478111165 task: type: Classification - dataset: config: default name: MTEB MedrxivClusteringP2P (default) revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 split: test type: mteb/medrxiv-clustering-p2p metrics: - type: main_score value: 20.786832589263845 - type: v_measure value: 20.786832589263845 - type: v_measure_std value: 
1.6048001943974946 task: type: Clustering - dataset: config: default name: MTEB MedrxivClusteringS2S (default) revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 split: test type: mteb/medrxiv-clustering-s2s metrics: - type: main_score value: 18.181247067178756 - type: v_measure value: 18.181247067178756 - type: v_measure_std value: 1.5798786706707373 task: type: Clustering - dataset: config: default name: MTEB NYSJudicialEthicsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 45.20547945205479 - type: ap value: 50.160551683623055 - type: ap_weighted value: 50.160551683623055 - type: f1 value: 44.53941120607787 - type: f1_weighted value: 44.28963561383653 - type: main_score value: 45.20547945205479 task: type: Classification - dataset: config: default name: MTEB NewsClassification (default) revision: eb185aade064a813bc0b7f42de02595523103ca4 split: test type: fancyzhx/ag_news metrics: - type: accuracy value: 73.78552631578948 - type: f1 value: 73.47724204580956 - type: f1_weighted value: 73.47724204580956 - type: main_score value: 73.78552631578948 task: type: Classification - dataset: config: default name: MTEB OPP115DataRetentionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.31818181818183 - type: ap value: 64.09705159705157 - type: ap_weighted value: 64.09705159705157 - type: f1 value: 69.12280701754385 - type: f1_weighted value: 69.12280701754386 - type: main_score value: 69.31818181818183 task: type: Classification - dataset: config: default name: MTEB OPP115DataSecurityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 63.868065967016484 - type: ap value: 62.05622742346708 - type: ap_weighted value: 62.05622742346708 - type: f1 value: 60.25914242202488 - type: f1_weighted value: 60.22323273501004 - type: main_score value: 63.868065967016484 task: type: Classification - dataset: config: default name: MTEB OPP115DoNotTrackLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 88.18181818181819 - type: ap value: 85.12727272727273 - type: ap_weighted value: 85.12727272727273 - type: f1 value: 88.15734989648034 - type: f1_weighted value: 88.15734989648034 - type: main_score value: 88.18181818181819 task: type: Classification - dataset: config: default name: MTEB OPP115FirstPartyCollectionUseLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.55896452540749 - type: ap value: 64.53342029559877 - type: ap_weighted value: 64.53342029559877 - type: f1 value: 69.32286869541191 - type: f1_weighted value: 69.31770813082186 - type: main_score value: 69.55896452540749 task: type: Classification - dataset: config: default name: MTEB OPP115InternationalAndSpecificAudiencesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 77.75510204081633 - type: ap value: 75.20843296586462 - type: ap_weighted value: 75.20843296586462 - type: f1 value: 77.09799280479909 - type: f1_weighted value: 77.11382676229348 - type: main_score value: 77.75510204081633 task: type: Classification - dataset: config: 
default name: MTEB OPP115PolicyChangeLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 89.0951276102088 - type: ap value: 87.15879085780726 - type: ap_weighted value: 87.15879085780726 - type: f1 value: 89.04203698995461 - type: f1_weighted value: 89.04380667729642 - type: main_score value: 89.0951276102088 task: type: Classification - dataset: config: default name: MTEB OPP115ThirdPartySharingCollectionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 64.27672955974842 - type: ap value: 62.893075413619535 - type: ap_weighted value: 62.893075413619535 - type: f1 value: 60.459952085405675 - type: f1_weighted value: 60.4135944642598 - type: main_score value: 64.27672955974842 task: type: Classification - dataset: config: default name: MTEB OPP115UserAccessEditAndDeletionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 67.09956709956711 - type: ap value: 62.92853137890984 - type: ap_weighted value: 62.92853137890984 - type: f1 value: 66.41414141414141 - type: f1_weighted value: 66.39337093882548 - type: main_score value: 67.09956709956711 task: type: Classification - dataset: config: default name: MTEB OPP115UserChoiceControlLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 70.69857697283311 - type: ap value: 63.961545634799855 - type: ap_weighted value: 63.961545634799855 - type: f1 value: 70.33565944829778 - type: f1_weighted value: 70.34414874711732 - type: main_score value: 70.69857697283311 task: type: Classification - dataset: config: default name: MTEB OralArgumentQuestionPurposeLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 20.51282051282051 - type: f1 value: 17.434477437885 - type: f1_weighted value: 21.50138868825342 - type: main_score value: 20.51282051282051 task: type: Classification - dataset: config: default name: MTEB OverrulingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.580078125 - type: ap value: 64.66695246425695 - type: ap_weighted value: 64.66695246425695 - type: f1 value: 69.55969170904413 - type: f1_weighted value: 69.5473829295991 - type: main_score value: 69.580078125 task: type: Classification - dataset: config: default name: MTEB PROALegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 49.47368421052632 - type: ap value: 49.47368421052632 - type: ap_weighted value: 49.47368421052632 - type: f1 value: 33.09859154929578 - type: f1_weighted value: 32.750185322461085 - type: main_score value: 49.47368421052632 task: type: Classification - dataset: config: default name: MTEB PatentClassification (default) revision: 2f38a1dfdecfacee0184d74eaeafd3c0fb49d2a6 split: test type: ccdv/patent-classification metrics: - type: accuracy value: 29.306640625000004 - type: f1 value: 22.127646065227754 - type: f1_weighted value: 26.66185625260182 - type: main_score value: 29.306640625000004 task: type: Classification - dataset: config: default name: MTEB 
PersonalJurisdictionLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 51.99999999999999 - type: ap value: 44.107526881720425 - type: ap_weighted value: 44.107526881720425 - type: f1 value: 51.92307692307692 - type: f1_weighted value: 51.61538461538463 - type: main_score value: 51.99999999999999 task: type: Classification - dataset: config: default name: MTEB PoemSentimentClassification (default) revision: 329d529d875a00c47ec71954a1a96ae167584770 split: test type: google-research-datasets/poem_sentiment metrics: - type: accuracy value: 35.96153846153845 - type: f1 value: 25.717059445124445 - type: f1_weighted value: 42.39026561619051 - type: main_score value: 35.96153846153845 task: type: Classification - dataset: config: default name: MTEB PoemSentimentClassification (default) revision: 329d529d875a00c47ec71954a1a96ae167584770 split: validation type: google-research-datasets/poem_sentiment metrics: - type: accuracy value: 35.80952380952381 - type: f1 value: 26.76432080315997 - type: f1_weighted value: 41.90402765909788 - type: main_score value: 35.80952380952381 task: type: Classification - dataset: config: default name: MTEB RUParaPhraserSTS (default) revision: 43265056790b8f7c59e0139acb4be0a8dad2c8f4 split: test type: merionum/ru_paraphraser metrics: - type: cosine_pearson value: 65.17293362215221 - type: cosine_spearman value: 72.14872507255558 - type: euclidean_pearson value: 69.39028550512482 - type: euclidean_spearman value: 72.14872507255558 - type: main_score value: 72.14872507255558 - type: manhattan_pearson value: 69.30934614737492 - type: manhattan_spearman value: 72.04933049290007 task: type: STS - dataset: config: default name: MTEB RedditClustering (default) revision: 24640382cdbf8abc73003fb0fa6d111a705499eb split: test type: mteb/reddit-clustering metrics: - type: main_score value: 26.275710753496597 - type: v_measure value: 26.275710753496597 - type: v_measure_std value: 4.029689555202136 task: type: Clustering - dataset: config: default name: MTEB RedditClusteringP2P (default) revision: 385e3cb46b4cfa89021f56c4380204149d0efe33 split: test type: mteb/reddit-clustering-p2p metrics: - type: main_score value: 40.4828876757081 - type: v_measure value: 40.4828876757081 - type: v_measure_std value: 10.162859998011204 task: type: Clustering - dataset: config: default name: MTEB RiaNewsRetrieval (default) revision: 82374b0bbacda6114f39ff9c5b925fa1512ca5d7 split: test type: ai-forever/ria-news-retrieval metrics: - type: main_score value: 51.271 - type: map_at_1 value: 36.21 - type: map_at_10 value: 46.208 - type: map_at_100 value: 47.004000000000005 - type: map_at_1000 value: 47.044000000000004 - type: map_at_20 value: 46.693 - type: map_at_3 value: 43.669999999999995 - type: map_at_5 value: 45.196 - type: mrr_at_1 value: 36.22 - type: mrr_at_10 value: 46.21178571428571 - type: mrr_at_100 value: 47.007420014661236 - type: mrr_at_1000 value: 47.04734848842366 - type: mrr_at_20 value: 46.69688042104938 - type: mrr_at_3 value: 43.668333333333585 - type: mrr_at_5 value: 45.199833333333274 - type: nauc_map_at_1000_diff1 value: 46.94937854830209 - type: nauc_map_at_1000_max value: 20.810031674720868 - type: nauc_map_at_1000_std value: -2.8474964036416845 - type: nauc_map_at_100_diff1 value: 46.93710679472339 - type: nauc_map_at_100_max value: 20.808355966268614 - type: nauc_map_at_100_std value: -2.8341393346842607 - type: nauc_map_at_10_diff1 value: 46.85305633304179 - type: 
nauc_map_at_10_max value: 20.74714400194472 - type: nauc_map_at_10_std value: -3.0251519873045534 - type: nauc_map_at_1_diff1 value: 52.76907950247656 - type: nauc_map_at_1_max value: 20.909404191190152 - type: nauc_map_at_1_std value: -4.486212769404569 - type: nauc_map_at_20_diff1 value: 46.854283528399826 - type: nauc_map_at_20_max value: 20.774565284237017 - type: nauc_map_at_20_std value: -2.8952917224271846 - type: nauc_map_at_3_diff1 value: 47.6120187355803 - type: nauc_map_at_3_max value: 20.94624350299643 - type: nauc_map_at_3_std value: -3.5249841066101704 - type: nauc_map_at_5_diff1 value: 46.961741404854 - type: nauc_map_at_5_max value: 20.84061893727113 - type: nauc_map_at_5_std value: -3.2560895841762707 - type: nauc_mrr_at_1000_diff1 value: 46.94210158390746 - type: nauc_mrr_at_1000_max value: 20.823017819566672 - type: nauc_mrr_at_1000_std value: -2.873564388596409 - type: nauc_mrr_at_100_diff1 value: 46.92983853646228 - type: nauc_mrr_at_100_max value: 20.821328345843625 - type: nauc_mrr_at_100_std value: -2.860179131955564 - type: nauc_mrr_at_10_diff1 value: 46.845920501930316 - type: nauc_mrr_at_10_max value: 20.760199941251056 - type: nauc_mrr_at_10_std value: -3.0506119945281385 - type: nauc_mrr_at_1_diff1 value: 52.7384650230153 - type: nauc_mrr_at_1_max value: 20.916918175962735 - type: nauc_mrr_at_1_std value: -4.553119995428164 - type: nauc_mrr_at_20_diff1 value: 46.84707480256205 - type: nauc_mrr_at_20_max value: 20.78745076885492 - type: nauc_mrr_at_20_std value: -2.921144125415831 - type: nauc_mrr_at_3_diff1 value: 47.621438923503305 - type: nauc_mrr_at_3_max value: 20.964983104645327 - type: nauc_mrr_at_3_std value: -3.5359639119054154 - type: nauc_mrr_at_5_diff1 value: 46.95496065526142 - type: nauc_mrr_at_5_max value: 20.85370692098222 - type: nauc_mrr_at_5_std value: -3.2815901993324985 - type: nauc_ndcg_at_1000_diff1 value: 45.22512963946746 - type: nauc_ndcg_at_1000_max value: 20.827437126737433 - type: nauc_ndcg_at_1000_std value: -1.5970972641072643 - type: nauc_ndcg_at_100_diff1 value: 44.870296183306195 - type: nauc_ndcg_at_100_max value: 20.734194655306457 - type: nauc_ndcg_at_100_std value: -1.1285720744844427 - type: nauc_ndcg_at_10_diff1 value: 44.428914407493004 - type: nauc_ndcg_at_10_max value: 20.440243514420057 - type: nauc_ndcg_at_10_std value: -2.1210028369378167 - type: nauc_ndcg_at_1_diff1 value: 52.76907950247656 - type: nauc_ndcg_at_1_max value: 20.909404191190152 - type: nauc_ndcg_at_1_std value: -4.486212769404569 - type: nauc_ndcg_at_20_diff1 value: 44.333669717530185 - type: nauc_ndcg_at_20_max value: 20.503130801298607 - type: nauc_ndcg_at_20_std value: -1.6040287688898405 - type: nauc_ndcg_at_3_diff1 value: 45.988171772625634 - type: nauc_ndcg_at_3_max value: 20.901834276482294 - type: nauc_ndcg_at_3_std value: -3.228341348463241 - type: nauc_ndcg_at_5_diff1 value: 44.77257666022731 - type: nauc_ndcg_at_5_max value: 20.70409124701764 - type: nauc_ndcg_at_5_std value: -2.7157792836026826 - type: nauc_precision_at_1000_diff1 value: 24.715455802573878 - type: nauc_precision_at_1000_max value: 25.642760620422127 - type: nauc_precision_at_1000_std value: 20.124139669932596 - type: nauc_precision_at_100_diff1 value: 31.317204301075428 - type: nauc_precision_at_100_max value: 20.717841497411385 - type: nauc_precision_at_100_std value: 15.071826819138575 - type: nauc_precision_at_10_diff1 value: 35.455731038677605 - type: nauc_precision_at_10_max value: 19.1279684555736 - type: nauc_precision_at_10_std value: 1.47750077627525 - type: 
nauc_precision_at_1_diff1 value: 52.76907950247656 - type: nauc_precision_at_1_max value: 20.909404191190152 - type: nauc_precision_at_1_std value: -4.486212769404569 - type: nauc_precision_at_20_diff1 value: 33.12837939512509 - type: nauc_precision_at_20_max value: 19.114872213547194 - type: nauc_precision_at_20_std value: 4.913450374911581 - type: nauc_precision_at_3_diff1 value: 41.17113816710835 - type: nauc_precision_at_3_max value: 20.751510760974718 - type: nauc_precision_at_3_std value: -2.3503705806184496 - type: nauc_precision_at_5_diff1 value: 37.71917213552412 - type: nauc_precision_at_5_max value: 20.221342669216565 - type: nauc_precision_at_5_std value: -0.9301420941546075 - type: nauc_recall_at_1000_diff1 value: 24.715455802574407 - type: nauc_recall_at_1000_max value: 25.64276062042252 - type: nauc_recall_at_1000_std value: 20.124139669932728 - type: nauc_recall_at_100_diff1 value: 31.31720430107529 - type: nauc_recall_at_100_max value: 20.717841497411516 - type: nauc_recall_at_100_std value: 15.071826819138751 - type: nauc_recall_at_10_diff1 value: 35.455731038677655 - type: nauc_recall_at_10_max value: 19.127968455573654 - type: nauc_recall_at_10_std value: 1.47750077627532 - type: nauc_recall_at_1_diff1 value: 52.76907950247656 - type: nauc_recall_at_1_max value: 20.909404191190152 - type: nauc_recall_at_1_std value: -4.486212769404569 - type: nauc_recall_at_20_diff1 value: 33.12837939512524 - type: nauc_recall_at_20_max value: 19.1148722135474 - type: nauc_recall_at_20_std value: 4.91345037491176 - type: nauc_recall_at_3_diff1 value: 41.171138167108374 - type: nauc_recall_at_3_max value: 20.751510760974682 - type: nauc_recall_at_3_std value: -2.35037058061848 - type: nauc_recall_at_5_diff1 value: 37.71917213552414 - type: nauc_recall_at_5_max value: 20.221342669216575 - type: nauc_recall_at_5_std value: -0.9301420941545763 - type: ndcg_at_1 value: 36.21 - type: ndcg_at_10 value: 51.271 - type: ndcg_at_100 value: 55.289 - type: ndcg_at_1000 value: 56.401 - type: ndcg_at_20 value: 53.028 - type: ndcg_at_3 value: 46.078 - type: ndcg_at_5 value: 48.825 - type: precision_at_1 value: 36.21 - type: precision_at_10 value: 6.7250000000000005 - type: precision_at_100 value: 0.864 - type: precision_at_1000 value: 0.095 - type: precision_at_20 value: 3.7089999999999996 - type: precision_at_3 value: 17.68 - type: precision_at_5 value: 11.940000000000001 - type: recall_at_1 value: 36.21 - type: recall_at_10 value: 67.25 - type: recall_at_100 value: 86.4 - type: recall_at_1000 value: 95.26 - type: recall_at_20 value: 74.18 - type: recall_at_3 value: 53.04 - type: recall_at_5 value: 59.699999999999996 task: type: Retrieval - dataset: config: default name: MTEB RuBQReranking (default) revision: 2e96b8f098fa4b0950fc58eacadeb31c0d0c7fa2 split: test type: ai-forever/rubq-reranking metrics: - type: main_score value: 62.15027154459556 - type: map value: 62.15027154459556 - type: mrr value: 68.09500782905037 - type: nAUC_map_diff1 value: 33.062970148901556 - type: nAUC_map_max value: 11.090302786599219 - type: nAUC_map_std value: 5.660375803457896 - type: nAUC_mrr_diff1 value: 35.578332777596685 - type: nAUC_mrr_max value: 14.981311816105839 - type: nAUC_mrr_std value: 5.550039824115788 task: type: Reranking - dataset: config: default name: MTEB RuBQRetrieval (default) revision: e19b6ffa60b3bc248e0b41f4cc37c26a55c2a67b split: test type: ai-forever/rubq-retrieval metrics: - type: main_score value: 51.734 - type: map_at_1 value: 28.510999999999996 - type: map_at_10 value: 43.631 - type: 
map_at_100 value: 44.988 - type: map_at_1000 value: 45.052 - type: map_at_20 value: 44.462 - type: map_at_3 value: 38.937 - type: map_at_5 value: 41.833 - type: mrr_at_1 value: 41.312056737588655 - type: mrr_at_10 value: 53.36138316634781 - type: mrr_at_100 value: 53.949276632310216 - type: mrr_at_1000 value: 53.97463197704906 - type: mrr_at_20 value: 53.72140863635181 - type: mrr_at_3 value: 50.43341213553989 - type: mrr_at_5 value: 52.32466509062269 - type: nauc_map_at_1000_diff1 value: 28.763838953386795 - type: nauc_map_at_1000_max value: 24.058720207454833 - type: nauc_map_at_1000_std value: 0.43914028345667794 - type: nauc_map_at_100_diff1 value: 28.74115734128027 - type: nauc_map_at_100_max value: 24.067201633751907 - type: nauc_map_at_100_std value: 0.48479657643151175 - type: nauc_map_at_10_diff1 value: 28.78055585777882 - type: nauc_map_at_10_max value: 23.660824446842014 - type: nauc_map_at_10_std value: -0.13417257945838412 - type: nauc_map_at_1_diff1 value: 31.726698171475988 - type: nauc_map_at_1_max value: 18.706684051084675 - type: nauc_map_at_1_std value: -3.1112088462944576 - type: nauc_map_at_20_diff1 value: 28.821888050893524 - type: nauc_map_at_20_max value: 24.054108877450066 - type: nauc_map_at_20_std value: 0.29933097295171895 - type: nauc_map_at_3_diff1 value: 29.414059668041187 - type: nauc_map_at_3_max value: 21.603288627966425 - type: nauc_map_at_3_std value: -1.2582454726026868 - type: nauc_map_at_5_diff1 value: 28.763709067820066 - type: nauc_map_at_5_max value: 22.83472652858084 - type: nauc_map_at_5_std value: -0.9139576784503077 - type: nauc_mrr_at_1000_diff1 value: 32.788260400997885 - type: nauc_mrr_at_1000_max value: 26.645815716166126 - type: nauc_mrr_at_1000_std value: -1.751195655856463 - type: nauc_mrr_at_100_diff1 value: 32.77886459571929 - type: nauc_mrr_at_100_max value: 26.65637126850806 - type: nauc_mrr_at_100_std value: -1.7267980184678584 - type: nauc_mrr_at_10_diff1 value: 32.78874216502045 - type: nauc_mrr_at_10_max value: 26.4839655119896 - type: nauc_mrr_at_10_std value: -1.9790149014956449 - type: nauc_mrr_at_1_diff1 value: 35.13232635364635 - type: nauc_mrr_at_1_max value: 23.697653866746013 - type: nauc_mrr_at_1_std value: -3.229619940147812 - type: nauc_mrr_at_20_diff1 value: 32.77802354989702 - type: nauc_mrr_at_20_max value: 26.68040225454969 - type: nauc_mrr_at_20_std value: -1.75616956975016 - type: nauc_mrr_at_3_diff1 value: 32.984816761600435 - type: nauc_mrr_at_3_max value: 26.13901825373233 - type: nauc_mrr_at_3_std value: -2.52193076369521 - type: nauc_mrr_at_5_diff1 value: 32.84967841683121 - type: nauc_mrr_at_5_max value: 26.529547373322448 - type: nauc_mrr_at_5_std value: -2.5581887401849595 - type: nauc_ndcg_at_1000_diff1 value: 28.596338371171104 - type: nauc_ndcg_at_1000_max value: 26.398864343527546 - type: nauc_ndcg_at_1000_std value: 2.0928142009674264 - type: nauc_ndcg_at_100_diff1 value: 28.25901263389625 - type: nauc_ndcg_at_100_max value: 26.93052809711281 - type: nauc_ndcg_at_100_std value: 3.1368035623322266 - type: nauc_ndcg_at_10_diff1 value: 28.273504061219295 - type: nauc_ndcg_at_10_max value: 25.70274506672966 - type: nauc_ndcg_at_10_std value: 1.031980357515916 - type: nauc_ndcg_at_1_diff1 value: 35.288927336386486 - type: nauc_ndcg_at_1_max value: 23.407964640774143 - type: nauc_ndcg_at_1_std value: -3.2088824424845743 - type: nauc_ndcg_at_20_diff1 value: 28.27252389476242 - type: nauc_ndcg_at_20_max value: 26.959280568356686 - type: nauc_ndcg_at_20_std value: 2.355748254409649 - type: 
nauc_ndcg_at_3_diff1 value: 29.507109145825144 - type: nauc_ndcg_at_3_max value: 23.171704666301913 - type: nauc_ndcg_at_3_std value: -1.4521550440778286 - type: nauc_ndcg_at_5_diff1 value: 28.488416363267216 - type: nauc_ndcg_at_5_max value: 24.63470555569984 - type: nauc_ndcg_at_5_std value: -0.9243408985702865 - type: nauc_precision_at_1000_diff1 value: -1.6853041487515183 - type: nauc_precision_at_1000_max value: 7.960967030916032 - type: nauc_precision_at_1000_std value: 3.6491508412352784 - type: nauc_precision_at_100_diff1 value: 1.1138125936003078 - type: nauc_precision_at_100_max value: 14.425287491557784 - type: nauc_precision_at_100_std value: 8.976522577047673 - type: nauc_precision_at_10_diff1 value: 9.746060862351767 - type: nauc_precision_at_10_max value: 21.23608774117671 - type: nauc_precision_at_10_std value: 5.704741335087523 - type: nauc_precision_at_1_diff1 value: 35.288927336386486 - type: nauc_precision_at_1_max value: 23.407964640774143 - type: nauc_precision_at_1_std value: -3.2088824424845743 - type: nauc_precision_at_20_diff1 value: 6.326610022834949 - type: nauc_precision_at_20_max value: 20.35842844947274 - type: nauc_precision_at_20_std value: 8.561077634074318 - type: nauc_precision_at_3_diff1 value: 20.23921207457269 - type: nauc_precision_at_3_max value: 22.983126702497753 - type: nauc_precision_at_3_std value: 0.3762065769613514 - type: nauc_precision_at_5_diff1 value: 14.130374029335451 - type: nauc_precision_at_5_max value: 22.27280203101339 - type: nauc_precision_at_5_std value: 1.4403304333986182 - type: nauc_recall_at_1000_diff1 value: 5.336939388003354 - type: nauc_recall_at_1000_max value: 31.706880957377347 - type: nauc_recall_at_1000_std value: 34.42854130495 - type: nauc_recall_at_100_diff1 value: 13.06348098921675 - type: nauc_recall_at_100_max value: 35.43003105581946 - type: nauc_recall_at_100_std value: 28.949432461425634 - type: nauc_recall_at_10_diff1 value: 19.58510835348359 - type: nauc_recall_at_10_max value: 25.98205980928563 - type: nauc_recall_at_10_std value: 6.643640648680416 - type: nauc_recall_at_1_diff1 value: 31.726698171475988 - type: nauc_recall_at_1_max value: 18.706684051084675 - type: nauc_recall_at_1_std value: -3.1112088462944576 - type: nauc_recall_at_20_diff1 value: 17.50381042355996 - type: nauc_recall_at_20_max value: 31.185904487900324 - type: nauc_recall_at_20_std value: 13.510200942211565 - type: nauc_recall_at_3_diff1 value: 24.227382984516147 - type: nauc_recall_at_3_max value: 21.40248626451014 - type: nauc_recall_at_3_std value: -0.469137375497106 - type: nauc_recall_at_5_diff1 value: 21.25980638967181 - type: nauc_recall_at_5_max value: 23.853364661344404 - type: nauc_recall_at_5_std value: 0.7407724495151051 - type: ndcg_at_1 value: 41.253 - type: ndcg_at_10 value: 51.734 - type: ndcg_at_100 value: 56.796 - type: ndcg_at_1000 value: 58.044 - type: ndcg_at_20 value: 53.982 - type: ndcg_at_3 value: 44.448 - type: ndcg_at_5 value: 48.306 - type: precision_at_1 value: 41.253 - type: precision_at_10 value: 10.674 - type: precision_at_100 value: 1.437 - type: precision_at_1000 value: 0.159 - type: precision_at_20 value: 6.0280000000000005 - type: precision_at_3 value: 24.901 - type: precision_at_5 value: 18.038 - type: recall_at_1 value: 28.510999999999996 - type: recall_at_10 value: 65.646 - type: recall_at_100 value: 86.37 - type: recall_at_1000 value: 94.926 - type: recall_at_20 value: 73.236 - type: recall_at_3 value: 47.492000000000004 - type: recall_at_5 value: 56.552 task: type: Retrieval - dataset: config: 
default name: MTEB RuReviewsClassification (default) revision: f6d2c31f4dc6b88f468552750bfec05b4b41b05a split: test type: ai-forever/ru-reviews-classification metrics: - type: accuracy value: 60.6591796875 - type: f1 value: 60.34177974754267 - type: f1_weighted value: 60.3424791407144 - type: main_score value: 60.6591796875 task: type: Classification - dataset: config: default name: MTEB RuSTSBenchmarkSTS (default) revision: 7cf24f325c6da6195df55bef3d86b5e0616f3018 split: test type: ai-forever/ru-stsbenchmark-sts metrics: - type: cosine_pearson value: 78.67181755069355 - type: cosine_spearman value: 78.48157070388886 - type: euclidean_pearson value: 78.16400243944963 - type: euclidean_spearman value: 78.48124817526005 - type: main_score value: 78.48157070388886 - type: manhattan_pearson value: 78.04437263885238 - type: manhattan_spearman value: 78.34292373482941 task: type: STS - dataset: config: default name: MTEB RuSciBenchGRNTIClassification (default) revision: 673a610d6d3dd91a547a0d57ae1b56f37ebbf6a1 split: test type: ai-forever/ru-scibench-grnti-classification metrics: - type: accuracy value: 52.9296875 - type: f1 value: 51.36892216551846 - type: f1_weighted value: 51.38263945115431 - type: main_score value: 52.9296875 task: type: Classification - dataset: config: default name: MTEB RuSciBenchGRNTIClusteringP2P (default) revision: 673a610d6d3dd91a547a0d57ae1b56f37ebbf6a1 split: test type: ai-forever/ru-scibench-grnti-classification metrics: - type: main_score value: 47.548401486969844 - type: v_measure value: 47.548401486969844 - type: v_measure_std value: 0.9652047055316595 task: type: Clustering - dataset: config: default name: MTEB RuSciBenchOECDClassification (default) revision: 26c88e99dcaba32bb45d0e1bfc21902337f6d471 split: test type: ai-forever/ru-scibench-oecd-classification metrics: - type: accuracy value: 40.7861328125 - type: f1 value: 38.417161317304625 - type: f1_weighted value: 38.41751508417981 - type: main_score value: 40.7861328125 task: type: Classification - dataset: config: default name: MTEB RuSciBenchOECDClusteringP2P (default) revision: 26c88e99dcaba32bb45d0e1bfc21902337f6d471 split: test type: ai-forever/ru-scibench-oecd-classification metrics: - type: main_score value: 41.44039335680795 - type: v_measure value: 41.44039335680795 - type: v_measure_std value: 1.2447867997057736 task: type: Clustering - dataset: config: default name: MTEB SCDBPAccountabilityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 64.64379947229551 - type: ap value: 91.77095548714944 - type: ap_weighted value: 91.77095548714944 - type: f1 value: 56.37541231445849 - type: f1_weighted value: 70.25628045216064 - type: main_score value: 64.64379947229551 task: type: Classification - dataset: config: default name: MTEB SCDBPAuditsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 59.89445910290237 - type: ap value: 75.9408508806894 - type: ap_weighted value: 75.9408508806894 - type: f1 value: 59.26805814808528 - type: f1_weighted value: 61.147261012536525 - type: main_score value: 59.89445910290237 task: type: Classification - dataset: config: default name: MTEB SCDBPCertificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 59.78835978835979 - type: ap value: 79.40365504574285 - 
type: ap_weighted value: 79.40365504574285 - type: f1 value: 56.06802055297283 - type: f1_weighted value: 62.49406105045939 - type: main_score value: 59.78835978835979 task: type: Classification - dataset: config: default name: MTEB SCDBPTrainingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 59.102902374670194 - type: ap value: 78.86277214171828 - type: ap_weighted value: 78.86277214171828 - type: f1 value: 58.122144043570934 - type: f1_weighted value: 60.91223239928431 - type: main_score value: 59.102902374670194 task: type: Classification - dataset: config: default name: MTEB SCDBPVerificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 62.796833773087066 - type: ap value: 66.09764646131225 - type: ap_weighted value: 66.09764646131225 - type: f1 value: 62.562263119916494 - type: f1_weighted value: 62.19476909661592 - type: main_score value: 62.796833773087066 task: type: Classification - dataset: config: default name: MTEB SCDDAccountabilityLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 60.84656084656085 - type: ap value: 96.40608145845859 - type: ap_weighted value: 96.40608145845859 - type: f1 value: 46.04166666666668 - type: f1_weighted value: 71.16512345679011 - type: main_score value: 60.84656084656085 task: type: Classification - dataset: config: default name: MTEB SCDDAuditsLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 61.741424802110814 - type: ap value: 94.08312772646127 - type: ap_weighted value: 94.08312772646127 - type: f1 value: 50.59825064499599 - type: f1_weighted value: 69.72736628137642 - type: main_score value: 61.741424802110814 task: type: Classification - dataset: config: default name: MTEB SCDDCertificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 62.43386243386243 - type: ap value: 92.94462068443907 - type: ap_weighted value: 92.94462068443907 - type: f1 value: 49.37181663837012 - type: f1_weighted value: 70.32551510197236 - type: main_score value: 62.43386243386243 task: type: Classification - dataset: config: default name: MTEB SCDDTrainingLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 53.825857519788926 - type: ap value: 89.02073335965477 - type: ap_weighted value: 89.02073335965477 - type: f1 value: 47.22918407128933 - type: f1_weighted value: 60.86559112527728 - type: main_score value: 53.825857519788926 task: type: Classification - dataset: config: default name: MTEB SCDDVerificationLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 49.07651715039577 - type: ap value: 76.04960744098202 - type: ap_weighted value: 76.04960744098202 - type: f1 value: 47.939930963310914 - type: f1_weighted value: 51.65413225324895 - type: main_score value: 49.07651715039577 task: type: Classification - dataset: config: zh name: MTEB STS22 (zh) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts 
metrics: - type: cosine_pearson value: 10.783707479640047 - type: cosine_spearman value: 32.82859566062349 - type: euclidean_pearson value: 21.280811252412548 - type: euclidean_spearman value: 32.82859566062349 - type: main_score value: 32.82859566062349 - type: manhattan_pearson value: 21.510100649883686 - type: manhattan_spearman value: 32.924353350152195 task: type: STS - dataset: config: de-fr name: MTEB STS22 (de-fr) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 10.185699265034293 - type: cosine_spearman value: 17.504453225721367 - type: euclidean_pearson value: 11.256743769494715 - type: euclidean_spearman value: 17.504453225721367 - type: main_score value: 17.504453225721367 - type: manhattan_pearson value: 9.741426548627869 - type: manhattan_spearman value: 16.976476678309815 task: type: STS - dataset: config: pl-en name: MTEB STS22 (pl-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 44.8697112464095 - type: cosine_spearman value: 42.075721562892944 - type: euclidean_pearson value: 43.40637455102888 - type: euclidean_spearman value: 42.075721562892944 - type: main_score value: 42.075721562892944 - type: manhattan_pearson value: 45.13522626066653 - type: manhattan_spearman value: 42.53935152687679 task: type: STS - dataset: config: ru name: MTEB STS22 (ru) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 51.4108131114559 - type: cosine_spearman value: 60.05716921675363 - type: euclidean_pearson value: 52.595208834301246 - type: euclidean_spearman value: 60.05157835366835 - type: main_score value: 60.05716921675363 - type: manhattan_pearson value: 52.49640999228367 - type: manhattan_spearman value: 59.89412865698913 task: type: STS - dataset: config: fr name: MTEB STS22 (fr) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 26.610436064600535 - type: cosine_spearman value: 42.00247648193326 - type: euclidean_pearson value: 33.894760545223065 - type: euclidean_spearman value: 42.00247648193326 - type: main_score value: 42.00247648193326 - type: manhattan_pearson value: 33.80795212984925 - type: manhattan_spearman value: 42.14922985413102 task: type: STS - dataset: config: de name: MTEB STS22 (de) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: -5.737945045891398 - type: cosine_spearman value: 8.163885149544491 - type: euclidean_pearson value: -2.214478704390943 - type: euclidean_spearman value: 8.16472976205313 - type: main_score value: 8.163885149544491 - type: manhattan_pearson value: -1.7539096573944195 - type: manhattan_spearman value: 8.6906872178124 task: type: STS - dataset: config: tr name: MTEB STS22 (tr) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 2.043942714330888 - type: cosine_spearman value: 15.459553758272923 - type: euclidean_pearson value: 8.816942314411607 - type: euclidean_spearman value: 15.459553758272923 - type: main_score value: 15.459553758272923 - type: manhattan_pearson value: 9.32963790399984 - type: manhattan_spearman value: 15.7857074615967 task: type: STS - dataset: config: de-en name: MTEB STS22 (de-en) revision: 
de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 17.695301514955418 - type: cosine_spearman value: 21.545599222945675 - type: euclidean_pearson value: 18.353827841283753 - type: euclidean_spearman value: 21.545599222945675 - type: main_score value: 21.545599222945675 - type: manhattan_pearson value: 17.009036963688505 - type: manhattan_spearman value: 20.508582325360287 task: type: STS - dataset: config: it name: MTEB STS22 (it) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 32.630588839696415 - type: cosine_spearman value: 39.69250140711604 - type: euclidean_pearson value: 37.54122176804933 - type: euclidean_spearman value: 39.69250140711604 - type: main_score value: 39.69250140711604 - type: manhattan_pearson value: 37.79703600372667 - type: manhattan_spearman value: 39.742229485575024 task: type: STS - dataset: config: pl name: MTEB STS22 (pl) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 0.3113685198259237 - type: cosine_spearman value: 9.707385637292596 - type: euclidean_pearson value: -2.4832855952463206 - type: euclidean_spearman value: 9.80177503118972 - type: main_score value: 9.707385637292596 - type: manhattan_pearson value: -2.325293004138977 - type: manhattan_spearman value: 10.060452403624826 task: type: STS - dataset: config: fr-pl name: MTEB STS22 (fr-pl) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 47.546556133158575 - type: cosine_spearman value: 39.440531887330785 - type: euclidean_pearson value: 48.2920143634797 - type: euclidean_spearman value: 39.440531887330785 - type: main_score value: 39.440531887330785 - type: manhattan_pearson value: 45.769523538925824 - type: manhattan_spearman value: 50.709255283710995 task: type: STS - dataset: config: de-pl name: MTEB STS22 (de-pl) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 0.33007020080694816 - type: cosine_spearman value: 25.52831180119127 - type: euclidean_pearson value: 5.7124033000823164 - type: euclidean_spearman value: 25.52831180119127 - type: main_score value: 25.52831180119127 - type: manhattan_pearson value: 5.62314566860622 - type: manhattan_spearman value: 23.83463610871175 task: type: STS - dataset: config: ar name: MTEB STS22 (ar) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 22.766025640460693 - type: cosine_spearman value: 27.950069575571522 - type: euclidean_pearson value: 26.551723755491363 - type: euclidean_spearman value: 27.939678639817668 - type: main_score value: 27.950069575571522 - type: manhattan_pearson value: 26.681060475093854 - type: manhattan_spearman value: 27.986878582632468 task: type: STS - dataset: config: es-en name: MTEB STS22 (es-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 38.597910358452815 - type: cosine_spearman value: 42.766194189894094 - type: euclidean_pearson value: 39.991306255692045 - type: euclidean_spearman value: 42.766194189894094 - type: main_score value: 42.766194189894094 - type: manhattan_pearson value: 39.74918349185897 - type: manhattan_spearman value: 
42.574140880355976 task: type: STS - dataset: config: es-it name: MTEB STS22 (es-it) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 31.245905627830638 - type: cosine_spearman value: 32.83240215980029 - type: euclidean_pearson value: 33.06481984956772 - type: euclidean_spearman value: 32.83240215980029 - type: main_score value: 32.83240215980029 - type: manhattan_pearson value: 32.75706899386791 - type: manhattan_spearman value: 32.334081823391806 task: type: STS - dataset: config: es name: MTEB STS22 (es) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 16.966347433363 - type: cosine_spearman value: 45.3129129914676 - type: euclidean_pearson value: 28.50940505249936 - type: euclidean_spearman value: 45.3129129914676 - type: main_score value: 45.3129129914676 - type: manhattan_pearson value: 28.314847203862147 - type: manhattan_spearman value: 45.72042962859271 task: type: STS - dataset: config: zh-en name: MTEB STS22 (zh-en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 34.66358594216254 - type: cosine_spearman value: 31.24659955360722 - type: euclidean_pearson value: 34.878197534840744 - type: euclidean_spearman value: 31.24659955360722 - type: main_score value: 31.24659955360722 - type: manhattan_pearson value: 34.70743093532992 - type: manhattan_spearman value: 30.441251812127955 task: type: STS - dataset: config: en name: MTEB STS22 (en) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 41.376318618780324 - type: cosine_spearman value: 47.061970299820764 - type: euclidean_pearson value: 44.89590651276241 - type: euclidean_spearman value: 47.061970299820764 - type: main_score value: 47.061970299820764 - type: manhattan_pearson value: 44.780089700405576 - type: manhattan_spearman value: 46.742447019531525 task: type: STS - dataset: config: default name: MTEB SensitiveTopicsClassification (default) revision: 416b34a802308eac30e4192afc0ff99bb8dcc7f2 split: test type: ai-forever/sensitive-topics-classification metrics: - type: accuracy value: 24.443359375 - type: f1 value: 21.903258801323084 - type: lrap value: 36.34758843315896 - type: main_score value: 24.443359375 task: type: MultilabelClassification - dataset: config: default name: MTEB StackExchangeClustering (default) revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 split: test type: mteb/stackexchange-clustering metrics: - type: main_score value: 33.50613168016603 - type: v_measure value: 33.50613168016603 - type: v_measure_std value: 3.91782276122889 task: type: Clustering - dataset: config: default name: MTEB StackExchangeClusteringP2P (default) revision: 815ca46b2622cec33ccafc3735d572c266efdb44 split: test type: mteb/stackexchange-clustering-p2p metrics: - type: main_score value: 27.98150942889309 - type: v_measure value: 27.98150942889309 - type: v_measure_std value: 2.0056109104136226 task: type: Clustering - dataset: config: default name: MTEB TERRa (default) revision: 7b58f24536063837d644aab9a023c62199b2a612 split: dev type: ai-forever/terra-pairclassification metrics: - type: cosine_accuracy value: 59.60912052117264 - type: cosine_accuracy_threshold value: 81.55556917190552 - type: cosine_ap value: 56.08760299515377 - type: cosine_f1 value: 67.33167082294264 - type: 
cosine_f1_threshold value: 78.14505100250244 - type: cosine_precision value: 54.43548387096774 - type: cosine_recall value: 88.23529411764706 - type: dot_accuracy value: 59.60912052117264 - type: dot_accuracy_threshold value: 81.55556917190552 - type: dot_ap value: 56.08760299515377 - type: dot_f1 value: 67.33167082294264 - type: dot_f1_threshold value: 78.14503908157349 - type: dot_precision value: 54.43548387096774 - type: dot_recall value: 88.23529411764706 - type: euclidean_accuracy value: 59.60912052117264 - type: euclidean_accuracy_threshold value: 60.736143589019775 - type: euclidean_ap value: 56.08760299515377 - type: euclidean_f1 value: 67.33167082294264 - type: euclidean_f1_threshold value: 66.11342430114746 - type: euclidean_precision value: 54.43548387096774 - type: euclidean_recall value: 88.23529411764706 - type: main_score value: 56.265447472512676 - type: manhattan_accuracy value: 60.91205211726385 - type: manhattan_accuracy_threshold value: 877.9421806335449 - type: manhattan_ap value: 56.265447472512676 - type: manhattan_f1 value: 67.16791979949875 - type: manhattan_f1_threshold value: 930.9440612792969 - type: manhattan_precision value: 54.47154471544715 - type: manhattan_recall value: 87.58169934640523 - type: max_ap value: 56.265447472512676 - type: max_f1 value: 67.33167082294264 - type: max_precision value: 54.47154471544715 - type: max_recall value: 88.23529411764706 - type: similarity_accuracy value: 59.60912052117264 - type: similarity_accuracy_threshold value: 81.55557513237 - type: similarity_ap value: 56.08760299515377 - type: similarity_f1 value: 67.33167082294264 - type: similarity_f1_threshold value: 78.1450629234314 - type: similarity_precision value: 54.43548387096774 - type: similarity_recall value: 88.23529411764706 task: type: PairClassification - dataset: config: default name: MTEB TelemarketingSalesRuleLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 51.06382978723404 - type: ap value: 64.12529550827422 - type: ap_weighted value: 64.12529550827422 - type: f1 value: 48.74348032242769 - type: f1_weighted value: 46.65516580410197 - type: main_score value: 51.06382978723404 task: type: Classification - dataset: config: default name: MTEB TextualismToolDictionariesLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 69.1588785046729 - type: ap value: 13.91484942886812 - type: ap_weighted value: 13.91484942886812 - type: f1 value: 53.57001972386588 - type: f1_weighted value: 75.94757507050821 - type: main_score value: 69.1588785046729 task: type: Classification - dataset: config: default name: MTEB TextualismToolPlainLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 52.121212121212125 - type: ap value: 44.68029172320217 - type: ap_weighted value: 44.68029172320217 - type: f1 value: 50.48433048433048 - type: f1_weighted value: 48.79288612621945 - type: main_score value: 52.121212121212125 task: type: Classification - dataset: config: default name: MTEB ToxicChatClassification (default) revision: 3e0319203c7162b9c9f8015b594441f979c199bc split: test type: lmsys/toxic-chat metrics: - type: accuracy value: 73.56529209621992 - type: ap value: 21.641229801673067 - type: ap_weighted value: 21.641229801673067 - type: f1 value: 60.19489676894062 - type: 
f1_weighted value: 77.21280694246968 - type: main_score value: 73.56529209621992 task: type: Classification - dataset: config: default name: MTEB ToxicConversationsClassification (default) revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de split: test type: mteb/toxic_conversations_50k metrics: - type: accuracy value: 57.7734375 - type: ap value: 9.305482173252097 - type: ap_weighted value: 9.305482173252097 - type: f1 value: 44.43839832998249 - type: f1_weighted value: 67.10615100631958 - type: main_score value: 57.7734375 task: type: Classification - dataset: config: default name: MTEB TweetSentimentExtractionClassification (default) revision: d604517c81ca91fe16a244d1248fc021f9ecee7a split: test type: mteb/tweet_sentiment_extraction metrics: - type: accuracy value: 55.29994340690435 - type: f1 value: 55.3098112653406 - type: f1_weighted value: 54.4846442708958 - type: main_score value: 55.29994340690435 task: type: Classification - dataset: config: default name: MTEB TweetTopicSingleClassification (default) revision: 87b7a0d1c402dbb481db649569c556d9aa27ac05 split: test_2021 type: cardiffnlp/tweet_topic_single metrics: - type: accuracy value: 52.522150029533364 - type: f1 value: 40.24714634897976 - type: f1_weighted value: 57.39523757985323 - type: main_score value: 52.522150029533364 task: type: Classification - dataset: config: default name: MTEB TwentyNewsgroupsClustering (default) revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 split: test type: mteb/twentynewsgroups-clustering metrics: - type: main_score value: 19.90344454285597 - type: v_measure value: 19.90344454285597 - type: v_measure_std value: 1.8260774855268984 task: type: Clustering - dataset: config: default name: MTEB UCCVCommonLawLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 52.127659574468076 - type: ap value: 42.829212190914326 - type: ap_weighted value: 42.829212190914326 - type: f1 value: 50.50895050895051 - type: f1_weighted value: 51.84200503349441 - type: main_score value: 52.127659574468076 task: type: Classification - dataset: config: default name: MTEB UnfairTOSLegalBenchClassification (default) revision: 12ca3b695563788fead87a982ad1a068284413f4 split: test type: nguha/legalbench metrics: - type: accuracy value: 19.3359375 - type: f1 value: 11.24236763925133 - type: f1_weighted value: 27.137659267661597 - type: main_score value: 19.3359375 task: type: Classification - dataset: config: default name: MTEB VieMedEVBitextMining (default) revision: d03c69413bc53d1cea5a5375b3a953c4fee35ecd split: test type: nhuvo/MedEV metrics: - type: accuracy value: 8.69140625 - type: f1 value: 7.772120924359041 - type: main_score value: 7.772120924359041 - type: precision value: 7.525730353438241 - type: recall value: 8.69140625 task: type: BitextMining - dataset: config: default name: MTEB WikiCitiesClustering (default) revision: ddc9ee9242fa65332597f70e967ecc38b9d734fa split: test type: jinaai/cities_wiki_clustering metrics: - type: main_score value: 56.66855146861069 - type: v_measure value: 56.66855146861069 - type: v_measure_std value: 0.0 task: type: Clustering - dataset: config: default name: MTEB YahooAnswersTopicsClassification (default) revision: 78fccffa043240c80e17a6b1da724f5a1057e8e5 split: test type: community-datasets/yahoo_answers_topics metrics: - type: accuracy value: 41.787109375 - type: f1 value: 40.33967050694529 - type: f1_weighted value: 40.3509380795682 - type: main_score value: 41.787109375 task: 
type: Classification - dataset: config: default name: MTEB YelpReviewFullClassification (default) revision: c1f9ee939b7d05667af864ee1cb066393154bf85 split: test type: Yelp/yelp_review_full metrics: - type: accuracy value: 43.5888671875 - type: f1 value: 42.36578282497966 - type: f1_weighted value: 42.363220099893724 - type: main_score value: 43.5888671875 task: type: Classification --- A fast BERT model for computing sentence embeddings in Russian. The model is based on cointegrated/rubert-tiny2 and matches it in context size (2048), embedding dimension (312), and speed. ## Usage ## Metrics Model scores on the encodechka benchmark: | model | CPU | GPU | size | Mean S | Mean S+W | dim | |:-----------------------------------|----------:|---------:|---------:|----------:|-----------:|-------:| | sergeyzh/LaBSE-ru-turbo | 120.40 | 8.05 | 490 | 0.789 | 0.702 | 768 | | BAAI/bge-m3 | 523.40 | 22.50 | 2166 | 0.787 | 0.696 | 1024 | | intfloat/multilingual-e5-large | 506.80 | 30.80 | 2136 | 0.780 | 0.686 | 1024 | | intfloat/multilingual-e5-base | 130.61 | 14.39 | 1061 | 0.761 | 0.669 | 768 | | **sergeyzh/rubert-tiny-turbo** | 5.51 | 3.25 | 111 | 0.749 | 0.667 | 312 | | intfloat/multilingual-e5-small | 40.86 | 12.09 | 449 | 0.742 | 0.645 | 384 | | cointegrated/rubert-tiny2 | 5.51 | 3.25 | 111 | 0.704 | 0.638 | 312 | | model | STS | PI | NLI | SA | TI | IA | IC | ICX | NE1 | NE2 | |:-----------------------------------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------| | sergeyzh/LaBSE-ru-turbo | 0.864 | 0.748 | 0.490 | 0.814 | 0.974 | 0.806 | 0.815 | 0.801 | 0.305 | 0.404 | | BAAI/bge-m3 | 0.864 | 0.749 | 0.510 | 0.819 | 0.973 | 0.792 | 0.809 | 0.783 | 0.240 | 0.422 | | intfloat/multilingual-e5-large | 0.862 | 0.727 | 0.473 | 0.810 | 0.979 | 0.798 | 0.819 | 0.773 | 0.224 | 0.374 | | intfloat/multilingual-e5-base | 0.835 | 0.704 | 0.459 | 0.796 | 0.964 | 0.783 | 0.802 | 0.738 | 0.235 | 0.376 | | **sergeyzh/rubert-tiny-turbo** | 0.828 | 0.722 | 0.476 | 0.787 | 0.955 | 0.757 | 0.780 | 0.685 | 0.305 | 0.373 | | intfloat/multilingual-e5-small | 0.822 | 0.714 | 0.457 | 0.758 | 0.957 | 0.761 | 0.779 | 0.691 | 0.234 | 0.275 | | cointegrated/rubert-tiny2 | 0.750 | 0.651 | 0.417 | 0.737 | 0.937 | 0.746 | 0.757 | 0.638 | 0.360 | 0.386 | Model scores on the ruMTEB benchmark: |Model Name | Metric | sbert_large_ mt_nlu_ru | sbert_large_ nlu_ru | rubert-tiny2 | rubert-tiny-turbo | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large | |:----------------------------------|:--------------------|-----------------------:|--------------------:|----------------:|------------------:|----------------------:|---------------------:|----------------------:| |CEDRClassification | Accuracy | 0.368 | 0.358 | 0.369 | 0.390 | 0.401 | 0.423 | **0.448** | |GeoreviewClassification | Accuracy | 0.397 | 0.400 | 0.396 | 0.414 | 0.447 | 0.461 | **0.497** | |GeoreviewClusteringP2P | V-measure | 0.584 | 0.590 | 0.442 | 0.597 | 0.586 | 0.545 | **0.605** | |HeadlineClassification | Accuracy | 0.772 | **0.793** | 0.742 | 0.686 | 0.732 | 0.757 | 0.758 | |InappropriatenessClassification | Accuracy | **0.646** | 0.625 | 0.586 | 0.591 | 0.592 | 0.588 | 0.616 | |KinopoiskClassification | Accuracy | 0.503 | 0.495 | 0.491 | 0.505 | 0.500 | 0.509 | **0.566** | |RiaNewsRetrieval | NDCG@10 | 0.214 | 0.111 | 0.140 | 0.513 | 0.700 | 0.702 | **0.807** | |RuBQReranking | MAP@10 | 0.561 | 0.468 | 0.461 | 0.622 | 0.715 | 0.720 | **0.756** | |RuBQRetrieval | NDCG@10 | 
0.298 | 0.124 | 0.109 | 0.517 | 0.685 | 0.696 | **0.741** | |RuReviewsClassification | Accuracy | 0.589 | 0.583 | 0.570 | 0.607 | 0.612 | 0.630 | **0.653** | |RuSTSBenchmarkSTS | Pearson correlation | 0.712 | 0.588 | 0.694 | 0.787 | 0.781 | 0.796 | **0.831** | |RuSciBenchGRNTIClassification | Accuracy | 0.542 | 0.539 | 0.456 | 0.529 | 0.550 | 0.563 | **0.582** | |RuSciBenchGRNTIClusteringP2P | V-measure | **0.522** | 0.504 | 0.414 | 0.481 | 0.511 | 0.516 | 0.520 | |RuSciBenchOECDClassification | Accuracy | 0.438 | 0.430 | 0.355 | 0.415 | 0.427 | 0.423 | **0.445** | |RuSciBenchOECDClusteringP2P | V-measure | **0.473** | 0.464 | 0.381 | 0.411 | 0.443 | 0.448 | 0.450 | |SensitiveTopicsClassification | Accuracy | **0.285** | 0.280 | 0.220 | 0.244 | 0.228 | 0.234 | 0.257 | |TERRaClassification | Average Precision | 0.520 | 0.502 | 0.519 | 0.563 | 0.551 | 0.550 | **0.584** | |Model Name | Metric | sbert_large_ mt_nlu_ru | sbert_large_ nlu_ru | rubert-tiny2 | rubert-tiny-turbo | multilingual-e5-small | multilingual-e5-base | multilingual-e5-large | |:----------------------------------|:--------------------|-----------------------:|--------------------:|----------------:|------------------:|----------------------:|----------------------:|---------------------:| |Classification | Accuracy | 0.554 | 0.552 | 0.514 | 0.535 | 0.551 | 0.561 | **0.588** | |Clustering | V-measure | **0.526** | 0.519 | 0.412 | 0.496 | 0.513 | 0.503 | 0.525 | |MultiLabelClassification | Accuracy | 0.326 | 0.319 | 0.294 | 0.317 | 0.314 | 0.329 | **0.353** | |PairClassification | Average Precision | 0.520 | 0.502 | 0.519 | 0.563 | 0.551 | 0.550 | **0.584** | |Reranking | MAP@10 | 0.561 | 0.468 | 0.461 | 0.622 | 0.715 | 0.720 | **0.756** | |Retrieval | NDCG@10 | 0.256 | 0.118 | 0.124 | 0.515 | 0.697 | 0.699 | **0.774** | |STS | Pearson correlation | 0.712 | 0.588 | 0.694 | 0.787 | 0.781 | 0.796 | **0.831** | |Average | Average | 0.494 | 0.438 | 0.431 | 0.548 | 0.588 | 0.594 | **0.630** |", + "model_explanation_gemini": "Generates Russian-language sentence embeddings for similarity tasks, optimized as a lightweight variant of 'cointegrated/rubert-tiny2'.\n\nFeatures: \n- Language: Russian \n- Tasks: Sentence similarity, feature extraction \n- Architecture: Pretrained transformer (sentence-transformers) \n- Base model: cointegrated/rubert-tiny2 \n- Size: Tiny (lightweight) \n- Datasets: Fine-tuned on IlyaGusev/gazeta and zloel" +} \ No newline at end of file diff --git a/model_data_json/shi-labs_oneformer_cityscapes_dinat_large.json b/model_data_json/shi-labs_oneformer_cityscapes_dinat_large.json new file mode 100644 index 0000000000000000000000000000000000000000..de7a7a1065ccf52413f620045823ab7c4a9d5fc5 --- /dev/null +++ b/model_data_json/shi-labs_oneformer_cityscapes_dinat_large.json @@ -0,0 +1,18 @@ +{ + "model_id": "shi-labs/oneformer_cityscapes_dinat_large", + "downloads": 75688, + "tags": [ + "transformers", + "pytorch", + "oneformer", + "vision", + "image-segmentation", + "dataset:huggan/cityscapes", + "arxiv:2211.06220", + "license:mit", + "endpoints_compatible", + "region:us" + ], + "description": "--- license: mit tags: - vision - image-segmentation datasets: - huggan/cityscapes widget: - src: example_title: Cityscapes --- # OneFormer OneFormer model trained on the Cityscapes dataset (large-sized version, Dinat backbone). It was introduced in the paper OneFormer: One Transformer to Rule Universal Image Segmentation by Jain et al. and first released in this repository. 
## Model description OneFormer is the first multi-task universal image segmentation framework. It needs to be trained only once with a single universal architecture, a single model, and on a single dataset, to outperform existing specialized models across semantic, instance, and panoptic segmentation tasks. OneFormer uses a task token to condition the model on the task in focus, making the architecture task-guided for training, and task-dynamic for inference, all with a single model. ## Intended uses & limitations You can use this particular checkpoint for semantic, instance and panoptic segmentation. See the model hub to look for other fine-tuned versions on a different dataset. ### How to use Here is how to use this model: For more examples, please refer to the documentation. ### Citation", + "model_explanation_gemini": "\"OneFormer is a universal image segmentation model trained on Cityscapes, capable of performing semantic, instance, and panoptic segmentation tasks with a single Dinat-large backbone, using task tokens to dynamically adapt to each task.\"\n\n### Model Features: \n- **Multi-task segmentation**: Handles semantic, instance, and panoptic segmentation. \n- **Single-model architecture**: Eliminates the need for task-specific models. \n- **Task conditioning**: Uses task tokens to dynamically switch between tasks. \n- **" +} \ No newline at end of file diff --git a/model_data_json/skshmjn_Pokemon-classifier-gen9-1025.json b/model_data_json/skshmjn_Pokemon-classifier-gen9-1025.json new file mode 100644 index 0000000000000000000000000000000000000000..54b8222f1f75e90c54477f4ad54b337b68d66277 --- /dev/null +++ b/model_data_json/skshmjn_Pokemon-classifier-gen9-1025.json @@ -0,0 +1,23 @@ +{ + "model_id": "skshmjn/Pokemon-classifier-gen9-1025", + "downloads": 78944, + "tags": [ + "transformers", + "safetensors", + "vit", + "image-classification", + "vision", + "pokemon", + "pytorch", + "en", + "dataset:custom", + "base_model:google/vit-base-patch16-224-in21k", + "base_model:finetune:google/vit-base-patch16-224-in21k", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en tags: - vision - image-classification - pokemon - pytorch - transformers license: apache-2.0 datasets: - custom pipeline_tag: image-classification model_name: Pokemon Classifier Gen9 model_id: skshmjn/Pokemon-classifier-gen9-1025 library_name: transformers framework: PyTorch widget: - src: test.jpg metrics: - accuracy base_model: - google/vit-base-patch16-224-in21k --- # Model Card for Pokemon Classifier Gen9 ## Model Overview This is a fine-tuned ViT (Vision Transformer) model for Pokémon image classification. The model is trained to classify up to Gen 9 (1025) Pokémon images. ## Intended Use This model is designed for image classification tasks, specifically for identifying Pokémon characters. It can be used for: - Pokémon-themed apps - Educational projects - Pokémon identification in images **Note**: The model is not designed for general-purpose image classification. 
## How to Use Here's how you can load and use the model with the Hugging Face library: ```python from transformers import ViTForImageClassification, ViTImageProcessor from PIL import Image import torch # Define the device device = \"cuda\" if torch.cuda.is_available() else \"cpu\" # Load the model and image processor model_id = \"skshmjn/Pokemon-classifier-gen9-1025\" model = ViTForImageClassification.from_pretrained(model_id).to(device) image_processor = ViTImageProcessor.from_pretrained(model_id) # Load and process an image img = Image.open('test.jpg').convert(\"RGB\") inputs = image_processor(images=img, return_tensors='pt').to(device) # Make predictions outputs = model(**inputs) predicted_id = outputs.logits.argmax(-1).item() predicted_pokemon = model.config.id2label[predicted_id] # Print predicted class print(f\"Predicted Pokémon Pokédex number: {predicted_id+1}\") print(f\"Predicted Pokémon: {predicted_pokemon}\") ```", + "model_explanation_gemini": "Classifies Pokémon images up to Generation 9 (1025 Pokémon) using a fine-tuned Vision Transformer (ViT) model. \n\n**Features**: \n- Image classification task (Pokémon-specific) \n- Supports up to Generation 9 (1025 classes) \n- Fine-tuned from `google/vit-base-patch16-224-in21k` \n- Uses PyTorch and Hugging Face Transformers \n\n**Comparison**: Unlike general-purpose image classifiers, this model is specialized" +} \ No newline at end of file diff --git a/model_data_json/stabilityai_stable-diffusion-3.5-large.json b/model_data_json/stabilityai_stable-diffusion-3.5-large.json new file mode 100644 index 0000000000000000000000000000000000000000..d7b50d8de4056c50fa4a3435c25be0f26e374ec4 --- /dev/null +++ b/model_data_json/stabilityai_stable-diffusion-3.5-large.json @@ -0,0 +1,17 @@ +{ + "model_id": "stabilityai/stable-diffusion-3.5-large", + "downloads": 141943, + "tags": [ + "diffusers", + "safetensors", + "text-to-image", + "stable-diffusion", + "en", + "arxiv:2403.03206", + "license:other", + "diffusers:StableDiffusion3Pipeline", + "region:us" + ], + "description": "--- license: other license_name: stabilityai-ai-community license_link: LICENSE.md tags: - text-to-image - stable-diffusion - diffusers inference: true extra_gated_prompt: >- By clicking \"Agree\", you agree to the License Agreement and acknowledge Stability AI's Privacy Policy. extra_gated_fields: Name: text Email: text Country: country Organization or Affiliation: text Receive email updates and promotions on Stability AI products, services, and research?: type: select options: - 'Yes' - 'No' What do you intend to use the model for?: type: select options: - Research - Personal use - Creative Professional - Startup - Enterprise I agree to the License Agreement and acknowledge Stability AI's Privacy Policy: checkbox language: - en pipeline_tag: text-to-image --- # Stable Diffusion 3.5 Large ## Model Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. Please note: This model is released under the Stability Community License. Visit Stability AI to learn more or contact us for commercial licensing details. ### Model Description - **Developed by:** Stability AI - **Model type:** MMDiT text-to-image generative model - **Model Description:** This model generates images based on text prompts. 
It is a Multimodal Diffusion Transformer that uses three fixed, pretrained text encoders, with QK-normalization to improve training stability. ### License - **Community License:** Free for research, non-commercial, and commercial use for organizations or individuals with less than $1M in total annual revenue. More details can be found in the Community License Agreement. Read more at - **For individuals and organizations with annual revenue above $1M**: please contact us to get an Enterprise License. ### Model Sources For local or self-hosted use, we recommend ComfyUI for node-based UI inference, or diffusers or GitHub for programmatic use. - **ComfyUI:** Github, Example Workflow - **Huggingface Space:** Space - **Diffusers**: See below. - **GitHub**: GitHub. - **API Endpoints:** - Stability AI API - Replicate - Deepinfra ### Implementation Details - **QK Normalization:** Implements the QK normalization technique to improve training stability. - **Text Encoders:** - CLIPs: OpenCLIP-ViT/G, CLIP-ViT/L, context length 77 tokens - T5: T5-xxl, context length 77/256 tokens at different stages of training - **Training Data and Strategy:** This model was trained on a wide variety of data, including synthetic data and filtered publicly available data. For more technical details of the original MMDiT architecture, please refer to the Research paper. ### Model Performance See blog for our study about comparative performance in prompt adherence and aesthetic quality. ## File Structure Click here to access the Files and versions tab ## Using with Diffusers Upgrade to the latest version of the 🧨 diffusers library, and then you can run: ### Quantizing the model with diffusers Reduce your VRAM usage and have the model fit on 🤏 VRAM GPUs. ### Fine-tuning Please see the fine-tuning guide here. ## Uses ### Intended Uses Intended uses include the following: * Generation of artworks and use in design and other artistic processes. * Applications in educational or creative tools. * Research on generative models, including understanding the limitations of generative models. All uses of the model must be in accordance with our Acceptable Use Policy. ### Out-of-Scope Uses The model was not trained to be factual or true representations of people or events. As such, using the model to generate such content is out-of-scope of the abilities of this model. ## Safety As part of our safety-by-design and responsible AI deployment approach, we take deliberate measures to ensure integrity starts at the early stages of development. We implement safety measures throughout the development of our models. We have implemented safety mitigations that are intended to reduce the risk of certain harms; however, we recommend that developers conduct their own testing and apply additional mitigations based on their specific use cases. For more about our approach to Safety, please visit our Safety page. ### Integrity Evaluation Our integrity evaluation methods include structured evaluations and red-teaming testing for certain harms. Testing was conducted primarily in English and may not cover all possible harms. ### Risks identified and mitigations: * Harmful content: We have used filtered data sets when training our models and implemented safeguards that attempt to strike the right balance between usefulness and preventing harm. However, this does not guarantee that all possible harmful content has been removed. 
All developers and deployers should exercise caution and implement content safety guardrails based on their specific product policies and application use cases. * Misuse: Technical limitations and developer and end-user education can help mitigate against malicious applications of models. All users are required to adhere to our Acceptable Use Policy, including when applying fine-tuning and prompt engineering mechanisms. Please reference the Stability AI Acceptable Use Policy for information on violative uses of our products. * Privacy violations: Developers and deployers are encouraged to adhere to privacy regulations with techniques that respect data privacy. ### Contact Please report any issues with the model or contact us: * Safety issues: safety@stability.ai * Security issues: security@stability.ai * Privacy issues: privacy@stability.ai * License and general: * Enterprise license:", + "model_explanation_gemini": "Generates high-quality images from text prompts using a Multimodal Diffusion Transformer (MMDiT) with improved typography, prompt understanding, and efficiency. \n\n**Features:** \n- Multimodal Diffusion Transformer (MMDiT) architecture \n- Enhanced image quality and typography \n- Better complex prompt comprehension \n- Resource-efficient generation \n- Three fixed pretrained text encoders (CLIP-ViT/G, CLIP-ViT/L, T5-xxl) \n- Q" +} \ No newline at end of file diff --git a/model_data_json/stabilityai_stable-diffusion-3.5-medium.json b/model_data_json/stabilityai_stable-diffusion-3.5-medium.json new file mode 100644 index 0000000000000000000000000000000000000000..bf99e21cac2eb0d52cf880a97d5eaf8e18efab65 --- /dev/null +++ b/model_data_json/stabilityai_stable-diffusion-3.5-medium.json @@ -0,0 +1,17 @@ +{ + "model_id": "stabilityai/stable-diffusion-3.5-medium", + "downloads": 407393, + "tags": [ + "diffusers", + "safetensors", + "text-to-image", + "stable-diffusion", + "en", + "arxiv:2403.03206", + "license:other", + "diffusers:StableDiffusion3Pipeline", + "region:us" + ], + "description": "--- license: other license_name: stabilityai-ai-community license_link: LICENSE.md tags: - text-to-image - stable-diffusion - diffusers inference: true extra_gated_prompt: >- By clicking \"Agree\", you agree to the License Agreement and acknowledge Stability AI's Privacy Policy. extra_gated_fields: Name: text Email: text Country: country Organization or Affiliation: text Receive email updates and promotions on Stability AI products, services, and research?: type: select options: - 'Yes' - 'No' What do you intend to use the model for?: type: select options: - Research - Personal use - Creative Professional - Startup - Enterprise I agree to the License Agreement and acknowledge Stability AI's Privacy Policy: checkbox language: - en pipeline_tag: text-to-image --- # Stable Diffusion 3.5 Medium ## Model Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer with improvements (MMDiT-X) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. Please note: This model is released under the Stability Community License. Visit Stability AI to learn more or contact us for commercial licensing details. ### Model Description - **Developed by:** Stability AI - **Model type:** MMDiT-X text-to-image generative model - **Model Description:** This model generates images based on text prompts. 
It is a Multimodal Diffusion Transformer with improvements that uses three fixed, pretrained text encoders, with QK-normalization to improve training stability, and dual attention blocks in the first 12 transformer layers. ### License - **Community License:** Free for research, non-commercial, and commercial use for organizations or individuals with less than $1M in total annual revenue. More details can be found in the Community License Agreement. Read more at - **For individuals and organizations with annual revenue above $1M**: please contact us to get an Enterprise License. ### Model Sources For local or self-hosted use, we recommend ComfyUI for node-based UI inference, or diffusers or GitHub for programmatic use. - **ComfyUI:** Github, Example Workflow - **Huggingface Space:** Space - **Diffusers**: See below. - **GitHub**: GitHub. - **API Endpoints:** - Stability AI API ### Implementation Details - **MMDiT-X:** Introduces self-attention modules in the first 13 layers of the transformer, enhancing multi-resolution generation and overall image coherence. - **QK Normalization:** Implements the QK normalization technique to improve training stability. - **Mixed-Resolution Training:** - Progressive training stages: 256 → 512 → 768 → 1024 → 1440 resolution - The final stage included mixed-scale image training to boost multi-resolution generation performance - Extended positional embedding space to 384x384 (latent) at lower resolution stages - Employed random crop augmentation on positional embeddings to enhance transformer layer robustness across the entire range of mixed resolutions and aspect ratios. For example, given a 64x64 latent image, we add a randomly cropped 64x64 embedding from the 192x192 embedding space during training as the input to the x stream. These enhancements collectively contribute to the model's improved performance in multi-resolution image generation, coherence, and adaptability across various text-to-image tasks. - **Text Encoders:** - CLIPs: OpenCLIP-ViT/G, CLIP-ViT/L, context length 77 tokens - T5: T5-xxl, context length 77/256 tokens at different stages of training - **Training Data and Strategy:** This model was trained on a wide variety of data, including synthetic data and filtered publicly available data. For more technical details of the original MMDiT architecture, please refer to the Research paper. ### Usage & Limitations - While this model can handle long prompts, you may observe artifacts on the edge of generations when T5 tokens go over 256. Pay attention to the token limits when using this model in your workflow, and shorten prompts if artifacts become too obvious. - The medium model has a different training data distribution than the large model, so it may not respond to the same prompt similarly. - We recommend sampling with **Skip Layer Guidance** for better structure and anatomy coherency. ### Model Performance See blog for our study about comparative performance in prompt adherence and aesthetic quality. ## File Structure Click here to access the Files and versions tab ## Using with Diffusers Upgrade to the latest version of the 🧨 diffusers library, and then you can run: ### Quantizing the model with diffusers Reduce your VRAM usage and have the model fit on 🤏 VRAM GPUs. ### Fine-tuning Please see the fine-tuning guide here. ## Uses ### Intended Uses Intended uses include the following: * Generation of artworks and use in design and other artistic processes. * Applications in educational or creative tools. 
* Research on generative models, including understanding the limitations of generative models. All uses of the model must be in accordance with our Acceptable Use Policy. ### Out-of-Scope Uses The model was not trained to be factual or true representations of people or events. As such, using the model to generate such content is out-of-scope of the abilities of this model. ## Safety As part of our safety-by-design and responsible AI deployment approach, we take deliberate measures to ensure integrity starts at the early stages of development. We implement safety measures throughout the development of our models. We have implemented safety mitigations that are intended to reduce the risk of certain harms; however, we recommend that developers conduct their own testing and apply additional mitigations based on their specific use cases. For more about our approach to Safety, please visit our Safety page. ### Integrity Evaluation Our integrity evaluation methods include structured evaluations and red-teaming testing for certain harms. Testing was conducted primarily in English and may not cover all possible harms. ### Risks identified and mitigations: * Harmful content: We have used filtered data sets when training our models and implemented safeguards that attempt to strike the right balance between usefulness and preventing harm. However, this does not guarantee that all possible harmful content has been removed. All developers and deployers should exercise caution and implement content safety guardrails based on their specific product policies and application use cases. * Misuse: Technical limitations and developer and end-user education can help mitigate against malicious applications of models. All users are required to adhere to our Acceptable Use Policy, including when applying fine-tuning and prompt engineering mechanisms. Please reference the Stability AI Acceptable Use Policy for information on violative uses of our products. * Privacy violations: Developers and deployers are encouraged to adhere to privacy regulations with techniques that respect data privacy. ### Contact Please report any issues with the model or contact us: * Safety issues: safety@stability.ai * Security issues: security@stability.ai * Privacy issues: privacy@stability.ai * License and general: * Enterprise license:", + "model_explanation_gemini": "Generates high-quality images from text prompts with improved typography, complex prompt understanding, and resource efficiency using a Multimodal Diffusion Transformer (MMDiT-X) architecture. 
\n\n**Features:** \n- Multimodal Diffusion Transformer (MMDiT-X) with enhanced self-attention and QK normalization \n- Improved image quality, typography, and prompt comprehension \n- Supports mixed-resolution training (256 → 1440) \n- Uses three pretrained text encoders (CLIP-Vi" +} \ No newline at end of file diff --git a/model_data_json/starvector_starvector-8b-im2svg.json b/model_data_json/starvector_starvector-8b-im2svg.json new file mode 100644 index 0000000000000000000000000000000000000000..fc4ab02be79091be8255c4a1a16ff8e058c64c7a --- /dev/null +++ b/model_data_json/starvector_starvector-8b-im2svg.json @@ -0,0 +1,18 @@ +{ + "model_id": "starvector/starvector-8b-im2svg", + "downloads": 81636, + "tags": [ + "transformers", + "safetensors", + "starvector", + "text-generation", + "custom_code", + "en", + "arxiv:2312.11556", + "license:apache-2.0", + "autotrain_compatible", + "region:us" + ], + "description": "--- library_name: transformers license: apache-2.0 language: - en --- # Model Card for StarVector StarVector is a foundation model for generating Scalable Vector Graphics (SVG) code from images and text. It utilizes a Vision-Language Modeling architecture to understand both visual and textual inputs, enabling high-quality vectorization and text-guided SVG creation. ## Model Details ### Model Description This is the model card for the StarVector model, a 🤗 transformers model. StarVector is a foundation model for generating Scalable Vector Graphics (SVG) code from images and text. It utilizes a Vision-Language Modeling architecture to understand both visual and textual inputs, enabling high-quality vectorization and text-guided SVG creation. - **Developed by:** ServiceNow Research, Mila - Quebec AI Institute, ETS, Montreal. - **Shared by:** Juan A Rodriguez, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Sai Rajeswar, Pau Rodriguez, David Vazquez, Christopher Pal, Marco Pedersoli. - **Model type:** Vision-Language Model for SVG Generation. - **Language(s) (NLP):** English. - **License:** Apache 2.0 ### Model Architecture The StarVector architecture integrates an image encoder and a Large Language Model (LLM) Adapter to generate SVG code from both image and text inputs. Images are first converted into embeddings using a Vision Transformer (ViT), after which the LLM Adapter maps these embeddings into the LLM's embedding space to create visual tokens. Text prompts are handled through the LLM's tokenizer and embedder. This unified multimodal approach ensures precise and contextually rich SVG output.
\"Figure
Figure 2: a) StarVector Architecture: StarVector projects images into embeddings via an image encoder, then maps these embeddings to the LLM hidden space using an LLM Adapter, generating Visual Tokens. Text conditioning is achieved with the LLM's tokenizer and embedder. The model learns to map token sequences (visual or textual) to SVG code. The symbol ⊕ denotes mutually exclusive operations (image-to-SVG or text-to-SVG), while ‖ indicates sequence concatenation. b) Vision Model and Adapter: The image encoder employs a Vision Transformer (ViT) to process image patches sequentially. The LLM Adapter non-linearly projects embeddings into visual tokens for LLM integration.
### Model Sources - **Repository:** - **Paper:** ## Uses ### Direct Use Image-to-SVG generation, Text-to-SVG generation. ### Downstream Use Creation of icons, logotypes, technical diagrams, and other vector graphics. ### Out-of-Scope Use Generating realistic photographic images or complex 3D graphics. ## Bias, Risks, and Limitations Potential biases may exist in the model due to the composition of the training data (SVG-Stack). The model's ability to perfectly vectorize all types of images and interpret all textual instructions may have limitations. Users should be aware of these potential issues, especially in critical applications. ### Recommendations Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Further investigation into the model's behavior across different types of inputs is recommended. ## How to Get Started with the Model Use the code below to get started with the model (a hedged sketch follows the results table below). ## Training Details ### Training Data SVG-Stack: A dataset of over 2 million SVG samples. ### Training Procedure The model utilizes a Vision-Language Modeling architecture. Images are projected into embeddings via an image encoder, then mapped to the LLM hidden space using an LLM Adapter, generating Visual Tokens. Text conditioning is achieved with the LLM's tokenizer and embedder. The model learns to map token sequences (visual or textual) to SVG code. ## Evaluation ### Testing Data & Factors #### Testing Data SVG-Bench #### Factors SVG-Stack, SVG-Fonts, SVG-Icons, SVG-Emoji, SVG-Diagrams. ## Models StarVector models achieve state-of-the-art performance on SVG generation tasks. We provide Hugging Face 🤗 model checkpoints for image2SVG vectorization, for 💫 StarVector-8B and 💫 StarVector-1B. These are the results on SVG-Bench, using the DinoScore metric. | Method | SVG-Stack | SVG-Fonts | SVG-Icons | SVG-Emoji | SVG-Diagrams | |--------------------|-----------|-----------|-----------|-----------|--------------| | AutoTrace | 0.942 | 0.954 | 0.946 | 0.975 | 0.874 | | Potrace | 0.898 | 0.967 | 0.972 | 0.882 | 0.875 | | VTracer | 0.954 | 0.964 | 0.940 | 0.981 | 0.882 | | Im2Vec | 0.692 | 0.733 | 0.754 | 0.732 | - | | LIVE | 0.934 | 0.956 | 0.959 | 0.969 | 0.870 | | DiffVG | 0.810 | 0.821 | 0.952 | 0.814 | 0.822 | | GPT-4-V | 0.852 | 0.842 | 0.848 | 0.850 | - | | 💫 **StarVector-1B** | 0.926 | 0.978 | 0.975 | 0.929 | 0.943 | | 💫 **StarVector-8B** | 0.966 | 0.982 | 0.984 | 0.981 | 0.959 | **Note:** StarVector models will not work for natural images or illustrations, as they have not been trained on those images. They excel in vectorizing icons, logotypes, technical diagrams, graphs, and charts. As shown in the table above, StarVector-8B achieves the highest performance across all benchmark datasets, demonstrating its effectiveness in generating high-quality SVG code from images. The model's ability to understand and reproduce complex vector graphics makes it particularly valuable for applications requiring precise vectorization of icons, logos, and technical diagrams.
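The "How to Get Started" snippet referenced above was stripped during extraction. Below is a minimal, unofficial sketch of image-to-SVG inference; the `model.model.processor` attribute and the `generate_im2svg` helper are assumptions taken from the project's repository and should be verified there before use.

```python
# Unofficial sketch of image-to-SVG inference with StarVector.
# Assumptions (verify against the StarVector repository): the checkpoint's
# custom code exposes an image processor at model.model.processor and an
# image-to-SVG generation helper named generate_im2svg.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

model_name = 'starvector/starvector-8b-im2svg'
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, trust_remote_code=True
)
model.cuda().eval()

processor = model.model.processor  # bundled with the remote code (assumption)
image = Image.open('icon.png').convert('RGB')  # hypothetical input file
pixel_values = processor(image, return_tensors='pt')['pixel_values'].cuda()

with torch.no_grad():
    svg_code = model.generate_im2svg({'image': pixel_values}, max_length=4000)[0]
print(svg_code)
```

As the note above stresses, inputs should be icons, logos, diagrams, or charts rather than natural photographs.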
We believe that StarVector will enable new applications in design, illustration, and technical documentation, making vector graphics more accessible and easier to create. We invite the research community to build upon our work and explore new directions in this exciting field. For more details, please refer to our paper and explore our code repository. ## BibTeX entry and citation info", + "model_explanation_gemini": "Generates Scalable Vector Graphics (SVG) code from images or text using a vision-language modeling architecture. \n\n**Features:** \n- Image-to-SVG and text-to-SVG generation \n- Vision-Language Model (ViT + LLM Adapter) architecture \n- Supports high-quality vectorization and text-guided SVG creation \n- Trained on SVG-Stack dataset (2M+ SVG samples) \n- Outputs icons, logotypes, diagrams, and other vector graphics \n\n**Comparison" +} \ No newline at end of file diff --git a/model_data_json/stepfun-ai_GOT-OCR2_0.json b/model_data_json/stepfun-ai_GOT-OCR2_0.json new file mode 100644 index 0000000000000000000000000000000000000000..21cdd90860cb60c5220809e9b5be61fcd7bf26bd --- /dev/null +++ b/model_data_json/stepfun-ai_GOT-OCR2_0.json @@ -0,0 +1,21 @@ +{ + "model_id": "stepfun-ai/GOT-OCR2_0", + "downloads": 80978, + "tags": [ + "safetensors", + "GOT", + "got", + "vision-language", + "ocr2.0", + "custom_code", + "image-text-to-text", + "multilingual", + "arxiv:2409.01704", + "arxiv:2405.14295", + "arxiv:2312.06109", + "license:apache-2.0", + "region:us" + ], + "description": "--- pipeline_tag: image-text-to-text language: - multilingual tags: - got - vision-language - ocr2.0 - custom_code license: apache-2.0 ---

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

🔋Online Demo | 🌟GitHub | 📜Paper
Haoran Wei*, Chenglong Liu*, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, Xiangyu Zhang ## Usage Inference uses Hugging Face transformers on NVIDIA GPUs; requirements were tested on Python 3.10. More details about 'ocr_type', 'ocr_box', 'ocr_color', and 'render' can be found at our GitHub. Our training codes are available at our GitHub. ## More Multimodal Projects 👏 Welcome to explore more multimodal projects of our team: Vary | Fox | OneChart ## Citation If you find our work helpful, please consider citing our papers 📝 and liking this project ❤️!", + "model_explanation_gemini": "Performs multilingual image-to-text OCR (Optical Character Recognition) with end-to-end unified processing, supporting custom configurations like text type, bounding boxes, and colors.\n\nModel Features: \n- Multilingual image-text-to-text conversion \n- End-to-end unified OCR processing \n- Customizable OCR parameters (text type, bounding boxes, colors) \n- GPU-optimized inference \n- Open-source training code \n\nComparison: \nThis model positions itself as \"OCR-2.0\" with unified end" +} \ No newline at end of file diff --git a/model_data_json/tasksource_ModernBERT-large-nli.json b/model_data_json/tasksource_ModernBERT-large-nli.json new file mode 100644 index 0000000000000000000000000000000000000000..0c8106f97d7ce7bc165fadddab486ed2b164331e --- /dev/null +++ b/model_data_json/tasksource_ModernBERT-large-nli.json @@ -0,0 +1,25 @@ +{ + "model_id": "tasksource/ModernBERT-large-nli", + "downloads": 82059, + "tags": [ + "transformers", + "safetensors", + "modernbert", + "text-classification", + "instruct", + "natural-language-inference", + "nli", + "zero-shot-classification", + "en", + "dataset:nyu-mll/glue", + "dataset:facebook/anli", + "base_model:answerdotai/ModernBERT-large", + "base_model:finetune:answerdotai/ModernBERT-large", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers base_model: - answerdotai/ModernBERT-large license: apache-2.0 language: - en pipeline_tag: zero-shot-classification datasets: - nyu-mll/glue - facebook/anli tags: - instruct - natural-language-inference - nli --- # Model Card for Model ID This model is ModernBERT multi-task fine-tuned on tasksource NLI tasks, including MNLI, ANLI, SICK, WANLI, doc-nli, LingNLI, FOLIO, FOL-NLI, LogicNLI, Label-NLI, and all datasets in the table below. This is the equivalent of an \"instruct\" version. The model was trained for 200k steps on an Nvidia A30 GPU. It is very good at reasoning tasks (better than llama 3.1 8B Instruct on ANLI and FOLIO), long-context reasoning, sentiment analysis, and zero-shot classification with new labels. The following table shows model test accuracy. These are the scores for the same single transformer with different classification heads on top. Further gains can be obtained by fine-tuning on a single task, e.g. SST, but this checkpoint is great for zero-shot classification and natural language inference (contradiction/entailment/neutral classification). 
| test_name | test_accuracy | |:--------------------------------------|----------------:| | glue/mnli | 0.89 | | glue/qnli | 0.96 | | glue/rte | 0.91 | | glue/wnli | 0.64 | | glue/mrpc | 0.81 | | glue/qqp | 0.87 | | glue/cola | 0.87 | | glue/sst2 | 0.96 | | super_glue/boolq | 0.66 | | super_glue/cb | 0.86 | | super_glue/multirc | 0.9 | | super_glue/wic | 0.71 | | super_glue/axg | 1 | | anli/a1 | 0.72 | | anli/a2 | 0.54 | | anli/a3 | 0.55 | | sick/label | 0.91 | | sick/entailment_AB | 0.93 | | snli | 0.94 | | scitail/snli_format | 0.95 | | hans | 1 | | WANLI | 0.77 | | recast/recast_ner | 0.85 | | recast/recast_sentiment | 0.97 | | recast/recast_verbnet | 0.89 | | recast/recast_megaveridicality | 0.87 | | recast/recast_verbcorner | 0.87 | | recast/recast_kg_relations | 0.9 | | recast/recast_factuality | 0.95 | | recast/recast_puns | 0.98 | | probability_words_nli/reasoning_1hop | 1 | | probability_words_nli/usnli | 0.79 | | probability_words_nli/reasoning_2hop | 0.98 | | nan-nli | 0.85 | | nli_fever | 0.78 | | breaking_nli | 0.99 | | conj_nli | 0.72 | | fracas | 0.79 | | dialogue_nli | 0.94 | | mpe | 0.75 | | dnc | 0.91 | | recast_white/fnplus | 0.76 | | recast_white/sprl | 0.9 | | recast_white/dpr | 0.84 | | add_one_rte | 0.94 | | paws/labeled_final | 0.96 | | pragmeval/pdtb | 0.56 | | lex_glue/scotus | 0.58 | | lex_glue/ledgar | 0.85 | | dynasent/dynabench.dynasent.r1.all/r1 | 0.83 | | dynasent/dynabench.dynasent.r2.all/r2 | 0.76 | | cycic_classification | 0.96 | | lingnli | 0.91 | | monotonicity-entailment | 0.97 | | scinli | 0.88 | | naturallogic | 0.93 | | dynahate | 0.86 | | syntactic-augmentation-nli | 0.94 | | autotnli | 0.92 | | defeasible-nli/atomic | 0.83 | | defeasible-nli/snli | 0.8 | | help-nli | 0.96 | | nli-veridicality-transitivity | 0.99 | | lonli | 0.99 | | dadc-limit-nli | 0.79 | | folio | 0.71 | | tomi-nli | 0.54 | | puzzte | 0.59 | | temporal-nli | 0.93 | | counterfactually-augmented-snli | 0.81 | | cnli | 0.9 | | boolq-natural-perturbations | 0.72 | | equate | 0.65 | | logiqa-2.0-nli | 0.58 | | mindgames | 0.96 | | ConTRoL-nli | 0.66 | | logical-fallacy | 0.38 | | cladder | 0.89 | | conceptrules_v2 | 1 | | zero-shot-label-nli | 0.79 | | scone | 1 | | monli | 1 | | SpaceNLI | 1 | | propsegment/nli | 0.92 | | FLD.v2/default | 0.91 | | FLD.v2/star | 0.78 | | SDOH-NLI | 0.99 | | scifact_entailment | 0.87 | | feasibilityQA | 0.79 | | AdjectiveScaleProbe-nli | 1 | | resnli | 1 | | semantic_fragments_nli | 1 | | dataset_train_nli | 0.95 | | nlgraph | 0.97 | | ruletaker | 0.99 | | PARARULE-Plus | 1 | | logical-entailment | 0.93 | | nope | 0.56 | | LogicNLI | 0.91 | | contract-nli/contractnli_a/seg | 0.88 | | contract-nli/contractnli_b/full | 0.84 | | nli4ct_semeval2024 | 0.72 | | biosift-nli | 0.92 | | SIGA-nli | 0.57 | | FOL-nli | 0.79 | | doc-nli | 0.81 | | mctest-nli | 0.92 | | natural-language-satisfiability | 0.92 | | idioms-nli | 0.83 | | lifecycle-entailment | 0.79 | | MSciNLI | 0.84 | | hover-3way/nli | 0.92 | | seahorse_summarization_evaluation | 0.81 | | missing-item-prediction/contrastive | 0.88 | | Pol_NLI | 0.93 | | synthetic-retrieval-NLI/count | 0.72 | | synthetic-retrieval-NLI/position | 0.9 | | synthetic-retrieval-NLI/binary | 0.92 | | babi_nli | 0.98 | # Usage ## [ZS] Zero-shot classification pipeline NLI training data of this model includes label-nli, an NLI dataset specially constructed to improve this kind of zero-shot classification. 
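A minimal example of this zero-shot usage with the standard transformers pipeline (the label set is illustrative, not part of the original card):

```python
# Zero-shot classification via the standard Hugging Face pipeline; the card's
# pipeline_tag is zero-shot-classification, so no extra setup is needed.
from transformers import pipeline

classifier = pipeline(
    'zero-shot-classification',
    model='tasksource/ModernBERT-large-nli',
)

result = classifier(
    'The new driver update finally fixed the rendering glitches.',
    candidate_labels=['software', 'hardware', 'sports'],
)
print(result['labels'][0], round(result['scores'][0], 3))  # top label and score
```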
## [NLI] Natural language inference pipeline ## Backbone for further fine-tuning This checkpoint has stronger reasoning and fine-grained abilities than the base version and can be used for further fine-tuning. # Citation", + "model_explanation_gemini": "A multi-task fine-tuned ModernBERT model specializing in zero-shot classification, natural language inference (NLI), and reasoning tasks, outperforming models like Llama 3.1 8B Instruct on benchmarks like ANLI and FOLIO. \n\n**Features**: \n- Zero-shot classification with new labels \n- Natural language inference (contradiction/entailment/neutral classification) \n- Strong performance on reasoning tasks (long-context, logical, and sentiment analysis) \n- Multi" +} \ No newline at end of file diff --git a/model_data_json/tiiuae_falcon-7b.json b/model_data_json/tiiuae_falcon-7b.json new file mode 100644 index 0000000000000000000000000000000000000000..e4cb52af8f8a7bd08ddcc03281f28468f402c6b3 --- /dev/null +++ b/model_data_json/tiiuae_falcon-7b.json @@ -0,0 +1,26 @@ +{ + "model_id": "tiiuae/falcon-7b", + "downloads": 70655, + "tags": [ + "transformers", + "pytorch", + "safetensors", + "falcon", + "text-generation", + "custom_code", + "en", + "dataset:tiiuae/falcon-refinedweb", + "arxiv:2205.14135", + "arxiv:1911.02150", + "arxiv:2101.00027", + "arxiv:2005.14165", + "arxiv:2104.09864", + "arxiv:2306.01116", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- datasets: - tiiuae/falcon-refinedweb language: - en inference: false license: apache-2.0 new_version: tiiuae/falcon-11B --- # 🚀 Falcon-7B **Falcon-7B is a 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license.** *Paper coming soon* 😊. 🤗 To get started with Falcon (inference, finetuning, quantization, etc.), we recommend reading this great blog post from HF! ## Why use Falcon-7B? * **It outperforms comparable open-source models** (e.g., MPT-7B, StableLM, RedPajama etc.), thanks to being trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard. * **It features an architecture optimized for inference**, with FlashAttention (Dao et al., 2022) and multiquery (Shazeer et al., 2019). * **It is made available under a permissive Apache 2.0 license allowing for commercial use**, without any royalties or restrictions. ⚠️ **This is a raw, pretrained model, which should be further finetuned for most use cases.** If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-7B-Instruct. 🔥 **Looking for an even more powerful model?** Falcon-40B is Falcon-7B's big brother! 💥 **Falcon LLMs require PyTorch 2.0 for use with !** For fast inference with Falcon, check out Text Generation Inference! Read more in this blog post. You will need **at least 16GB of memory** to swiftly run inference with Falcon-7B. # Model Card for Falcon-7B ## Model Details ### Model Description - **Developed by:** - **Model type:** Causal decoder-only; - **Language(s) (NLP):** English, German, Spanish, French (and limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish); - **License:** Apache 2.0. ### Model Source - **Paper:** *coming soon*. 
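Ahead of the usage guidance below, a minimal text-generation sketch for this checkpoint; the prompt and sampling settings are illustrative, and `trust_remote_code=True` reflects the custom Falcon code path flagged in this repo's tags:

```python
import torch
from transformers import pipeline

# bfloat16 plus device_map="auto" keeps the 7B model within roughly 16GB of memory.
generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Falcon ships custom modeling code
    device_map="auto",
)

out = generator("Once upon a time, a falcon", max_new_tokens=50, do_sample=True)
print(out[0]["generated_text"])
```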
## Uses ### Direct Use Research on large language models; as a foundation for further specialization and finetuning for specific use cases (e.g., summarization, text generation, chatbot, etc.) ### Out-of-Scope Use Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. ## Bias, Risks, and Limitations Falcon-7B is trained on English and French data only, and will not generalize appropriately to other languages. Furthermore, as it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online. ### Recommendations We recommend that users of Falcon-7B consider finetuning it for the specific set of tasks of interest, and that guardrails and appropriate precautions be taken for any production use. ## How to Get Started with the Model ## Training Details ### Training Data Falcon-7B was trained on 1,500B tokens of RefinedWeb, a high-quality filtered and deduplicated web dataset which we enhanced with curated corpora. Significant components from our curated corpora were inspired by The Pile (Gao et al., 2020). | **Data source** | **Fraction** | **Tokens** | **Sources** | |--------------------|--------------|------------|-----------------------------------| | RefinedWeb-English | 79% | 1,185B | massive web crawl | | Books | 7% | 110B | | | Conversations | 6% | 85B | Reddit, StackOverflow, HackerNews | | Code | 3% | 45B | | | RefinedWeb-French | 3% | 45B | massive web crawl | | Technical | 2% | 30B | arXiv, PubMed, USPTO, etc. | The data was tokenized with the Falcon-7B/40B tokenizer. ### Training Procedure Falcon-7B was trained on 384 A100 40GB GPUs, using a 2D parallelism strategy (PP=2, DP=192) combined with ZeRO. #### Training Hyperparameters | **Hyperparameter** | **Value** | **Comment** | |--------------------|------------|-------------------------------------------| | Precision | | | | Optimizer | AdamW | | | Learning rate | 6e-4 | 4B tokens warm-up, cosine decay to 1.2e-5 | | Weight decay | 1e-1 | | | Z-loss | 1e-4 | | | Batch size | 2304 | 30B tokens ramp-up | #### Speeds, Sizes, Times Training happened in early March 2023 and took about two weeks. ## Evaluation *Paper coming soon*. See the OpenLLM Leaderboard for early results. ## Technical Specifications ### Model Architecture and Objective Falcon-7B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token). The architecture is broadly adapted from the GPT-3 paper (Brown et al., 2020), with the following differences: * **Positional embeddings:** rotary (Su et al., 2021); * **Attention:** multiquery (Shazeer et al., 2019) and FlashAttention (Dao et al., 2022); * **Decoder-block:** parallel attention/MLP with a single layer norm. | **Hyperparameter** | **Value** | **Comment** | |--------------------|-----------|----------------------------------------| | Layers | 32 | | | | 4544 | Increased to compensate for multiquery | | | 64 | Reduced to optimise for FlashAttention | | Vocabulary | 65024 | | | Sequence length | 2048 | | ### Compute Infrastructure #### Hardware Falcon-7B was trained on AWS SageMaker, on 384 A100 40GB GPUs in P4d instances. #### Software Falcon-7B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.). ## Citation *Paper coming soon* 😊. 
In the meanwhile, you can use the following information to cite: To learn more about the pretraining dataset, see the 📓 RefinedWeb paper. ## License Falcon-7B is made available under the Apache 2.0 license. ## Contact falconllm@tii.ae", + "model_explanation_gemini": "Falcon-7B is a 7-billion-parameter, causal decoder-only model trained on 1,500B tokens of refined web and curated data, optimized for inference and outperforming similar open-source models like MPT-7B and StableLM under a permissive Apache 2.0 license. \n\n**Features:** \n- **Architecture:** Causal decoder-only with FlashAttention and multiquery optimization. \n- **Training Data:** 1,500B tokens from" +} \ No newline at end of file diff --git a/model_data_json/tiiuae_falcon-rw-1b.json b/model_data_json/tiiuae_falcon-rw-1b.json new file mode 100644 index 0000000000000000000000000000000000000000..fd9181788611bf0aab120373c28e37bbe08eb341 --- /dev/null +++ b/model_data_json/tiiuae_falcon-rw-1b.json @@ -0,0 +1,23 @@ +{ + "model_id": "tiiuae/falcon-rw-1b", + "downloads": 72026, + "tags": [ + "transformers", + "pytorch", + "falcon", + "text-generation", + "custom_code", + "en", + "dataset:tiiuae/falcon-refinedweb", + "arxiv:2306.01116", + "arxiv:2005.14165", + "arxiv:2108.12409", + "arxiv:2205.14135", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "region:us" + ], + "description": "--- datasets: - tiiuae/falcon-refinedweb language: - en inference: false license: apache-2.0 --- # Falcon-RW-1B **Falcon-RW-1B is a 1B parameters causal decoder-only model built by TII and trained on 350B tokens of RefinedWeb. It is made available under the Apache 2.0 license.** See the 📓 paper on arXiv for more details. RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-1B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data. ⚠️ Falcon is now available as a core model in the library! To use the in-library version, please install the latest version of with , then simply remove the argument from . ⚠️ This model is intended for use as a **research artifact**, to study the influence of training on web data alone. **If you are interested in state-of-the-art models, we recommend using Falcon-7B/40B, both trained on >1,000 billion tokens.** 💥 **Falcon LLMs require PyTorch 2.0 for use with !** # Model Card for Falcon-RW-1B ## Model Details ### Model Description - **Developed by:** - **Model type:** Causal decoder-only; - **Language(s) (NLP):** English; - **License:** Apache 2.0. ### Model Source - **Paper:** ## Uses ### Direct Use Research on large language models, specifically the influence of adequately filtered and deduplicated web data on the properties of large language models (fairness, safety, limitations, capabilities, etc.). ### Out-of-Scope Use Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. Broadly speaking, we would recommend Falcon-7B/40B for any use not directly related to research on web data pipelines. ## Bias, Risks, and Limitations Falcon-RW-1B is trained on English data only, and will not generalize appropriately to other languages. Furthermore, as it is trained on a large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online. 
### Recommendations We recommend that users of Falcon-RW-1B consider finetuning it for the specific set of tasks of interest, and that guardrails and appropriate precautions be taken for any production use. ## How to Get Started with the Model ## Training Details ### Training Data Falcon-RW-1B was trained on 350B tokens of RefinedWeb, a high-quality filtered and deduplicated web dataset. The data was tokenized with the GPT-2 tokenizer. ### Training Procedure Falcon-RW-1B was trained on 32 A100 40GB GPUs, using only data parallelism with ZeRO. #### Training Hyperparameters Hyperparameters were adapted from the GPT-3 paper (Brown et al., 2020). | **Hyperparameter** | **Value** | **Comment** | |--------------------|------------|-------------------------------------------| | Precision | | | | Optimizer | AdamW | | | Learning rate | 2e-4 | 500M tokens warm-up, cosine decay to 2e-5 | | Weight decay | 1e-1 | | | Batch size | 512 | 4B tokens ramp-up | #### Speeds, Sizes, Times Training happened in early December 2022 and took about six days. ## Evaluation See the 📓 paper on arXiv for in-depth evaluation. ## Technical Specifications ### Model Architecture and Objective Falcon-RW-1B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token). The architecture is adapted from the GPT-3 paper (Brown et al., 2020), but uses ALiBi (Press et al., 2021) and FlashAttention (Dao et al., 2022). | **Hyperparameter** | **Value** | **Comment** | |--------------------|-----------|----------------------------------------| | Layers | 24 | | | | 2048 | | | | 64 | Reduced to optimise for FlashAttention | | Vocabulary | 50304 | | | Sequence length | 2048 | | ### Compute Infrastructure #### Hardware Falcon-RW-1B was trained on AWS SageMaker, on 32 A100 40GB GPUs in P4d instances. #### Software Falcon-RW-1B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.). ## Citation ## Contact falconllm@tii.ae", + "model_explanation_gemini": "\"Falcon-RW-1B is a 1-billion-parameter English causal decoder-only model trained on 350B tokens of RefinedWeb for researching the impact of filtered web data on large language models, outperforming comparable models trained on curated data.\"\n\n**Model Features:** \n- **Architecture:** Causal decoder-only (GPT-3 style) with ALiBi and FlashAttention \n- **Training Data:** 350B tokens from RefinedWeb (filtered, ded" +} \ No newline at end of file diff --git a/model_data_json/timm_ViT-SO400M-14-SigLIP.json b/model_data_json/timm_ViT-SO400M-14-SigLIP.json new file mode 100644 index 0000000000000000000000000000000000000000..eda4b33462c6e17c4e94601c7bfadb8b334fac8e --- /dev/null +++ b/model_data_json/timm_ViT-SO400M-14-SigLIP.json @@ -0,0 +1,17 @@ +{ + "model_id": "timm/ViT-SO400M-14-SigLIP", + "downloads": 72007, + "tags": [ + "open_clip", + "safetensors", + "clip", + "siglip", + "zero-shot-image-classification", + "dataset:webli", + "arxiv:2303.15343", + "license:apache-2.0", + "region:us" + ], + "description": "--- tags: - clip - siglip library_name: open_clip pipeline_tag: zero-shot-image-classification license: apache-2.0 datasets: - webli --- # Model card for ViT-SO400M-14-SigLIP A SigLIP (Sigmoid loss for Language-Image Pre-training) model trained on WebLI. This model has been converted to PyTorch from the original JAX checkpoints in Big Vision. These weights are usable in both OpenCLIP (image + text) and timm (image only). 
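A minimal zero-shot sketch with OpenCLIP (the image path and labels are illustrative); note that SigLIP scores each image-text pair with a sigmoid rather than a softmax across labels:

```python
import torch
from PIL import Image
import open_clip

repo = "hf-hub:timm/ViT-SO400M-14-SigLIP"
model, preprocess = open_clip.create_model_from_pretrained(repo)
tokenizer = open_clip.get_tokenizer(repo)

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # illustrative image path
text = tokenizer(["a dog", "a cat", "a plate of beignets"])  # illustrative labels

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    # SigLIP: sigmoid over scaled pairwise logits plus a learned bias
    probs = torch.sigmoid(img_feat @ txt_feat.T * model.logit_scale.exp() + model.logit_bias)
print(probs)
```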
## Model Details - **Model Type:** Contrastive Image-Text, Zero-Shot Image Classification. - **Original:** - **Dataset:** WebLI - **Papers:** - Sigmoid loss for language image pre-training: ## Model Usage ### With OpenCLIP ### With timm (for image embeddings) ## Citation", + "model_explanation_gemini": "A SigLIP model trained on WebLI for zero-shot image classification and contrastive image-text tasks, usable in OpenCLIP (image + text) and timm (image only). \n\n**Features:** \n- Model type: Contrastive Image-Text, Zero-Shot Image Classification \n- Dataset: WebLI \n- Frameworks: OpenCLIP (image + text), timm (image only) \n- Original implementation: Converted from JAX (Big Vision) to PyTor" +} \ No newline at end of file diff --git a/model_data_json/timm_convnext_tiny.in12k.json b/model_data_json/timm_convnext_tiny.in12k.json new file mode 100644 index 0000000000000000000000000000000000000000..c17e972cc602d5d41983c3083d90e3d9a963517b --- /dev/null +++ b/model_data_json/timm_convnext_tiny.in12k.json @@ -0,0 +1,17 @@ +{ + "model_id": "timm/convnext_tiny.in12k", + "downloads": 69867, + "tags": [ + "timm", + "pytorch", + "safetensors", + "image-classification", + "transformers", + "dataset:imagenet-12k", + "arxiv:2201.03545", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 library_name: timm tags: - image-classification - timm - transformers datasets: - imagenet-12k --- # Model card for convnext_tiny.in12k A ConvNeXt image classification model. Trained in timm on ImageNet-12k (an 11,821-class subset of full ImageNet-22k) by Ross Wightman. ImageNet-12k training done on TPUs thanks to support of the TRC program. ## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 36.9 - GMACs: 4.5 - Activations (M): 13.4 - Image size: 224 x 224 - **Papers:** - A ConvNet for the 2020s: - **Original:** - **Dataset:** ImageNet-12k ## Model Usage ### Image Classification ### Feature Map Extraction ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. All timing numbers from eager model PyTorch 1.13 on RTX 3090 w/ AMP. 
| model |top1 |top5 |img_size|param_count|gmacs |macts |samples_per_sec|batch_size|
|------------------------------------------------------------------------------------------------------------------------------|------|------|--------|-----------|------|------|---------------|----------|
| convnextv2_huge.fcmae_ft_in22k_in1k_512 |88.848|98.742|512 |660.29 |600.81|413.07|28.58 |48 |
| convnextv2_huge.fcmae_ft_in22k_in1k_384 |88.668|98.738|384 |660.29 |337.96|232.35|50.56 |64 |
| convnext_xxlarge.clip_laion2b_soup_ft_in1k |88.612|98.704|256 |846.47 |198.09|124.45|122.45 |256 |
| convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_384 |88.312|98.578|384 |200.13 |101.11|126.74|196.84 |256 |
| convnextv2_large.fcmae_ft_in22k_in1k_384 |88.196|98.532|384 |197.96 |101.1 |126.74|128.94 |128 |
| convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_320 |87.968|98.47 |320 |200.13 |70.21 |88.02 |283.42 |256 |
| convnext_xlarge.fb_in22k_ft_in1k_384 |87.75 |98.556|384 |350.2 |179.2 |168.99|124.85 |192 |
| convnextv2_base.fcmae_ft_in22k_in1k_384 |87.646|98.422|384 |88.72 |45.21 |84.49 |209.51 |256 |
| convnext_large.fb_in22k_ft_in1k_384 |87.476|98.382|384 |197.77 |101.1 |126.74|194.66 |256 |
| convnext_large_mlp.clip_laion2b_augreg_ft_in1k |87.344|98.218|256 |200.13 |44.94 |56.33 |438.08 |256 |
| convnextv2_large.fcmae_ft_in22k_in1k |87.26 |98.248|224 |197.96 |34.4 |43.13 |376.84 |256 |
| convnext_base.clip_laion2b_augreg_ft_in12k_in1k_384 |87.138|98.212|384 |88.59 |45.21 |84.49 |365.47 |256 |
| convnext_xlarge.fb_in22k_ft_in1k |87.002|98.208|224 |350.2 |60.98 |57.5 |368.01 |256 |
| convnext_base.fb_in22k_ft_in1k_384 |86.796|98.264|384 |88.59 |45.21 |84.49 |366.54 |256 |
| convnextv2_base.fcmae_ft_in22k_in1k |86.74 |98.022|224 |88.72 |15.38 |28.75 |624.23 |256 |
| convnext_large.fb_in22k_ft_in1k |86.636|98.028|224 |197.77 |34.4 |43.13 |581.43 |256 |
| convnext_base.clip_laiona_augreg_ft_in1k_384 |86.504|97.97 |384 |88.59 |45.21 |84.49 |368.14 |256 |
| convnext_base.clip_laion2b_augreg_ft_in12k_in1k |86.344|97.97 |256 |88.59 |20.09 |37.55 |816.14 |256 |
| convnextv2_huge.fcmae_ft_in1k |86.256|97.75 |224 |660.29 |115.0 |79.07 |154.72 |256 |
| convnext_small.in12k_ft_in1k_384 |86.182|97.92 |384 |50.22 |25.58 |63.37 |516.19 |256 |
| convnext_base.clip_laion2b_augreg_ft_in1k |86.154|97.68 |256 |88.59 |20.09 |37.55 |819.86 |256 |
| convnext_base.fb_in22k_ft_in1k |85.822|97.866|224 |88.59 |15.38 |28.75 |1037.66 |256 |
| convnext_small.fb_in22k_ft_in1k_384 |85.778|97.886|384 |50.22 |25.58 |63.37 |518.95 |256 |
| convnextv2_large.fcmae_ft_in1k |85.742|97.584|224 |197.96 |34.4 |43.13 |375.23 |256 |
| convnext_small.in12k_ft_in1k |85.174|97.506|224 |50.22 |8.71 |21.56 |1474.31 |256 |
| convnext_tiny.in12k_ft_in1k_384 |85.118|97.608|384 |28.59 |13.14 |39.48 |856.76 |256 |
| convnextv2_tiny.fcmae_ft_in22k_in1k_384 |85.112|97.63 |384 |28.64 |13.14 |39.48 |491.32 |256 |
| convnextv2_base.fcmae_ft_in1k |84.874|97.09 |224 |88.72 |15.38 |28.75 |625.33 |256 |
| convnext_small.fb_in22k_ft_in1k |84.562|97.394|224 |50.22 |8.71 |21.56 |1478.29 |256 |
| convnext_large.fb_in1k |84.282|96.892|224 |197.77 |34.4 |43.13 |584.28 |256 |
| convnext_tiny.in12k_ft_in1k |84.186|97.124|224 |28.59 |4.47 |13.44 |2433.7 |256 |
| convnext_tiny.fb_in22k_ft_in1k_384 |84.084|97.14 |384 |28.59 |13.14 |39.48 |862.95 |256 |
| convnextv2_tiny.fcmae_ft_in22k_in1k |83.894|96.964|224 |28.64 |4.47 |13.44 |1452.72 |256 |
| convnext_base.fb_in1k |83.82 |96.746|224 |88.59 |15.38 |28.75 |1054.0 |256 |
| convnextv2_nano.fcmae_ft_in22k_in1k_384 |83.37 |96.742|384 |15.62 |7.22 |24.61 |801.72 |256 |
| convnext_small.fb_in1k |83.142|96.434|224 |50.22 |8.71 |21.56 |1464.0 |256 |
| convnextv2_tiny.fcmae_ft_in1k |82.92 |96.284|224 |28.64 |4.47 |13.44 |1425.62 |256 |
| convnext_tiny.fb_in22k_ft_in1k |82.898|96.616|224 |28.59 |4.47 |13.44 |2480.88 |256 |
| convnext_nano.in12k_ft_in1k |82.282|96.344|224 |15.59 |2.46 |8.37 |3926.52 |256 |
| convnext_tiny_hnf.a2h_in1k |82.216|95.852|224 |28.59 |4.47 |13.44 |2529.75 |256 |
| convnext_tiny.fb_in1k |82.066|95.854|224 |28.59 |4.47 |13.44 |2346.26 |256 |
| convnextv2_nano.fcmae_ft_in22k_in1k |82.03 |96.166|224 |15.62 |2.46 |8.37 |2300.18 |256 |
| convnextv2_nano.fcmae_ft_in1k |81.83 |95.738|224 |15.62 |2.46 |8.37 |2321.48 |256 |
| convnext_nano_ols.d1h_in1k |80.866|95.246|224 |15.65 |2.65 |9.38 |3523.85 |256 |
| convnext_nano.d1h_in1k |80.768|95.334|224 |15.59 |2.46 |8.37 |3915.58 |256 |
| convnextv2_pico.fcmae_ft_in1k |80.304|95.072|224 |9.07 |1.37 |6.1 |3274.57 |256 |
| convnext_pico.d1_in1k |79.526|94.558|224 |9.05 |1.37 |6.1 |5686.88 |256 |
| convnext_pico_ols.d1_in1k |79.522|94.692|224 |9.06 |1.43 |6.5 |5422.46 |256 |
| convnextv2_femto.fcmae_ft_in1k |78.488|93.98 |224 |5.23 |0.79 |4.57 |4264.2 |256 |
| convnext_femto_ols.d1_in1k |77.86 |93.83 |224 |5.23 |0.82 |4.87 |6910.6 |256 |
| convnext_femto.d1_in1k |77.454|93.68 |224 |5.22 |0.79 |4.57 |7189.92 |256 |
| convnextv2_atto.fcmae_ft_in1k |76.664|93.044|224 |3.71 |0.55 |3.81 |4728.91 |256 |
| convnext_atto_ols.a2_in1k |75.88 |92.846|224 |3.7 |0.58 |4.11 |7963.16 |256 |
| convnext_atto.d2_in1k |75.664|92.9 |224 |3.7 |0.55 |3.81 |8439.22 |256 |
## Citation", + "model_explanation_gemini": "A ConvNeXt-based image classification model trained on ImageNet-12k for categorizing images into 11,821 classes or extracting features/embeddings. \n\n**Features:** \n- **Model Type:** Image classification / feature backbone \n- **Params:** 36.9M \n- **Input Size:** 224x224 pixels \n- **Training Dataset:** ImageNet-12k (11,821 classes) \n- **Performance Metrics:** 85.118% top-1" +} \ No newline at end of file diff --git a/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json b/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json new file mode 100644 index 0000000000000000000000000000000000000000..d6288088b4768c9bb70b0f3e163e2f8911df8896 --- /dev/null +++ b/model_data_json/timm_tf_efficientnetv2_s.in21k_ft_in1k.json @@ -0,0 +1,18 @@ +{ + "model_id": "timm/tf_efficientnetv2_s.in21k_ft_in1k", + "downloads": 76011, + "tags": [ + "timm", + "pytorch", + "safetensors", + "image-classification", + "transformers", + "dataset:imagenet-1k", + "dataset:imagenet-21k", + "arxiv:2104.00298", + "license:apache-2.0", + "region:us" + ], + "description": "--- tags: - image-classification - timm - transformers library_name: timm license: apache-2.0 datasets: - imagenet-1k - imagenet-21k --- # Model card for tf_efficientnetv2_s.in21k_ft_in1k An EfficientNet-v2 image classification model. Trained on ImageNet-21k and fine-tuned on ImageNet-1k in TensorFlow by paper authors, ported to PyTorch by Ross Wightman. 
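A minimal classification sketch with timm for this checkpoint (the image path is illustrative); the same `create_model`/`create_transform` pattern applies to the other timm checkpoints on this page, such as convnext_tiny.in12k above:

```python
import timm
import torch
from PIL import Image

model = timm.create_model("tf_efficientnetv2_s.in21k_ft_in1k", pretrained=True)
model.eval()

# Derive the matching eval transform (resize, crop, normalization) from the model config.
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

img = Image.open("example.jpg").convert("RGB")  # illustrative image path
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))
print(logits.softmax(dim=-1).topk(5))
```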
## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 21.5 - GMACs: 5.4 - Activations (M): 22.7 - Image size: train = 300 x 300, test = 384 x 384 - **Papers:** - EfficientNetV2: Smaller Models and Faster Training: - **Dataset:** ImageNet-1k - **Pretrain Dataset:** ImageNet-21k - **Original:** ## Model Usage ### Image Classification ### Feature Map Extraction ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. ## Citation", + "model_explanation_gemini": "\"An EfficientNet-v2 image classification model pretrained on ImageNet-21k and fine-tuned on ImageNet-1k, capable of feature extraction and embeddings.\"\n\nModel Features: \n- **Type**: Image classification / feature backbone \n- **Params**: 21.5M \n- **Input size**: Train (300x300), Test (384x384) \n- **Pretrained on**: ImageNet-21k \n- **Fine-tuned on**: ImageNet-1" +} \ No newline at end of file diff --git a/model_data_json/timm_vit_small_patch16_224.dino.json b/model_data_json/timm_vit_small_patch16_224.dino.json new file mode 100644 index 0000000000000000000000000000000000000000..6ffb8bf727974836028dad905b95b034f8320eb9 --- /dev/null +++ b/model_data_json/timm_vit_small_patch16_224.dino.json @@ -0,0 +1,17 @@ +{ + "model_id": "timm/vit_small_patch16_224.dino", + "downloads": 72865, + "tags": [ + "timm", + "pytorch", + "safetensors", + "image-feature-extraction", + "transformers", + "arxiv:2104.14294", + "arxiv:2010.11929", + "license:apache-2.0", + "region:us" + ], + "description": "--- license: apache-2.0 library_name: timm tags: - image-feature-extraction - timm - transformers --- # Model card for vit_small_patch16_224.dino A Vision Transformer (ViT) image feature model. Trained with Self-Supervised DINO method. ## Model Details - **Model Type:** Image classification / feature backbone - **Model Stats:** - Params (M): 21.7 - GMACs: 4.3 - Activations (M): 8.2 - Image size: 224 x 224 - **Papers:** - Emerging Properties in Self-Supervised Vision Transformers: - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: - **Pretrain Dataset:** ImageNet-1k - **Original:** ## Model Usage ### Image Classification ### Image Embeddings ## Model Comparison Explore the dataset and runtime metrics of this model in timm model results. 
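A minimal embedding-extraction sketch with timm; `num_classes=0` removes the classifier head so the model returns pooled features (the image path is illustrative):

```python
import timm
import torch
from PIL import Image

# num_classes=0 yields pooled feature vectors instead of classification logits.
model = timm.create_model("vit_small_patch16_224.dino", pretrained=True, num_classes=0)
model.eval()

config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

img = Image.open("example.jpg").convert("RGB")  # illustrative image path
with torch.no_grad():
    embedding = model(transform(img).unsqueeze(0))
print(embedding.shape)  # (1, 384) for this ViT-Small backbone
```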
## Citation", + "model_explanation_gemini": "A Vision Transformer (ViT) model trained with self-supervised DINO for image classification and feature extraction, using 224x224 resolution inputs.\n\nModel Features: \n- **Model Type**: Vision Transformer (ViT-small, patch16) \n- **Training Method**: Self-supervised DINO \n- **Primary Tasks**: Image classification, feature embedding extraction \n- **Input Resolution**: 224x224 pixels \n- **Model Size**: 21.7M parameters, 4.3" +} \ No newline at end of file diff --git a/model_data_json/togethercomputer_evo-1-8k-base.json b/model_data_json/togethercomputer_evo-1-8k-base.json new file mode 100644 index 0000000000000000000000000000000000000000..fbc620a561907fe7c2164cbab4ead6498bd1d32a --- /dev/null +++ b/model_data_json/togethercomputer_evo-1-8k-base.json @@ -0,0 +1,28 @@ +{ + "model_id": "togethercomputer/evo-1-8k-base", + "downloads": 73720, + "tags": [ + "transformers", + "safetensors", + "stripedhyena", + "text-generation", + "long context", + "deep signal processing", + "hybrid", + "biology", + "genomics", + "custom_code", + "arxiv:2302.10866", + "arxiv:2203.14343", + "arxiv:2310.18780", + "arxiv:2206.11893", + "arxiv:2303.06349", + "arxiv:2102.02611", + "arxiv:2210.09298", + "license:apache-2.0", + "autotrain_compatible", + "region:us" + ], + "description": "--- license: apache-2.0 tags: - stripedhyena - long context - deep signal processing - hybrid - biology - genomics --- ## Evo-1 (Phase 1)

### News We identified and fixed an issue related to a wrong permutation of some projections, which affects generation quality. To use the new model revision, please load as follows: ### About Evo is a biological foundation model capable of long-context modeling and design. Evo uses the StripedHyena architecture to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length. Evo has 7 billion parameters and is trained on OpenGenome, a prokaryotic whole-genome dataset containing ~300 billion tokens. Technical details about Evo can be found in our preprint and our accompanying blog posts. Evo was collaboratively developed by the Arc Institute and TogetherAI. As part of our commitment to open science, we release **weights of 15 intermediate pretraining checkpoints** for phase 1 and phase 2 of pretraining. The checkpoints are available as branches of the corresponding HuggingFace repository. **Evo-1 (Phase 1)** is our first model in the Evo family, trained at a context length of 8k. | Checkpoint Name | Description | |----------------------------------------|-------------| | | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. | | | A model pretrained with 131,072 context using as the initialization. We use this model to reason about and generate sequences at the genome scale. | ### Model Architecture StripedHyena is a deep signal processing, hybrid architecture composed of multi-head attention and gated convolutions arranged in Hyena blocks, improving over decoder-only Transformers. StripedHyena is designed to leverage the specialization of each of its layer classes, with Hyena layers implementing the bulk of the computation required for sequence processing and attention layers supplementing the ability to perform targeted pattern recall. Some highlights of the architecture: - **Efficient autoregressive generation** via a recurrent mode (>500k generation with a single 80GB GPU) - **Significantly faster training and finetuning** at long context (>3x at 131k) - **Improved scaling laws over state-of-the-art architectures** (e.g., Transformer++) on both natural language and biological sequences. - **Robust to training beyond the compute-optimal frontier** e.g., training way beyond Chinchilla-optimal token amounts (see preprint for details -- more details to come) ### How to use Evo Example usage is provided in the standalone repo. #### Parametrization for Inference and Finetuning One of the advantages of deep signal processing models is their flexibility. Different parametrizations of convolutions can be used depending on the memory, expressivity and causality requirements of pretraining, finetuning or inference workloads. The main classes are: - Modal canonical: unconstrained poles (reference, reference), or constrained poles (reference, reference). - Companion canonical / rational: TBA. - Hypernetworks: hypernetwork (reference), modulated hypernetwork (reference). - Explicit: modulated explicit (reference). StripedHyena is a mixed precision model. Make sure to keep your and in precision, especially for longer prompts or training. ### Disclaimer To use StripedHyena outside of the playground, you will need to install custom kernels. Please follow the instructions from the standalone repository. 
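Complementing the News note above, a minimal loading sketch; the revision string `"1.1_fix"` is an assumed branch name for the fixed weights and should be verified against the repository's revisions:

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "togethercomputer/evo-1-8k-base"
revision = "1.1_fix"  # assumed name of the fixed revision; verify against the repo

config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    trust_remote_code=True,  # StripedHyena ships custom modeling code
    revision=revision,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# Generation and scoring examples live in the standalone repository referenced above.
```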
## Cite", + "model_explanation_gemini": "\"Evo-1-8k-base is a 7B-parameter biological foundation model for long-context genomic sequence modeling and design, using the StripedHyena hybrid architecture for efficient byte-level processing.\"\n\n### Model Features: \n- **Architecture**: StripedHyena (Hyena blocks + multi-head attention) \n- **Purpose**: Single-nucleotide, byte-level genome sequence modeling \n- **Context Length**: 8k tokens (base version) \n- **Training Data**:" +} \ No newline at end of file diff --git a/model_data_json/typeform_distilbert-base-uncased-mnli.json b/model_data_json/typeform_distilbert-base-uncased-mnli.json new file mode 100644 index 0000000000000000000000000000000000000000..5c314758fed8d1dc6b5f0e75c4ed4bda43796eb7 --- /dev/null +++ b/model_data_json/typeform_distilbert-base-uncased-mnli.json @@ -0,0 +1,22 @@ +{ + "model_id": "typeform/distilbert-base-uncased-mnli", + "downloads": 70769, + "tags": [ + "transformers", + "pytorch", + "tf", + "safetensors", + "distilbert", + "text-classification", + "zero-shot-classification", + "en", + "dataset:multi_nli", + "arxiv:1910.09700", + "arxiv:2105.09680", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "--- language: en pipeline_tag: zero-shot-classification tags: - distilbert datasets: - multi_nli metrics: - accuracy --- # DistilBERT base model (uncased) ## Table of Contents - Model Details - How to Get Started With the Model - Uses - Risks, Limitations and Biases - Training - Evaluation - Environmental Impact ## Model Details **Model Description:** This is the uncased DistilBERT model fine-tuned on Multi-Genre Natural Language Inference (MNLI) dataset for the zero-shot classification task. - **Developed by:** The Typeform team. - **Model Type:** Zero-Shot Classification - **Language(s):** English - **License:** Unknown - **Parent Model:** See the distilbert base uncased model for more information about the Distilled-BERT base model. ## How to Get Started with the Model ## Uses This model can be used for text classification tasks. ## Risks, Limitations and Biases **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.** Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). ## Training #### Training Data This model of DistilBERT-uncased is pretrained on the Multi-Genre Natural Language Inference (MultiNLI) corpus. It is a crowd-sourced collection of 433k sentence pairs annotated with textual entailment information. The corpus covers a range of genres of spoken and written text, and supports a distinctive cross-genre generalization evaluation. This model is also **not** case-sensitive, i.e., it does not make a difference between \"english\" and \"English\". #### Training Procedure Training is done on a p3.2xlarge AWS EC2 with the following hyperparameters: ## Evaluation #### Evaluation Results When fine-tuned on downstream tasks, this model achieves the following results: - **Epoch = ** 5.0 - **Evaluation Accuracy =** 0.8206875508543532 - **Evaluation Loss =** 0.8706700205802917 - ** Evaluation Runtime = ** 17.8278 - ** Evaluation Samples per second = ** 551.498 MNLI and MNLI-mm results: | Task | MNLI | MNLI-mm | |:----:|:----:|:----:| | | 82.0 | 82.0 | ## Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). 
We present the hardware type based on the associated paper. **Hardware Type:** 1 NVIDIA Tesla V100 GPUs **Hours used:** Unknown **Cloud Provider:** AWS EC2 P3 **Compute Region:** Unknown **Carbon Emitted:** (Power consumption x Time x Carbon produced based on location of power grid): Unknown", + "model_explanation_gemini": "A zero-shot text classification model based on DistilBERT, fine-tuned on the MNLI dataset for English tasks without requiring task-specific training data.\n\n**Features:** \n- Zero-shot classification \n- English language \n- Uncased (case-insensitive) \n- Fine-tuned on MultiNLI (433k sentence pairs) \n- DistilBERT base architecture (efficient, distilled version of BERT) \n\n**Comparison:** \nUnlike standard classification models requiring labeled training data for each task," +} \ No newline at end of file diff --git a/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json b/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..07f95853a1856fbc0eb5a96c8841f81e1d656f22 --- /dev/null +++ b/model_data_json/unsloth_DeepSeek-V3-0324-GGUF.json @@ -0,0 +1,25 @@ +{ + "model_id": "unsloth/DeepSeek-V3-0324-GGUF", + "downloads": 78841, + "tags": [ + "transformers", + "gguf", + "deepseek_v3", + "text-generation", + "deepseek", + "unsloth", + "custom_code", + "en", + "arxiv:2412.19437", + "base_model:deepseek-ai/DeepSeek-V3-0324", + "base_model:quantized:deepseek-ai/DeepSeek-V3-0324", + "license:mit", + "autotrain_compatible", + "endpoints_compatible", + "fp8", + "region:us", + "conversational" + ], + "description": "--- base_model: deepseek-ai/DeepSeek-V3-0324 language: - en library_name: transformers license: mit tags: - deepseek_v3 - deepseek - unsloth - transformers --- Our DeepSeek-V3-0324 GGUFs allow you to run the model in llama.cpp, LMStudio, Open WebUI and other inference frameworks. Includes 1-4-bit Dynamic versions, which yield better accuracy and results than standard quantization. | MoE Bits | Type | Disk Size | Accuracy | Link | Details | |----------|----------|-------------|----------|------------------------------------------------------------------------------------------------------------|---------------------------------------------------| | 1.78bit (prelim) | IQ1_S | **186GB** | Ok | Link | in MoE mixture of 2.06/1.78bit | | 1.93bit (prelim) | IQ1_M | **196GB** | Fair | Link | in MoE mixture of 2.06/1.93bit | | 2.42bit | IQ2_XXS | **219GB** | Recommended | Link | in MoE all 2.42bit | | 2.71bit | Q2_K_XL | **248GB** | Recommended | Link | in MoE mixture of 3.5/2.71bit | | 3.5bit | Q3_K_XL | **321GB** | Great | Link | in MoE mixture of 4.5/3.5bit | | 4.5bit | Q4_K_XL | **405GB** | Best | Link | in MoE mixture of 5.5/4.5bit | Prelim = preliminary - through our testing, they're generally fine but sometimes don't produce the best code and so more work/testing needs to be done. 2.71bit was found to be the best in terms of performance/size and produces code that is great and works well. 2.42bit was also found to pass all our tests. So, for best results, use the 2.42-bit (IQ2_XXS) or 2.71-bit (Q2_K_XL) versions. Though not a must, try to have at least 180GB+ combined VRAM + RAM. Thank you to the DeepSeek team for releasing their March update to the DeepSeek V3 models. Also thank you to bartowski for providing imatrix V3 quants. # Finetune your own Reasoning model like R1 with Unsloth! 
We have a free Google Colab notebook for turning Llama 3.1 (8B) into a reasoning model: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **GRPO with Phi-4 (14B)** | ▶️ Start on Colab-GRPO.ipynb) | 2x faster | 80% less | | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less |
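To fetch one of the quantized variants from the table further above for local inference, a download sketch using `huggingface_hub`; the `"*UD-Q2_K_XL*"` folder glob is an assumption about the repo's file layout and should be checked against its file list:

```python
from huggingface_hub import snapshot_download

# Download only the recommended 2.71-bit Q2_K_XL shards instead of the full repo.
snapshot_download(
    repo_id="unsloth/DeepSeek-V3-0324-GGUF",
    local_dir="DeepSeek-V3-0324-GGUF",
    allow_patterns=["*UD-Q2_K_XL*"],  # assumed folder naming; verify in the repo
)
```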
\"DeepSeek-V3\"

## Features DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects. !Model Performance ### Reasoning Capabilities - Significant improvements in benchmark performance: - MMLU-Pro: 75.9 → 81.2 (+5.3) - GPQA: 59.1 → 68.4 (+9.3) - AIME: 39.6 → 59.4 (+19.8) - LiveCodeBench: 39.2 → 49.2 (+10.0) ### Front-End Web Development - Improved the executability of the code - More aesthetically pleasing web pages and game front-ends ### Chinese Writing Proficiency - Enhanced style and content quality: - Aligned with the R1 writing style - Better quality in medium-to-long-form writing - Feature Enhancements - Improved multi-turn interactive rewriting - Optimized translation quality and letter writing ### Chinese Search Capabilities - Enhanced report analysis requests with more detailed outputs ### Function Calling Improvements - Increased accuracy in Function Calling, fixing issues from previous V3 versions --- ## Usage Recommendations ### System Prompt In the official DeepSeek web/app, we use the same system prompt with a specific date. For example, ### Temperature In our web and application environments, the temperature parameter $T_{model}$ is set to 0.3. Because many users use the default temperature 1.0 in API call, we have implemented an API temperature $T_{api}$ mapping mechanism that adjusts the input API temperature value of 1.0 to the most suitable model temperature setting of 0.3. $$ T_{model} = T_{api} \\times 0.3 \\quad (0 \\leq T_{api} \\leq 1) $$ $$ T_{model} = T_{api} - 0.7 \\quad (1 < T_{api} \\leq 2) $$ Thus, if you call V3 via API, temperature 1.0 equals to the model temperature 0.3. ### Prompts for File Uploading and Web Search For file uploading, please follow the template to create prompts, where {file_name}, {file_content} and {question} are arguments. For Web Search, {search_results}, {cur_date}, and {question} are arguments. For Chinese query, we use the prompt: For English query, we use the prompt: ## How to Run Locally The model structure of DeepSeek-V3-0324 is exactly the same as DeepSeek-V3. Please visit DeepSeek-V3 repo for more information about running this model locally. **This model supports features such as function calling, JSON output, and FIM completion. For instructions on how to construct prompts to use these features, please refer to DeepSeek-V2.5 repo.** **NOTE: Hugging Face's Transformers has not been directly supported yet.** ## License This repository and the model weights are licensed under the MIT License. ## Citation ## Contact If you have any questions, please raise an issue or contact us at service@deepseek.com.", + "model_explanation_gemini": "A quantized version of DeepSeek-V3-0324 optimized for efficient local inference with improved accuracy through selective quantization, compatible with llama.cpp and other frameworks. 
\n\n**Features:** \n- Multiple quantization levels (1-4 bit) with dynamic versions for better accuracy \n- Optimized for local inference (llama.cpp, LMStudio, Open WebUI) \n- Recommended variants: 2.42-bit (IQ2_XXS) and 2.71-bit (Q2_K" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json b/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..d609fee5b90011a9da7928d85a1defde5804ea8e --- /dev/null +++ b/model_data_json/unsloth_Llama-3.2-1B-Instruct-unsloth-bnb-4bit.json @@ -0,0 +1,27 @@ +{ + "model_id": "unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit", + "downloads": 68982, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "llama-3", + "meta", + "facebook", + "unsloth", + "conversational", + "en", + "base_model:meta-llama/Llama-3.2-1B-Instruct", + "base_model:quantized:meta-llama/Llama-3.2-1B-Instruct", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- base_model: meta-llama/Llama-3.2-1B-Instruct language: - en library_name: transformers license: llama3.2 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- ## ***See our collection for all versions of Llama 3.2 including GGUF, 4-bit and original 16-bit formats.*** *Dynamic 4-bit: Unsloth's Dynamic 4-bit Quants selectively avoid quantizing certain parameters, greatly increasing accuracy over standard 4-bit.
See our full collection of Unsloth quants on Hugging Face here.*
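A minimal sketch of loading this pre-quantized checkpoint with transformers (requires `bitsandbytes` and a CUDA GPU; the chat prompt is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The 4-bit bitsandbytes quantization config ships inside the checkpoint,
# so no extra quantization arguments are needed at load time.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain dynamic 4-bit quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```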
# Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.2 (3B) here: unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. **Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 family of models** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. 
**License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.", + "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.2-1B-Instruct model optimized for faster inference and lower memory usage while maintaining higher accuracy than standard 4-bit quantization, designed for multilingual dialogue tasks like retrieval and summarization. \n\n**Features:** \n- Dynamic 4-bit quantization (selective parameter quantization for improved accuracy) \n- Optimized for multilingual conversational use (English, German, French, etc.) \n- 70% less memory usage and 2" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json b/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..6483145c731dc979074f7fdf4ebf974df53c7af7 --- /dev/null +++ b/model_data_json/unsloth_Llama-3.2-3B-Instruct-bnb-4bit.json @@ -0,0 +1,27 @@ +{ + "model_id": "unsloth/Llama-3.2-3B-Instruct-bnb-4bit", + "downloads": 69409, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "llama-3", + "meta", + "facebook", + "unsloth", + "conversational", + "en", + "base_model:meta-llama/Llama-3.2-3B-Instruct", + "base_model:quantized:meta-llama/Llama-3.2-3B-Instruct", + "license:llama3.2", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- base_model: meta-llama/Llama-3.2-3B-Instruct language: - en library_name: transformers license: llama3.2 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- ## ***See our collection for all versions of Llama 3.2 including GGUF, 4-bit and original 16-bit formats.*** # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.2 (3B) here: # unsloth/Llama-3.2-3B-Instruct-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.1 (11B vision)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab | 2.2x faster | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. 
## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. **Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages. Developers may fine-tune Llama 3.2 models for languages beyond these supported languages, provided they comply with the Llama 3.2 Community License and the Acceptable Use Policy. Developers are always expected to ensure that their deployments, including those that involve additional languages, are completed safely and responsibly. **Llama 3.2 family of models** Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** Sept 25, 2024 **Status:** This is a static model trained on an offline dataset. Future versions may be released that improve model capabilities and safety. **License:** Use of Llama 3.2 is governed by the Llama 3.2 Community License (a custom, commercial license agreement). Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here.", + "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.2-3B-Instruct model optimized for efficient multilingual dialogue tasks, offering faster finetuning and lower memory usage via Unsloth.\n\n**Features:** \n- Base model: Meta's Llama-3.2-3B-Instruct \n- 4-bit quantization (bnb-4bit) \n- Optimized for multilingual dialogue (English, German, French, etc.) 
\n- Faster finetuning (2" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json b/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..26bc4b2eb1f4480b17d2b9d0e553532f1322838a --- /dev/null +++ b/model_data_json/unsloth_Meta-Llama-3.1-8B-bnb-4bit.json @@ -0,0 +1,27 @@ +{ + "model_id": "unsloth/Meta-Llama-3.1-8B-bnb-4bit", + "downloads": 80572, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "llama-3", + "meta", + "facebook", + "unsloth", + "en", + "arxiv:2204.05149", + "base_model:meta-llama/Llama-3.1-8B", + "base_model:quantized:meta-llama/Llama-3.1-8B", + "license:llama3.1", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- base_model: meta-llama/Meta-Llama-3.1-8B language: - en library_name: transformers license: llama3.1 tags: - llama-3 - llama - meta - facebook - unsloth - transformers --- # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab | 2x faster | 60% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab | 2.2x faster | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Information The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. **Model developer**: Meta **Model Architecture:** Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Input modalities | Output modalities | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|---|---|
| Llama 3.1 (text only) | A new mix of publicly available online data. | 8B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 70B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
| | | 405B | Multilingual Text | Multilingual Text and code | 128k | Yes | 15T+ | December 2023 |
**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. **Llama 3.1 family of models**. Token counts refer to pretraining data only. All model versions use Grouped-Query Attention (GQA) for improved inference scalability. **Model Release Date:** July 23, 2024. **Status:** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. **License:** A custom commercial license, the Llama 3.1 Community License, is available at: Where to send questions or comments about the model Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go here. ## Intended Use **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases. **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card**. **Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy and in such cases are responsible for ensuring that any uses of Llama 3.1 in additional languages is done in a safe and responsible manner. ## How to use This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original codebase. ### Use with transformers Starting with onward, you can run conversational inference using the Transformers abstraction or by leveraging the Auto classes with the function. Make sure to update your transformers installation via . Note: You can also find detailed recipes on how to use the model locally, with , assisted generations, quantised and more at []( ### Use with Please, follow the instructions in the repository To download Original checkpoints, see the example command below leveraging : ## Hardware and Software **Training Factors** We used custom training libraries, Meta's custom built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure. **Training utilized a cumulative of** 39.3M GPU hours of computation on H100-80GB (TDP of 700W) type hardware, per the table below. Training time is the total GPU time required for training each model and power consumption is the peak power capacity per GPU device used, adjusted for power usage efficiency. **Training Greenhouse Gas Emissions** Estimated total location-based greenhouse gas emissions were **11,390** tons CO2eq for training. 
Since 2020, Meta has maintained net zero greenhouse gas emissions in its global operations and matched 100% of its electricity use with renewable energy, therefore the total market-based greenhouse gas emissions for training were 0 tons CO2eq.
| | Training Time (GPU hours) | Training Power Consumption (W) | Training Location-Based Greenhouse Gas Emissions (tons CO2eq) | Training Market-Based Greenhouse Gas Emissions (tons CO2eq) |
|:---|:---|:---|:---|:---|
| Llama 3.1 8B | 1.46M | 700 | 420 | 0 |
| Llama 3.1 70B | 7.0M | 700 | 2,040 | 0 |
| Llama 3.1 405B | 30.84M | 700 | 8,930 | 0 |
| Total | 39.3M | | 11,390 | 0 |
The methodology used to determine training energy use and greenhouse gas emissions can be found here. Since Meta is openly releasing these models, the training energy use and greenhouse gas emissions will not be incurred by others.

## Training Data

**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.

**Data Freshness:** The pretraining data has a cutoff of December 2023.

## Benchmark scores

In this section, we report the results for Llama 3.1 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library.

### Base pretrained models
| Category | Benchmark | # Shots | Metric | Llama 3 8B | Llama 3.1 8B | Llama 3 70B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| General | MMLU | 5 | macro_avg/acc_char | 66.7 | 66.7 | 79.5 | 79.3 | 85.2 |
| | MMLU-Pro (CoT) | 5 | macro_avg/acc_char | 36.2 | 37.1 | 55.0 | 53.8 | 61.6 |
| | AGIEval English | 3-5 | average/acc_char | 47.1 | 47.8 | 63.0 | 64.6 | 71.6 |
| | CommonSenseQA | 7 | acc_char | 72.6 | 75.0 | 83.8 | 84.1 | 85.8 |
| | Winogrande | 5 | acc_char | - | 60.5 | - | 83.3 | 86.7 |
| | BIG-Bench Hard (CoT) | 3 | average/em | 61.1 | 64.2 | 81.3 | 81.6 | 85.9 |
| | ARC-Challenge | 25 | acc_char | 79.4 | 79.7 | 93.1 | 92.9 | 96.1 |
| Knowledge reasoning | TriviaQA-Wiki | 5 | em | 78.5 | 77.6 | 89.7 | 89.8 | 91.8 |
| Reading comprehension | SQuAD | 1 | em | 76.4 | 77.0 | 85.6 | 81.8 | 89.3 |
| | QuAC (F1) | 1 | f1 | 44.4 | 44.9 | 51.1 | 51.1 | 53.6 |
| | BoolQ | 0 | acc_char | 75.7 | 75.0 | 79.0 | 79.4 | 80.0 |
| | DROP (F1) | 3 | f1 | 58.4 | 59.5 | 79.7 | 79.6 | 84.8 |
### Instruction tuned models
| Category | Benchmark | # Shots | Metric | Llama 3 8B Instruct | Llama 3.1 8B Instruct | Llama 3 70B Instruct | Llama 3.1 70B Instruct | Llama 3.1 405B Instruct |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| General | MMLU | 5 | macro_avg/acc | 68.5 | 69.4 | 82.0 | 83.6 | 87.3 |
| | MMLU (CoT) | 0 | macro_avg/acc | 65.3 | 73.0 | 80.9 | 86.0 | 88.6 |
| | MMLU-Pro (CoT) | 5 | micro_avg/acc_char | 45.5 | 48.3 | 63.4 | 66.4 | 73.3 |
| | IFEval | | | 76.8 | 80.4 | 82.9 | 87.5 | 88.6 |
| Reasoning | ARC-C | 0 | acc | 82.4 | 83.4 | 94.4 | 94.8 | 96.9 |
| | GPQA | 0 | em | 34.6 | 30.4 | 39.5 | 41.7 | 50.7 |
| Code | HumanEval | 0 | pass@1 | 60.4 | 72.6 | 81.7 | 80.5 | 89.0 |
| | MBPP ++ base version | 0 | pass@1 | 70.6 | 72.8 | 82.5 | 86.0 | 88.6 |
| | Multipl-E HumanEval | 0 | pass@1 | - | 50.8 | - | 65.5 | 75.2 |
| | Multipl-E MBPP | 0 | pass@1 | - | 52.4 | - | 62.0 | 65.7 |
| Math | GSM-8K (CoT) | 8 | em_maj1@1 | 80.6 | 84.5 | 93.0 | 95.1 | 96.8 |
| | MATH (CoT) | 0 | final_em | 29.1 | 51.9 | 51.0 | 68.0 | 73.8 |
| Tool Use | API-Bank | 0 | acc | 48.3 | 82.6 | 85.1 | 90.0 | 92.0 |
| | BFCL | 0 | acc | 60.3 | 76.1 | 83.0 | 84.8 | 88.5 |
| | Gorilla Benchmark API Bench | 0 | acc | 1.7 | 8.2 | 14.7 | 29.7 | 35.3 |
| | Nexus (0-shot) | 0 | macro_avg/acc | 18.1 | 38.5 | 47.8 | 56.7 | 58.7 |
| Multilingual | Multilingual MGSM (CoT) | 0 | em | - | 68.9 | - | 86.9 | 91.6 |
#### Multilingual benchmarks
| Category | Benchmark | Language | Llama 3.1 8B | Llama 3.1 70B | Llama 3.1 405B |
|:---|:---|:---|:---|:---|:---|
| General | MMLU (5-shot, macro_avg/acc) | Portuguese | 62.12 | 80.13 | 84.95 |
| | | Spanish | 62.45 | 80.05 | 85.08 |
| | | Italian | 61.63 | 80.4 | 85.04 |
| | | German | 60.59 | 79.27 | 84.36 |
| | | French | 62.34 | 79.82 | 84.66 |
| | | Hindi | 50.88 | 74.52 | 80.31 |
| | | Thai | 50.32 | 72.95 | 78.21 |
## Responsibility & Safety

As part of our responsible release approach, we followed a three-pronged strategy to managing trust & safety risks:

* Enable developers to deploy helpful, safe and flexible experiences for their target audience and for the use cases supported by Llama.
* Protect developers against adversarial users aiming to exploit Llama capabilities to potentially cause harm.
* Provide protections for the community to help prevent the misuse of our models.

### Responsible deployment

Llama is a foundational technology designed to be used in a variety of use cases; examples of how Meta's Llama models have been responsibly deployed can be found in our Community Stories webpage. Our approach is to build the most helpful models, enabling the world to benefit from the power of the technology, by aligning our model safety for generic use cases and addressing a standard set of harms. Developers are then in the driver's seat to tailor safety for their use case, defining their own policies and deploying the models with the necessary safeguards in their Llama systems. Llama 3.1 was developed following the best practices outlined in our Responsible Use Guide; refer to the Responsible Use Guide to learn more.

#### Llama 3.1 instruct

Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications, reducing the workload required to deploy safe AI systems. For more details on the safety mitigations implemented, please read the Llama 3 paper.

**Fine-tuning data** We employ a multi-faceted approach to data collection, combining human-generated data from our vendors with synthetic data to mitigate potential safety risks. We've developed many large language model (LLM)-based classifiers that enable us to thoughtfully select high-quality prompts and responses, enhancing data quality control.

**Refusals and Tone** Building on the work we started with Llama 3, we put a great emphasis on model refusals to benign prompts as well as refusal tone. We included both borderline and adversarial prompts in our safety data strategy, and modified our safety data responses to follow tone guidelines.

#### Llama 3.1 systems

**Large language models, including Llama 3.1, are not designed to be deployed in isolation but instead should be deployed as part of an overall AI system with additional safety guardrails as required.** Developers are expected to deploy system safeguards when building agentic systems. Safeguards are key to achieving the right helpfulness-safety alignment, as well as to mitigating safety and security risks inherent to the system and to any integration of the model or system with external tools. As part of our responsible release approach, we provide the community with safeguards that developers should deploy with Llama models or other LLMs, including Llama Guard 3, Prompt Guard and Code Shield. All our reference implementation demos contain these safeguards by default so developers can benefit from system-level safety out-of-the-box.

#### New capabilities

Note that this release introduces new capabilities, including a longer context window, multilingual inputs and outputs, and possible integrations by developers with third-party tools. Building with these new capabilities requires specific considerations in addition to the best practices that generally apply across all generative AI use cases.
**Tool-use**: Just like in standard software development, developers are responsible for the integration of the LLM with the tools and services of their choice. They should define a clear policy for their use case and assess the integrity of the third-party services they use, so they are aware of the safety and security limitations when using this capability. Refer to the Responsible Use Guide for best practices on the safe deployment of third-party safeguards.

**Multilinguality**: Llama 3.1 supports 7 languages in addition to English: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. Llama may be able to output text in languages other than those that meet performance thresholds for safety and helpfulness. We strongly discourage developers from using this model to converse in non-supported languages without implementing fine-tuning and system controls in alignment with their policies and the best practices shared in the Responsible Use Guide.

### Evaluations

We evaluated Llama models for common use cases as well as specific capabilities. Common-use-case evaluations measure the safety risks of systems for the most commonly built applications, including chatbots, coding assistants, and tool calls. We built dedicated, adversarial evaluation datasets and evaluated systems composed of Llama models and Llama Guard 3 to filter input prompts and output responses. It is important to evaluate applications in context, and we recommend building a dedicated evaluation dataset for your use case. Prompt Guard and Code Shield are also available if relevant to the application. Capability evaluations measure vulnerabilities of Llama models inherent to specific capabilities, for which we crafted dedicated benchmarks, including long context, multilingual, tool calls, coding and memorization.

**Red teaming** For both scenarios, we conducted recurring red teaming exercises with the goal of discovering risks via adversarial prompting, and we used the learnings to improve our benchmarks and safety tuning datasets. We partnered early with subject-matter experts in critical risk areas to understand the nature of these real-world harms and how such models may lead to unintended harm for society. Based on these conversations, we derived a set of adversarial goals for the red team to attempt to achieve, such as extracting harmful information or reprogramming the model to act in a potentially harmful capacity. The red team consisted of experts in cybersecurity, adversarial machine learning, responsible AI, and integrity, in addition to multilingual content specialists with backgrounds in integrity issues in specific geographic markets.

### Critical and other risks

We specifically focused our efforts on mitigating the following critical risk areas:

**1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness** To assess risks related to proliferation of chemical and biological weapons, we performed uplift testing designed to assess whether use of Llama 3.1 models could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons.

**2. Child Safety** Child Safety risk assessments were conducted using a team of experts to assess the model's capability to produce outputs that could result in Child Safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development.
For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors, including the additional languages Llama 3 is trained on. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances and experiences.

**3. Cyber attack enablement** Our cyber attack uplift study investigated whether LLMs can enhance human capabilities in hacking tasks, both in terms of skill level and speed. Our attack automation study focused on evaluating the capabilities of LLMs when used as autonomous agents in cyber offensive operations, specifically in the context of ransomware attacks. This evaluation was distinct from previous studies that considered LLMs as interactive assistants. The primary objective was to assess whether these models could effectively function as independent agents in executing complex cyber-attacks without human intervention. Our study of Llama-3.1-405B's social engineering uplift for cyber attackers was conducted to assess the effectiveness of AI models in aiding cyber threat actors in spear-phishing campaigns. Please read our Llama 3.1 cybersecurity whitepaper to learn more.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. We also set up the Llama Impact Grants program to identify and support the most compelling applications of Meta's Llama model for societal benefit across three categories: education, climate and open innovation. The 20 finalists from the hundreds of applications can be found here. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

The core values of Llama 3.1 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.1 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3.1 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.1's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or otherwise objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 3.1 models, developers should perform safety testing and tuning tailored to their specific applications of the model. Please refer to available resources including our Responsible Use Guide, Trust and Safety solutions, and other resources to learn more about responsible development.", + "model_explanation_gemini": "A 4-bit quantized version of Meta's Llama-3.1-8B model optimized for faster fine-tuning and reduced memory usage via Unsloth, primarily designed for multilingual text/code generation and dialogue tasks.\n\n**Features:** \n- Base model: Meta-Llama-3.1-8B \n- 4-bit quantization (bnb-4bit) \n- Optimized for 2.4x faster fine-tuning with 58% less memory \n- Supports multilingual" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json b/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..c8be66a252cd32d00e35b379da6e929dcf718ed2 --- /dev/null +++ b/model_data_json/unsloth_Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.json @@ -0,0 +1,26 @@ +{ + "model_id": "unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit", + "downloads": 77836, + "tags": [ + "transformers", + "safetensors", + "mistral", + "text-generation", + "unsloth", + "mistral-instruct", + "instruct", + "conversational", + "en", + "base_model:mistralai/Mistral-Small-24B-Instruct-2501", + "base_model:quantized:mistralai/Mistral-Small-24B-Instruct-2501", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - mistral - mistral-instruct - instruct base_model: mistralai/Mistral-Small-24B-Instruct-2501 --- # Finetune LLMs 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Mistral (7B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. 
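A minimal hedged sketch of the fine-tuning flow advertised above, assuming the unsloth package and this repository's checkpoint (the sequence length and LoRA settings are illustrative assumptions, not the notebook defaults):

```python
from unsloth import FastLanguageModel

# Load the pre-quantized 4-bit checkpoint; load_in_4bit matches this repo's
# bnb-4bit weights, so no extra quantization pass is needed.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so fine-tuning only trains a small set of weights.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```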
# Model Card for Mistral-Small-24B-Instruct-2501

Mistral Small 3 (2501) sets a new benchmark in the \"small\" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! This model is an instruction-fine-tuned version of the base model: Mistral-Small-24B-Base-2501. Mistral Small can be deployed locally and is exceptionally \"knowledge-dense\", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for: - Fast response conversational agents. - Low latency function calling. - Subject matter experts via fine-tuning. - Local inference for hobbyists and organizations handling sensitive data. For enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community. This release demonstrates our commitment to open source, serving as a strong base model. Learn more about Mistral Small in our blog post. Model developer: Mistral AI Team

## Key Features

- **Multilingual:** Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
- **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.
- **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window:** A 32k context window.
- **System Prompt:** Maintains strong adherence and support for system prompts.
- **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size.

## Benchmark results

### Human evaluated benchmarks

| Category | Gemma-2-27B | Qwen-2.5-32B | Llama-3.3-70B | Gpt4o-mini |
|----------|-------------|--------------|---------------|------------|
| Mistral is better | 0.536 | 0.496 | 0.192 | 0.200 |
| Mistral is slightly better | 0.196 | 0.184 | 0.164 | 0.204 |
| Ties | 0.052 | 0.060 | 0.236 | 0.160 |
| Other is slightly better | 0.060 | 0.088 | 0.112 | 0.124 |
| Other is better | 0.156 | 0.172 | 0.296 | 0.312 |

**Note**: - We conducted side-by-side evaluations with an external third-party vendor, on a set of over 1k proprietary coding and generalist prompts. - Evaluators were tasked with selecting their preferred model response from anonymized generations produced by Mistral Small 3 vs another model. - We are aware that in some cases the benchmarks on human judgement starkly differ from publicly available benchmarks, but have taken extra caution in verifying a fair evaluation. We are confident that the above benchmarks are valid.
### Publicly accessible benchmarks

**Reasoning & Knowledge**

| Evaluation | mistral-small-24B-instruct-2501 | gemma-2-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
|------------|---------------|--------------|---------------|---------------|-------------|
| mmlu_pro_5shot_cot_instruct | 0.663 | 0.536 | 0.666 | 0.683 | 0.617 |
| gpqa_main_cot_5shot_instruct | 0.453 | 0.344 | 0.531 | 0.404 | 0.377 |

**Math & Coding**

| Evaluation | mistral-small-24B-instruct-2501 | gemma-2-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
|------------|---------------|--------------|---------------|---------------|-------------|
| humaneval_instruct_pass@1 | 0.848 | 0.732 | 0.854 | 0.909 | 0.890 |
| math_instruct | 0.706 | 0.535 | 0.743 | 0.819 | 0.761 |

**Instruction following**

| Evaluation | mistral-small-24B-instruct-2501 | gemma-2-27b | llama-3.3-70b | qwen2.5-32b | gpt-4o-mini-2024-07-18 |
|------------|---------------|--------------|---------------|---------------|-------------|
| mtbench_dev | 8.35 | 7.86 | 7.96 | 8.26 | 8.33 |
| wildbench | 52.27 | 48.21 | 50.04 | 52.73 | 56.13 |
| arena_hard | 0.873 | 0.788 | 0.840 | 0.860 | 0.897 |
| ifeval | 0.829 | 0.8065 | 0.8835 | 0.8401 | 0.8499 |

**Note**: - Performance accuracy on all benchmarks was obtained through the same internal evaluation pipeline - as such, numbers may vary slightly from previously reported performance (Qwen2.5-32B-Instruct, Llama-3.3-70B-Instruct, Gemma-2-27B-IT). - Judge-based evals such as Wildbench, Arena hard and MTBench were based on gpt-4o-2024-05-13.

### Basic Instruct Template (V7-Tekken)

*(system prompt and message tokens in the template are placeholders)* ***Please make sure to use mistral-common as the source of truth***

## Usage

The model can be used with the following frameworks: - vLLM: see the server and function-calling sections below - transformers: see the Transformers section below

### vLLM

We recommend using this model with the vLLM library to implement production-ready inference pipelines.

**Note 1**: We recommend using a relatively low temperature. **Note 2**: Make sure to add a system prompt to the model to best tailor it to your needs; if you want to use the model as a general assistant, we recommend a system prompt that describes its role and expected behavior.

**_Installation_** Make sure you install vLLM and the mistral_common package. You can also make use of a ready-to-go Docker image on Docker Hub.

#### Server

We recommend that you use Mistral-Small-24B-Instruct-2501 in a server/client setting. 1. Spin up a server. **Note:** Running Mistral-Small-24B-Instruct-2501 on GPU requires ~55 GB of GPU RAM in bf16 or fp16. 2. To ping the server you can use a simple Python snippet.

### Function calling

Mistral-Small-24B-Instruct-2501 is excellent at function / tool calling tasks via vLLM. *E.g.:*
Example:
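A hedged sketch of what such a function-calling request can look like against a local vLLM server exposing the OpenAI-compatible API (the URL, port, and tool schema are illustrative assumptions, not the card's original example):

```python
from openai import OpenAI

# vLLM's OpenAI-compatible endpoint; the api_key value is unused locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",  # hypothetical tool
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    temperature=0.15,  # the card recommends a relatively low temperature
)
print(response.choices[0].message.tool_calls)
```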
#### Offline ### Transformers If you want to use Hugging Face transformers to generate text, you can use a standard text-generation pipeline.", + "model_explanation_gemini": "A 24B-parameter instruction-tuned multilingual model optimized for fast conversational agents, low-latency function calling, and local inference with advanced reasoning and 32k context window.\n\n**Features:** \n- Multilingual support (English, French, Chinese, etc.) \n- Agent-centric (function calling, JSON output) \n- 32k context window \n- Apache 2.0 licensed \n- Optimized for local deployment (fits RTX 4090/MacBook 32" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json b/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..841be10267d5548908d74895cd56e57d15f6466f --- /dev/null +++ b/model_data_json/unsloth_Qwen2.5-3B-Instruct-unsloth-bnb-4bit.json @@ -0,0 +1,26 @@ +{ + "model_id": "unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit", + "downloads": 75387, + "tags": [ + "transformers", + "safetensors", + "qwen2", + "text-generation", + "unsloth", + "qwen", + "conversational", + "en", + "arxiv:2407.10671", + "base_model:Qwen/Qwen2.5-3B-Instruct", + "base_model:quantized:Qwen/Qwen2.5-3B-Instruct", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- base_model: Qwen/Qwen2.5-3B-Instruct language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - qwen --- We have a free Google Colab Tesla T4 notebook for Qwen2.5 (7B) here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. # Qwen2.5 ## Introduction Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters.
Qwen2.5 brings the following improvements upon Qwen2: - Significantly **more knowledge** and greatly improved capabilities in **coding** and **mathematics**, thanks to our specialized expert models in these domains. - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g., tables), and **generating structured outputs**, especially JSON. **More resilient to the diversity of system prompts**, enhancing role-play implementation and condition-setting for chatbots. - **Long-context Support** up to 128K tokens and can generate up to 8K tokens. - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. **This repo contains the base 0.5B Qwen2.5 model**, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings - Number of Parameters: 0.49B - Number of Parameters (Non-Embedding): 0.36B - Number of Layers: 24 - Number of Attention Heads (GQA): 14 for Q and 2 for KV - Context Length: Full 32,768 tokens **We do not recommend using base language models for conversations.** Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., on this model. For more details, please refer to our blog, GitHub, and Documentation. ## Requirements The code of Qwen2.5 is included in the latest Hugging Face transformers, and we advise you to use the latest version of transformers; with an outdated version, you will encounter an error when loading the model. ## Evaluation & Performance Detailed evaluation results are reported in this 📑 blog. For requirements on GPU memory and the respective throughput, see results here. ## Citation If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 4-bit quantized version of the Qwen2.5-3B-Instruct model optimized by Unsloth for faster fine-tuning and reduced memory usage while improving accuracy over standard 4-bit quantization. \n\n**Features**: \n- Based on Qwen2.5-3B-Instruct, a 3B-parameter instruction-tuned LLM. \n- Uses Unsloth’s selective 4-bit quantization for higher accuracy. \n- Enables 2-5x" +} \ No newline at end of file diff --git a/model_data_json/unsloth_Qwen3-32B-GGUF.json b/model_data_json/unsloth_Qwen3-32B-GGUF.json new file mode 100644 index 0000000000000000000000000000000000000000..a21fd927d3d8c3ab3079a5d0bb09175752bc86f4 --- /dev/null +++ b/model_data_json/unsloth_Qwen3-32B-GGUF.json @@ -0,0 +1,24 @@ +{ + "model_id": "unsloth/Qwen3-32B-GGUF", + "downloads": 71036, + "tags": [ + "transformers", + "gguf", + "qwen3", + "text-generation", + "qwen", + "unsloth", + "en", + "arxiv:2309.00071", + "base_model:Qwen/Qwen3-32B", + "base_model:quantized:Qwen/Qwen3-32B", + "license:apache-2.0", + "autotrain_compatible", + "endpoints_compatible", + "region:us", + "imatrix", + "conversational" + ], + "description": "--- base_model: Qwen/Qwen3-32B language: - en library_name: transformers license_link: license: apache-2.0 tags: - qwen3 - qwen - unsloth - transformers --- - Fine-tune Qwen3 (14B) for free using our Google Colab notebook here! - Read our Blog about Qwen3 support: unsloth.ai/blog/qwen3 - View the rest of our notebooks in our docs here. - Run & export your fine-tuned model to Ollama, llama.cpp or HF.
| Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Qwen3 (14B)** | ▶️ Start on Colab | 3x faster | 70% less | | **GRPO with Qwen3 (8B)** | ▶️ Start on Colab | 3x faster | 80% less | | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Phi-4 (14B)** | ▶️ Start on Colab | 2x faster | 50% less |

# To Switch Between Thinking and Non-Thinking

If you are using llama.cpp, Ollama, Open WebUI etc., you can add /think and /no_think to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations (a hedged sketch is shown in the switching section below).

# Qwen3-32B

## Qwen3 Highlights

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

- **Unique support for seamless switching between a thinking mode** (for complex logical reasoning, math, and coding) and **a non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios.
- **Significant enhancement of its reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
- **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.
- **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks.
- **Support for 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**.

## Model Overview

**Qwen3-32B** has the following features:

- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Number of Parameters: 32.8B
- Number of Parameters (Non-Embedding): 31.2B
- Number of Layers: 64
- Number of Attention Heads (GQA): 64 for Q and 8 for KV
- Context Length: 32,768 natively and 131,072 tokens with YaRN.

For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

## Quickstart

The code of Qwen3 is included in the latest Hugging Face transformers, and we advise you to use the latest version; with an outdated version, you will encounter an error when loading the model. A code snippet illustrating how to use the model to generate content is sketched below. For deployment, you can use vLLM or SGLang to create an OpenAI-compatible API endpoint.

## Switching Between Thinking and Non-Thinking Mode

> [!TIP]
> The switch is also available in APIs created by vLLM and SGLang.
> Please refer to our documentation for more details.
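A minimal hedged sketch of the thinking switch described here, assuming the unquantized Qwen/Qwen3-32B checkpoint and a recent transformers release (the GGUF files in this repo are instead consumed by llama.cpp, Ollama, etc.):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# /no_think soft-switches this turn to non-thinking mode;
# enable_thinking is the hard switch exposed by the chat template.
messages = [{"role": "user", "content": "How many r's are in 'strawberry'? /no_think"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```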
### Thinking mode (enable_thinking=True)

By default, Qwen3 has thinking capabilities enabled, similar to QwQ-32B. This means the model will use its reasoning abilities to enhance the quality of generated responses. For example, when explicitly setting enable_thinking=True or leaving it as the default value in tokenizer.apply_chat_template, the model will engage its thinking mode. In this mode, the model will generate think content wrapped in a <think>...</think> block, followed by the final response.

> [!NOTE]
> For thinking mode, use the recommended sampling settings (the defaults in the model's generation config). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section.

### Non-thinking mode (enable_thinking=False)

We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency. In this mode, the model will not generate any think content and will not include a <think>...</think> block.

> [!NOTE]
> For non-thinking mode, we suggest using the recommended sampling settings as well. For more detailed guidance, please refer to the Best Practices section.

### Advanced Usage: Switching Between Thinking and Non-Thinking Modes via User Input

We provide a soft switch mechanism that allows users to dynamically control the model's behavior when enable_thinking=True. Specifically, you can add /think and /no_think to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations (see the hedged sketch earlier in this section).

> **Note**
> For API compatibility, when enable_thinking=True, regardless of whether the user uses /think or /no_think, the model will always output a block wrapped in <think>...</think>. However, the content inside this block may be empty if thinking is disabled.
> When enable_thinking=False, the soft switches are not valid. Regardless of any /think or /no_think tags input by the user, the model will not generate think content and will not include a <think>...</think> block.

## Agentic Use

Qwen3 excels in tool calling capabilities. We recommend using Qwen-Agent to make the best use of the agentic ability of Qwen3. Qwen-Agent encapsulates tool-calling templates and tool-calling parsers internally, greatly reducing coding complexity. To define the available tools, you can use the MCP configuration file, use the integrated tools of Qwen-Agent, or integrate other tools by yourself.

## Processing Long Texts

Qwen3 natively supports context lengths of up to 32,768 tokens. For conversations where the total length (including both input and output) significantly exceeds this limit, we recommend using RoPE scaling techniques to handle long texts effectively. We have validated the model's performance on context lengths of up to 131,072 tokens using the YaRN method. YaRN is currently supported by several inference frameworks, e.g., transformers and llama.cpp for local use, and vLLM and SGLang for deployment. In general, there are two approaches to enabling YaRN for supported frameworks:

- Modifying the model files: in the config.json file, add the rope_scaling fields, as sketched below. For llama.cpp, you need to regenerate the GGUF file after the modification.
- Passing command line arguments: most supported frameworks accept a rope-scaling override on the command line; consult each framework's documentation for the exact flag.

> [!IMPORTANT]
> If you encounter a warning about unrecognized keys in rope_scaling, please upgrade transformers.

> [!NOTE]
> All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts.**
> We advise adding the rope_scaling configuration only when processing long contexts is required.
> It is also recommended to modify the factor as needed.
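As a sketch of the file-modification route just described, here is one way to patch config.json with the YaRN fields from this card (the local path is an illustrative assumption; for GGUF you would re-convert the model afterwards):

```python
import json

cfg_path = "./Qwen3-32B/config.json"  # hypothetical local checkout
with open(cfg_path) as f:
    cfg = json.load(f)

# rope_scaling fields as documented for Qwen3; lower the factor when your
# typical contexts are shorter (see the example that follows).
cfg["rope_scaling"] = {
    "rope_type": "yarn",
    "factor": 4.0,  # 131,072 / 32,768
    "original_max_position_embeddings": 32768,
}

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```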
For example, if the typical context length for your application is 65,536 tokens, it would be better to set the factor to 2.0.

> [!NOTE]
> The default max_position_embeddings in config.json is set to 40,960. This allocation includes reserving 32,768 tokens for outputs and 8,192 tokens for typical prompts, which is sufficient for most scenarios involving short text processing. If the average context length does not exceed 32,768 tokens, we do not recommend enabling YaRN in this scenario, as it may potentially degrade model performance.

> [!TIP]
> The endpoint provided by Alibaba Model Studio supports dynamic YaRN by default and no extra configuration is needed.

## Best Practices

To achieve optimal performance, we recommend the following settings:

1. **Sampling Parameters**:
   - For thinking mode (enable_thinking=True), use the recommended sampling settings. **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions.
   - For non-thinking mode (enable_thinking=False), we suggest using the recommended sampling settings as well.
   - For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.
2. **Adequate Output Length**: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 38,912 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance.
3. **Standardize Output Format**: We recommend using prompts to standardize model outputs when benchmarking.
   - **Math Problems**: Include \"Please reason step by step, and put your final answer within \\boxed{}.\" in the prompt.
   - **Multiple-Choice Questions**: Add a JSON structure to the prompt to standardize responses: \"Please show your choice in the answer field with only the choice letter.\"
4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. It is implemented in the provided chat template in Jinja2. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed.

### Citation

If you find our work helpful, feel free to give us a cite.", + "model_explanation_gemini": "A 32.8B-parameter causal language model optimized for reasoning, multilingual tasks, and agent capabilities, featuring seamless switching between thinking (complex reasoning) and non-thinking (general dialogue) modes.
\n\n**Features:** \n- Supports 100+ languages \n- 32K native context length (extendable to 131K with YaRN) \n- Enhanced reasoning, math, and coding in \"thinking mode\" \n- Efficient general dialogue in \"non-thinking mode\" \n- Superior alignment" +} \ No newline at end of file diff --git a/model_data_json/unsloth_llama-2-7b-bnb-4bit.json b/model_data_json/unsloth_llama-2-7b-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..02c359a57d96280a53c6b4fa3fe11619423e8aee --- /dev/null +++ b/model_data_json/unsloth_llama-2-7b-bnb-4bit.json @@ -0,0 +1,23 @@ +{ + "model_id": "unsloth/llama-2-7b-bnb-4bit", + "downloads": 70876, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "unsloth", + "llama2", + "llama-2", + "en", + "license:apache-2.0", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- language: - en library_name: transformers license: apache-2.0 tags: - unsloth - transformers - llama - llama2 - llama-2 --- # Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory via Unsloth! Directly quantized 4bit model with . We have a Google Colab Tesla T4 notebook for Llama 7b here: ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Gemma 7b** | ▶️ Start on Colab | 2.4x faster | 58% less | | **Mistral 7b** | ▶️ Start on Colab | 2.2x faster | 62% less | | **Llama-2 7b** | ▶️ Start on Colab | 2.2x faster | 43% less | | **TinyLlama** | ▶️ Start on Colab | 3.9x faster | 74% less | | **CodeLlama 34b** A100 | ▶️ Start on Colab | 1.9x faster | 27% less | | **Mistral 7b** 1xT4 | ▶️ Start on Kaggle | 5x faster\\* | 62% less | | **DPO - Zephyr** | ▶️ Start on Colab | 1.9x faster | 19% less | - This conversational notebook is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.", + "model_explanation_gemini": "A 4-bit quantized Llama-2-7B model optimized for faster fine-tuning (2.2x speed) and reduced memory usage (43% less) using Unsloth, with beginner-friendly Colab notebooks for deployment. 
\n\n**Features:** \n- 4-bit quantization for efficiency \n- 2.2x faster fine-tuning than standard Llama-2-7B \n- 43% lower memory consumption \n- Supports export to GGUF, vLLM," +} \ No newline at end of file diff --git a/model_data_json/unsloth_llama-3-8b-bnb-4bit.json b/model_data_json/unsloth_llama-3-8b-bnb-4bit.json new file mode 100644 index 0000000000000000000000000000000000000000..cc2fa42282c77e42a7441ec4be2644521e6d725b --- /dev/null +++ b/model_data_json/unsloth_llama-3-8b-bnb-4bit.json @@ -0,0 +1,26 @@ +{ + "model_id": "unsloth/llama-3-8b-bnb-4bit", + "downloads": 76826, + "tags": [ + "transformers", + "safetensors", + "llama", + "text-generation", + "llama-3", + "meta", + "facebook", + "unsloth", + "en", + "base_model:meta-llama/Meta-Llama-3-8B", + "base_model:quantized:meta-llama/Meta-Llama-3-8B", + "license:llama3", + "autotrain_compatible", + "text-generation-inference", + "endpoints_compatible", + "4-bit", + "bitsandbytes", + "region:us" + ], + "description": "--- language: - en library_name: transformers license: llama3 tags: - llama-3 - llama - meta - facebook - unsloth - transformers base_model: - meta-llama/Meta-Llama-3-8B --- # Finetune Llama 3.2, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: # Finetune Llama 3.3, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth! We have a free Google Colab Tesla T4 notebook for Llama 3.1 (8B) here: # unsloth/Llama-3-8B-bnb-4bit For more details on the model, please go to Meta's original model card ## ✨ Finetune for Free All notebooks are **beginner friendly**! Add your dataset, click \"Run All\", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face. | Unsloth supports | Free Notebooks | Performance | Memory use | |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------| | **Llama-3.2 (3B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.4x faster | 58% less | | **Llama-3.2 (11B vision)** | ▶️ Start on Colab-Vision.ipynb) | 2x faster | 60% less | | **Qwen2 VL (7B)** | ▶️ Start on Colab-Vision.ipynb) | 1.8x faster | 60% less | | **Qwen2.5 (7B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2x faster | 60% less | | **Llama-3.1 (8B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Phi-3.5 (mini)** | ▶️ Start on Colab | 2x faster | 50% less | | **Gemma 2 (9B)** | ▶️ Start on Colab-Alpaca.ipynb) | 2.4x faster | 58% less | | **Mistral (7B)** | ▶️ Start on Colab-Conversational.ipynb) | 2.2x faster | 62% less | - This Llama 3.2 conversational notebook-Conversational.ipynb) is useful for ShareGPT ChatML / Vicuna templates. - This text completion notebook-Text_Completion.ipynb) is for raw text. This DPO notebook replicates Zephyr. - \\* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster. ## Special Thanks A huge thank you to the Meta and Llama team for creating and releasing these models. ## Model Details Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. 
**Model developers** Meta **Variations** Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants. **Input** Models input text only. **Output** Models generate text and code only. **Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|:---|:---|:---|:---|:---|:---|:---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| | | 70B | 8k | Yes | | December, 2023 |
**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8 and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date** April 18, 2024.

**Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

**License** A custom commercial license is available from Meta.

**Where to send questions or comments about the model:** Instructions on how to provide feedback or comments on the model can be found in the model README. For more technical information about generation parameters and recipes for how to use Llama 3 in applications, please go here.

## Intended Use

**Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English.

**Note**: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.

## How to use

This repository contains two versions of Meta-Llama-3-70B-Instruct, for use with transformers and with the original codebase.

### Use with transformers

See the snippet below for usage with Transformers.

### Use with the original codebase

Please follow the instructions in the repository. Original checkpoints can be downloaded with the huggingface-cli tool. For Hugging Face support, we recommend using transformers or TGI, but a similar command works.

## Hardware and Software

**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.

**Carbon Footprint** Pretraining utilized a cumulative 7.7M GPU hours of computation on hardware of type H100-80GB (TDP of 700W). Estimated total emissions were 2290 tCO2eq, 100% of which were offset by Meta's sustainability program.
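A hedged sketch of the Transformers usage referenced above, assuming access to the meta-llama/Meta-Llama-3-8B base checkpoint and illustrative sampling settings (base models do plain completion, not chat):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Plain text completion from the base (non-instruct) model.
inputs = tokenizer("The key to a good model card is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.6)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```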
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|:---|:---|:---|:---|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pre-training**. Time: total GPU time required for training each model. Power Consumption: peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others.

## Training Data

**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.

**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively.

## Benchmarks

In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see here.

### Base pretrained models
| Category | Benchmark | Llama 3 8B | Llama2 7B | Llama2 13B | Llama 3 70B | Llama2 70B |
|:---|:---|:---|:---|:---|:---|:---|
| General | MMLU (5-shot) | 66.6 | 45.7 | 53.8 | 79.5 | 69.7 |
| | AGIEval English (3-5 shot) | 45.9 | 28.8 | 38.7 | 63.0 | 54.8 |
| | CommonSenseQA (7-shot) | 72.6 | 57.6 | 67.6 | 83.8 | 78.7 |
| | Winogrande (5-shot) | 76.1 | 73.3 | 75.4 | 83.1 | 81.8 |
| | BIG-Bench Hard (3-shot, CoT) | 61.1 | 38.1 | 47.0 | 81.3 | 65.7 |
| | ARC-Challenge (25-shot) | 78.6 | 53.7 | 67.6 | 93.0 | 85.3 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.5 | 72.1 | 79.6 | 89.7 | 87.5 |
| Reading comprehension | SQuAD (1-shot) | 76.4 | 72.2 | 72.1 | 85.6 | 82.6 |
| | QuAC (1-shot, F1) | 44.4 | 39.6 | 44.9 | 51.1 | 49.4 |
| | BoolQ (0-shot) | 75.7 | 65.5 | 66.9 | 79.0 | 73.1 |
| | DROP (3-shot, F1) | 58.4 | 37.9 | 49.8 | 79.7 | 70.2 |
### Instruction tuned models
| Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|:---|:---|:---|:---|:---|:---|
| MMLU (5-shot) | 68.4 | 34.1 | 47.8 | 82.0 | 52.9 |
| GPQA (0-shot) | 34.2 | 21.7 | 22.3 | 39.5 | 21.0 |
| HumanEval (0-shot) | 62.2 | 7.9 | 14.0 | 81.7 | 25.6 |
| GSM-8K (8-shot, CoT) | 79.6 | 25.7 | 77.4 | 93.0 | 57.5 |
| MATH (4-shot, CoT) | 30.0 | 3.8 | 6.7 | 50.4 | 11.6 |
### Responsibility & Safety

We believe that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to responsible AI development and took a series of steps to limit misuse and harm and to support the open source community.

Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases out-of-the-box, as those by their nature will differ across different applications. Rather, responsible LLM-application deployment is achieved by implementing a series of safety best practices throughout the development of such applications, from the model pre-training and fine-tuning to the deployment of systems composed of safeguards that tailor the safety needs specifically to the use case and audience.

As part of the Llama 3 release, we updated our Responsible Use Guide to outline the steps and best practices for developers to implement model- and system-level safety for their application. We also provide a set of resources including Meta Llama Guard 2 and Code Shield safeguards. These tools have proven to drastically reduce residual risks of LLM systems, while maintaining a high level of helpfulness. We encourage developers to tune and deploy these safeguards according to their needs, and we provide a reference implementation to get you started.

#### Llama 3-Instruct

As outlined in the Responsible Use Guide, some trade-off between model helpfulness and model alignment is likely unavoidable. Developers should exercise discretion about how to weigh the benefits of alignment and helpfulness for their specific use case and audience. Developers should be mindful of residual risks when using Llama models and leverage additional safety tools as needed to reach the right safety bar for their use case.

**Safety** For our instruction tuned model, we conducted extensive red teaming exercises, performed adversarial evaluations and implemented safety mitigation techniques to lower residual risks. As with any large language model, residual risks will likely remain, and we recommend that developers assess these risks in the context of their use case. In parallel, we are working with the community to make AI safety benchmark standards transparent, rigorous and interpretable.

**Refusals** In addition to residual risks, we put a great emphasis on model refusals to benign prompts. Over-refusing not only can impact the user experience but could even be harmful in certain contexts as well. We've heard the feedback from the developer community and improved our fine-tuning to ensure that Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. We built internal benchmarks and developed mitigations to limit false refusals, making Llama 3 our most helpful model to date.

#### Responsible release

In addition to responsible use considerations outlined above, we followed a rigorous process that requires us to take extra measures against misuse and critical risks before we make our release decision.

**Misuse** If you access or use Llama 3, you agree to the Acceptable Use Policy.
The most recent copy of this policy can be found on Meta's website.

#### Critical risks

**CBRNE (Chemical, Biological, Radiological, Nuclear, and high yield Explosives)** We have conducted a two-fold assessment of the safety of the model in this area:

* Iterative testing during model training to assess the safety of responses related to CBRNE threats and other adversarial risks.
* Involving external CBRNE experts to conduct an uplift test assessing the ability of the model to accurately provide expert knowledge and reduce barriers to potential CBRNE misuse, by reference to what can be achieved using web search (without the model).

### Cyber Security

We have evaluated Llama 3 with CyberSecEval, Meta's cybersecurity safety eval suite, measuring Llama 3's propensity to suggest insecure code when used as a coding assistant, and Llama 3's propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry-standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range or safer than models of equivalent coding capability.

### Child Safety

Child Safety risk assessments were conducted using a team of experts to assess the model's capability to produce outputs that could result in Child Safety risks, and to inform any necessary and appropriate risk mitigations via fine-tuning. We leveraged those expert red teaming sessions to expand the coverage of our evaluation benchmarks through Llama 3 model development. For Llama 3, we conducted new in-depth sessions using objective-based methodologies to assess the model risks along multiple attack vectors. We also partnered with content specialists to perform red teaming exercises assessing potentially violating content while taking account of market-specific nuances or experiences.

### Community

Generative AI safety requires expertise and tooling, and we believe in the strength of the open community to accelerate its progress. We are active members of open consortiums, including the AI Alliance, Partnership on AI and MLCommons, actively contributing to safety standardization and transparency. We encourage the community to adopt taxonomies like the MLCommons Proof of Concept evaluation to facilitate collaboration and transparency on safety and content evaluations. Our Purple Llama tools are open sourced for the community to use, and widely distributed across ecosystem partners including cloud service providers. We encourage community contributions to our Github repository. Finally, we put in place a set of resources including an output reporting mechanism and bug bounty program to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress. But Llama 3 is a new technology, and like any new technology, there are risks associated with its use.
Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating Purple Llama solutions into your workflows, and specifically Llama Guard, which provides a base model for filtering input and output prompts to layer system-level safety on top of model-level safety (a hedged usage sketch appears at the end of this section). Please see the Responsible Use Guide available at ## Citation instructions @article{llama3modelcard, title={Llama 3 Model Card}, author={AI@Meta}, year={2024}, url={} } ## Contributors Aaditya Singh; Aaron Grattafiori; Abhimanyu Dubey; Abhinav Jauhri; Abhinav Pandey; Abhishek Kadian; Adam Kelsey; Adi Gangidi; Ahmad Al-Dahle; Ahuva Goldstand; Aiesha Letman; Ajay Menon; Akhil Mathur; Alan Schelten; Alex Vaughan; Amy Yang; Andrei Lupu; Andres Alvarado; Andrew Gallagher; Andrew Gu; Andrew Ho; Andrew Poulton; Andrew Ryan; Angela Fan; Ankit Ramchandani; Anthony Hartshorn; Archi Mitra; Archie Sravankumar; Artem Korenev; Arun Rao; Ashley Gabriel; Ashwin Bharambe; Assaf Eisenman; Aston Zhang; Aurelien Rodriguez; Austen Gregerson; Ava Spataru; Baptiste Roziere; Ben Maurer; Benjamin Leonhardi; Bernie Huang; Bhargavi Paranjape; Bing Liu; Binh Tang; Bobbie Chern; Brani Stojkovic; Brian Fuller; Catalina Mejia Arenas; Chao Zhou; Charlotte Caucheteux; Chaya Nayak; Ching-Hsiang Chu; Chloe Bi; Chris Cai; Chris Cox; Chris Marra; Chris McConnell; Christian Keller; Christoph Feichtenhofer; Christophe Touret; Chunyang Wu; Corinne Wong; Cristian Canton Ferrer; Damien Allonsius; Daniel Kreymer; Daniel Haziza; Daniel Li; Danielle Pintz; Danny Livshits; Danny Wyatt; David Adkins; David Esiobu; David Xu; Davide Testuggine; Delia David; Devi Parikh; Dhruv Choudhary; Dhruv Mahajan; Diana Liskovich; Diego Garcia-Olano; Diego Perino; Dieuwke Hupkes; Dingkang Wang; Dustin Holland; Egor Lakomkin; Elina Lobanova; Xiaoqing Ellen Tan; Emily Dinan; Eric Smith; Erik Brinkman; Esteban Arcaute; Filip Radenovic; Firat Ozgenel; Francesco Caggioni; Frank Seide; Frank Zhang; Gabriel Synnaeve; Gabriella Schwarz; Gabrielle Lee; Gada Badeer; Georgia Anderson; Graeme Nail; Gregoire Mialon; Guan Pang; Guillem Cucurell; Hailey Nguyen; Hannah Korevaar; Hannah Wang; Haroun Habeeb; Harrison Rudolph; Henry Aspegren; Hu Xu; Hugo Touvron; Iga Kozlowska; Igor Molybog; Igor Tufanov; Iliyan Zarov; Imanol Arrieta Ibarra; Irina-Elena Veliche; Isabel Kloumann; Ishan Misra; Ivan Evtimov; Jacob Xu; Jade Copet; Jake Weissman; Jan Geffert; Jana Vranes; Japhet Asher; Jason Park; Jay Mahadeokar; Jean-Baptiste Gaya; Jeet Shah; Jelmer van der Linde; Jennifer Chan; Jenny Hong; Jenya Lee; Jeremy Fu; Jeremy Teboul; Jianfeng Chi; Jianyu Huang; Jie Wang; Jiecao Yu; Joanna Bitton; Joe Spisak; Joelle Pineau; Jon Carvill; Jongsoo Park; Joseph Rocca; Joshua Johnstun; Junteng Jia; Kalyan Vasuden Alwala; Kam Hou U; Kate Plawiak; Kartikeya Upasani; Kaushik Veeraraghavan; Ke Li; Kenneth Heafield; Kevin Stone; Khalid El-Arini; Krithika Iyer; Kshitiz Malik; Kuenley Chiu; Kunal Bhalla; Kyle Huang; Lakshya Garg; Lauren Rantala-Yeary; Laurens van der Maaten; Lawrence Chen; Leandro Silva; Lee Bell; Lei Zhang; Liang Tan; Louis Martin; Lovish 
Madaan; Luca Wehrstedt; Lukas Blecher; Luke de Oliveira; Madeline Muzzi; Madian Khabsa; Manav Avlani; Mannat Singh; Manohar Paluri; Mark Zuckerberg; Marcin Kardas; Martynas Mankus; Mathew Oldham; Mathieu Rita; Matthew Lennie; Maya Pavlova; Meghan Keneally; Melanie Kambadur; Mihir Patel; Mikayel Samvelyan; Mike Clark; Mike Lewis; Min Si; Mitesh Kumar Singh; Mo Metanat; Mona Hassan; Naman Goyal; Narjes Torabi; Nicolas Usunier; Nikolay Bashlykov; Nikolay Bogoychev; Niladri Chatterji; Ning Dong; Oliver Aobo Yang; Olivier Duchenne; Onur Celebi; Parth Parekh; Patrick Alrassy; Paul Saab; Pavan Balaji; Pedro Rittner; Pengchuan Zhang; Pengwei Li; Petar Vasic; Peter Weng; Polina Zvyagina; Prajjwal Bhargava; Pratik Dubal; Praveen Krishnan; Punit Singh Koura; Qing He; Rachel Rodriguez; Ragavan Srinivasan; Rahul Mitra; Ramon Calderer; Raymond Li; Robert Stojnic; Roberta Raileanu; Robin Battey; Rocky Wang; Rohit Girdhar; Rohit Patel; Romain Sauvestre; Ronnie Polidoro; Roshan Sumbaly; Ross Taylor; Ruan Silva; Rui Hou; Rui Wang; Russ Howes; Ruty Rinott; Saghar Hosseini; Sai Jayesh Bondu; Samyak Datta; Sanjay Singh; Sara Chugh; Sargun Dhillon; Satadru Pan; Sean Bell; Sergey Edunov; Shaoliang Nie; Sharan Narang; Sharath Raparthy; Shaun Lindsay; Sheng Feng; Sheng Shen; Shenghao Lin; Shiva Shankar; Shruti Bhosale; Shun Zhang; Simon Vandenhende; Sinong Wang; Seohyun Sonia Kim; Soumya Batra; Sten Sootla; Steve Kehoe; Suchin Gururangan; Sumit Gupta; Sunny Virk; Sydney Borodinsky; Tamar Glaser; Tamar Herman; Tamara Best; Tara Fowler; Thomas Georgiou; Thomas Scialom; Tianhe Li; Todor Mihaylov; Tong Xiao; Ujjwal Karn; Vedanuj Goswami; Vibhor Gupta; Vignesh Ramanathan; Viktor Kerkez; Vinay Satish Kumar; Vincent Gonguet; Vish Vogeti; Vlad Poenaru; Vlad Tiberiu Mihailescu; Vladan Petrovic; Vladimir Ivanov; Wei Li; Weiwei Chu; Wenhan Xiong; Wenyin Fu; Wes Bouaziz; Whitney Meers; Will Constable; Xavier Martinet; Xiaojian Wu; Xinbo Gao; Xinfeng Xie; Xuchao Jia; Yaelle Goldschlag; Yann LeCun; Yashesh Gaur; Yasmine Babaei; Ye Qi; Yenda Li; Yi Wen; Yiwen Song; Youngjin Nam; Yuchen Hao; Yuchen Zhang; Yun Wang; Yuning Mao; Yuzi He; Zacharie Delpierre Coudert; Zachary DeVito; Zahra Hankir; Zhaoduo Wen; Zheng Yan; Zhengxing Chen; Zhenyu Yang; Zoe Papakipos", + "model_explanation_gemini": "\"Optimizes Llama-3-8B for faster fine-tuning (2-5x speed) and reduced memory usage (70% less) via Unsloth, supporting conversational and text-completion tasks.\"\n\n**Features**: \n- 4-bit quantization (bnb-4bit) \n- 2-5x faster fine-tuning than standard methods \n- 70% lower memory consumption \n- Supports GGUF/vLLM export and HF uploads \n- Beginner-friendly Colab notebooks" +} \ No newline at end of file diff --git a/model_data_json/unslothai_4.json b/model_data_json/unslothai_4.json new file mode 100644 index 0000000000000000000000000000000000000000..d9b9a47419392e681edba3042da400f30271fcee --- /dev/null +++ b/model_data_json/unslothai_4.json @@ -0,0 +1,15 @@ +{ + "model_id": "unslothai/4", + "downloads": 71368, + "tags": [ + "transformers", + "safetensors", + "llama", + "feature-extraction", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers tags: [] ---", + "model_explanation_gemini": "The provided description for the model 'unslothai_4' does not contain any information about its functionality, purpose, features, or comparisons with other models. \n\nFeatures: None specified. \nComparative Explanation: Not applicable." 
+} \ No newline at end of file diff --git a/model_data_json/unslothai_azure.json b/model_data_json/unslothai_azure.json new file mode 100644 index 0000000000000000000000000000000000000000..763de657b20f362a25f8e07c59b580aed353f301 --- /dev/null +++ b/model_data_json/unslothai_azure.json @@ -0,0 +1,15 @@ +{ + "model_id": "unslothai/azure", + "downloads": 79899, + "tags": [ + "transformers", + "safetensors", + "llama", + "feature-extraction", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- {} --- We log statistics to see if any envs are breaking", + "model_explanation_gemini": "Logs statistics to monitor environment stability.\n\nFeatures: \n- Tracks and logs environment statistics. \n- Detects potential environment issues or breaks. \n\nComparison: Not applicable (no comparative details provided in the description)." +} \ No newline at end of file diff --git a/model_data_json/unslothai_vram-96.json b/model_data_json/unslothai_vram-96.json new file mode 100644 index 0000000000000000000000000000000000000000..c1ee11b94edaaabf7c61e6b477b285c5f4792227 --- /dev/null +++ b/model_data_json/unslothai_vram-96.json @@ -0,0 +1,16 @@ +{ + "model_id": "unslothai/vram-96", + "downloads": 71977, + "tags": [ + "transformers", + "safetensors", + "llama", + "feature-extraction", + "arxiv:1910.09700", + "text-generation-inference", + "endpoints_compatible", + "region:us" + ], + "description": "--- library_name: transformers tags: [] --- # Model Card for Model ID ## Model Details ### Model Description This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - **Developed by:** [More Information Needed] - **Funded by [optional]:** [More Information Needed] - **Shared by [optional]:** [More Information Needed] - **Model type:** [More Information Needed] - **Language(s) (NLP):** [More Information Needed] - **License:** [More Information Needed] - **Finetuned from model [optional]:** [More Information Needed] ### Model Sources [optional] - **Repository:** [More Information Needed] - **Paper [optional]:** [More Information Needed] - **Demo [optional]:** [More Information Needed] ## Uses ### Direct Use [More Information Needed] ### Downstream Use [optional] [More Information Needed] ### Out-of-Scope Use [More Information Needed] ## Bias, Risks, and Limitations [More Information Needed] ### Recommendations Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. ## How to Get Started with the Model Use the code below to get started with the model. [More Information Needed] ## Training Details ### Training Data [More Information Needed] ### Training Procedure #### Preprocessing [optional] [More Information Needed] #### Training Hyperparameters - **Training regime:** [More Information Needed] #### Speeds, Sizes, Times [optional] [More Information Needed] ## Evaluation ### Testing Data, Factors & Metrics #### Testing Data [More Information Needed] #### Factors [More Information Needed] #### Metrics [More Information Needed] ### Results [More Information Needed] #### Summary ## Model Examination [optional] [More Information Needed] ## Environmental Impact Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). 
- **Hardware Type:** [More Information Needed] - **Hours used:** [More Information Needed] - **Cloud Provider:** [More Information Needed] - **Compute Region:** [More Information Needed] - **Carbon Emitted:** [More Information Needed] ## Technical Specifications [optional] ### Model Architecture and Objective [More Information Needed] ### Compute Infrastructure [More Information Needed] #### Hardware [More Information Needed] #### Software [More Information Needed] ## Citation [optional] **BibTeX:** [More Information Needed] **APA:** [More Information Needed] ## Glossary [optional] [More Information Needed] ## More Information [optional] [More Information Needed] ## Model Card Authors [optional] [More Information Needed] ## Model Card Contact [More Information Needed]", + "model_explanation_gemini": "The model description for 'unslothai_vram-96' provides insufficient information to determine its core functionality, purpose, or features, as key details like model type, task, and training data are marked \"[More Information Needed].\"\n\nFeatures: \n- No discernible features are listed due to incomplete documentation. \n\nComparison: \n- No comparative analysis is possible as the description lacks details about the model's performance, architecture, or benchmarks against other models." +} \ No newline at end of file diff --git a/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json b/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json new file mode 100644 index 0000000000000000000000000000000000000000..c8e9b0f41cc56e8c905b65301a8862dda4adc4aa --- /dev/null +++ b/model_data_json/vblagoje_bert-english-uncased-finetuned-pos.json @@ -0,0 +1,17 @@ +{ + "model_id": "vblagoje/bert-english-uncased-finetuned-pos", + "downloads": 77652, + "tags": [ + "transformers", + "pytorch", + "jax", + "safetensors", + "bert", + "token-classification", + "autotrain_compatible", + "endpoints_compatible", + "region:us" + ], + "description": "## Part-of-Speech (PoS) Tags Below are the Part-of-Speech (PoS) tags used in the model: | **Tag** | **Meaning** | **Examples** | |-----------|------------------------------------------------------|--------------------------------| | ADP | Adposition (prepositions or postpositions) | in, on, by | | ADJ | Adjective | significant, global | | ADV | Adverb | quickly, often | | AUX | Auxiliary verb | is, was | | CCONJ | Coordinating conjunction | and, but | | DET | Determiner | the, a | | INTJ | Interjection | oh, wow | | NOUN | Noun | man, city | | NUM | Number | one, 2022 | | PART | Particle | 's, to | | PRON | Pronoun | he, which | | PROPN | Proper noun | Neil Armstrong, Paris | | PUNCT | Punctuation mark | ,, . | | SCONJ | Subordinating conjunction | because, although | | SYM | Symbol | $, % | | VERB | Verb | run, is | | X | Other (generally words that do not fit into other categories) | [not defined] |", + "model_explanation_gemini": "Finetuned for part-of-speech tagging, this BERT-based model identifies and classifies words into grammatical categories like nouns, verbs, adjectives, and more. 
\n\n**Model Features:** \n- Uncased English BERT finetuned for PoS tagging \n- Supports standard PoS tags (e.g., NOUN, VERB, ADJ, ADV, PUNCT) \n- Handles diverse word categories, including symbols (SYM) and interjections (INTJ)" +} \ No newline at end of file diff --git a/model_data_json/vidore_colpali-v1.3.json b/model_data_json/vidore_colpali-v1.3.json new file mode 100644 index 0000000000000000000000000000000000000000..f1271e21b6ad35176d4756f79076cc8c83249586 --- /dev/null +++ b/model_data_json/vidore_colpali-v1.3.json @@ -0,0 +1,22 @@ +{ + "model_id": "vidore/colpali-v1.3", + "downloads": 81111, + "tags": [ + "colpali", + "safetensors", + "vidore", + "vidore-experimental", + "visual-document-retrieval", + "en", + "dataset:vidore/colpali_train_set", + "arxiv:2004.12832", + "arxiv:2407.01449", + "arxiv:2106.09685", + "base_model:vidore/colpaligemma-3b-pt-448-base", + "base_model:finetune:vidore/colpaligemma-3b-pt-448-base", + "license:mit", + "region:us" + ], + "description": "--- license: mit library_name: colpali base_model: vidore/colpaligemma-3b-pt-448-base language: - en tags: - vidore - vidore-experimental datasets: - vidore/colpali_train_set pipeline_tag: visual-document-retrieval --- # ColPali: Visual Retriever based on PaliGemma-3B with ColBERT strategy ## This version is trained with a batch size of 256 for 3 epochs on the same data as the original ColPali model. ColPali uses a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features. It is a PaliGemma-3B extension that generates ColBERT-style multi-vector representations of text and images. It was introduced in the paper ColPali: Efficient Document Retrieval with Vision Language Models and first released in this repository
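To make the ColPali description above concrete, here is a minimal retrieval sketch for vidore/colpali-v1.3, assuming the `colpali_engine` package and its documented `ColPali`/`ColPaliProcessor` interface are installed; the placeholder images and queries are illustrative, and this is a sketch rather than the card's reference implementation:

```python
import torch
from PIL import Image
from colpali_engine.models import ColPali, ColPaliProcessor

model_name = "vidore/colpali-v1.3"

# Load the model and its processor (device/dtype choices are assumptions).
model = ColPali.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="cuda:0",
).eval()
processor = ColPaliProcessor.from_pretrained(model_name)

# Placeholder inputs; in practice, pass rendered document-page images.
images = [Image.new("RGB", (448, 448), color="white")]
queries = ["What is the reported batch size?"]

batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

# Each forward pass yields multi-vector (per-token) embeddings.
with torch.no_grad():
    image_embeddings = model(**batch_images)
    query_embeddings = model(**batch_queries)

# Late-interaction (ColBERT-style MaxSim) scores between queries and pages.
scores = processor.score_multi_vector(query_embeddings, image_embeddings)
print(scores)
```

The multi-vector design is the point of the architecture: rather than pooling a page into one vector, each query token is matched against all page-patch embeddings, which is what lets the model index documents directly from their visual features.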
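Similarly, for the vblagoje/bert-english-uncased-finetuned-pos card above, a minimal usage sketch with the standard transformers token-classification pipeline; the example sentence and the aggregation choice are illustrative assumptions:

```python
from transformers import pipeline

# Word pieces are merged back into whole words by the aggregation strategy,
# so each result carries one tag from the PoS table above (NOUN, VERB, ...).
pos_tagger = pipeline(
    "token-classification",
    model="vblagoje/bert-english-uncased-finetuned-pos",
    aggregation_strategy="simple",
)

for token in pos_tagger("Neil Armstrong quickly walked on the Moon in 1969."):
    print(f"{token['word']:>10} -> {token['entity_group']}")  # e.g. quickly -> ADV
```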
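Finally, for the Llama Guard recommendation in the Llama 3 card earlier in this section, a hedged sketch of system-level prompt filtering; it follows the published Llama Guard 2 chat-template usage, with the model ID (`meta-llama/Meta-Llama-Guard-2-8B`, a gated checkpoint) and the example conversation as assumptions, not the Responsible Use Guide's reference implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-Guard-2-8B"  # assumption: access already granted
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map=device
)

def moderate(chat):
    # The chat template renders the conversation into Llama Guard's
    # safety-classification prompt; the model answers "safe" or "unsafe"
    # followed by the violated category codes.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(device)
    output = model.generate(input_ids=input_ids, max_new_tokens=100, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

# Screen a user prompt before it ever reaches the main model.
print(moderate([{"role": "user", "content": "How do I tie a bowline knot?"}]))
```

Running the same check over the assistant's draft reply (by appending it to the chat) gives the output-filtering half of the system-level safety layer described above.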